Managing Worker Node Scaling with Karpenter

Objective

In this Lab, we will first explain the concepts of Karpenter, then work through hands-on exercises covering the following:

  • When Pod replicas are increased, Karpenter provisions the worker nodes needed to schedule them; when replicas are decreased, it deprovisions the nodes that are no longer needed
  • Worker node costs are optimized through Consolidation, one of Karpenter's key features

Prerequisites

Initial Setup

Navigate to the root directory of the python-fastapi-demo-docker project where your environment variables are sourced:

cd ~/environment/python-fastapi-demo-docker

1. Understanding the Concepts

Karpenter is a flexible, high-performance open-source Kubernetes cluster autoscaler built by AWS. While the traditional Kubernetes Cluster Autoscaler (CAS) adjusts the number of worker nodes by resizing EC2 Auto Scaling groups, Karpenter directly launches appropriately sized EC2 instances as needed, which makes it faster and more flexible. Note that EKS Auto Mode includes a fully managed Karpenter as a built-in capability, so no installation is required.

Karpenter uses the following Custom Resource Definitions (CRDs) to configure its behavior:

  • NodePool
    This CRD allows you to configure what kind of nodes you want to launch.
    • Node requirements (architecture, OS, instance types, etc.)
    • Resource limits
    • Scaling behavior
    • Node lifecycle management
  • EC2NodeClass/NodeClass
    This CRD defines the node configuration itself and AWS-specific information. For example, it configures kubelet settings, roles, subnets, etc. In EKS Auto Mode, NodeClass is used instead of EC2NodeClass.
  • NodeClaim
    This CRD is created by Karpenter and is not created manually. It represents the request for nodes that will actually be launched. EC2 instances are launched based on this.

To learn more, see Concepts | Karpenter in Karpenter documentation.
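
To make the NodePool/NodeClass relationship concrete, here is a rough sketch of what a minimal NodePool for EKS Auto Mode might look like. This is a hypothetical illustration, not the manifest used in this Lab; the name, requirements, and limits are placeholder values:

# Hypothetical example; not the eks/karpenter/nodepool.yaml used in this Lab.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: example-pool
spec:
  template:
    spec:
      nodeClassRef:
        group: eks.amazonaws.com   # EKS Auto Mode uses NodeClass instead of EC2NodeClass
        kind: NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
  limits:
    cpu: "16"                      # cap total CPU this pool may provision
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 0s

The requirements constrain which instances Karpenter may launch, while limits cap the total resources the pool can provision.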

2. Installing Karpenter

EKS Auto Mode does not require Karpenter installation. Additionally, since NodePools are pre-configured in EKS Auto Mode, you can use them without creating additional NodePools.

kubectl get NodePool

NAME              NODECLASS   NODES   READY   AGE
general-purpose   default     1       True    21h
system            default     1       True    21h

However, in this Lab, we will create and use a custom NodePool to understand the functionality more clearly. The custom NodePool manifest is eks/karpenter/nodepool.yaml.

Create with the following command:

kubectl apply -f eks/karpenter/nodepool.yaml

The expected output should look like this:

nodepool.karpenter.sh/developers-workshop-pool created

Additionally, in this Lab, we will install and use eks-node-viewer to better visualize worker nodes. This tool makes it easy to monitor the status and utilization of worker nodes.

Install the eks-node-viewer:

brew tap aws/tap
brew install eks-node-viewer

The expected output should look like this:

==> Fetching downloads for: eks-node-viewer
==> Fetching aws/tap/eks-node-viewer
==> Downloading https://github.com/awslabs/eks-node-viewer/releases/download/v0.7.4/eks-node-viewer_Darwin_all
==> Downloading from https://release-assets.githubusercontent.com/github-production-release-asset/575555632/3c568f24-b0e6-42f6-9c73-f208a106f94f?sp=r&sv=2018-11-09&sr=b&spr=https&se=2025-08-12T22%3A13%3A41Z&rscd=attachment%3B+filename%3Deks-node-viewer_Darwin_all
######################################################################################################################################################################################################################################################### 100.0%
==> Installing eks-node-viewer from aws/tap
🍺 /opt/homebrew/Cellar/eks-node-viewer/0.7.4: 4 files, 136.0MB, built in 4 seconds
==> Running `brew cleanup eks-node-viewer`...
Disable this behaviour by setting `HOMEBREW_NO_INSTALL_CLEANUP=1`.
Hide these hints with `HOMEBREW_NO_ENV_HINTS=1` (see `man brew`).
==> No outdated dependents to upgrade!

If any issues arise, see the README in the awslabs/eks-node-viewer GitHub repository for troubleshooting.

3. Trying Scaling Up/Down

First, let's deploy an application for testing. We'll start by deploying the DB application Pod:

kubectl apply -f eks/deploy-db-python.yaml

The expected output should look like this:

service/db created
statefulset.apps/fastapi-postgres created

Next, let's deploy the Web application Pod:

kubectl apply -f eks/karpenter/deploy-app-python.yaml

The expected output should look like this:

deployment.apps/fastapi-deployment created

After deployment, verify that the Pods have started normally. Run the following command and confirm that each Pod's READY column shows 1/1:

kubectl get po -n my-cool-app -o wide

The expected output should look like this:

NAME                                  READY   STATUS    RESTARTS   AGE     IP               NODE                  NOMINATED NODE   READINESS GATES
fastapi-deployment-6f69d7cf44-9wxpr   1/1     Running   0          118s    192.168.65.192   i-0123456789abcdef0   <none>           <none>
fastapi-postgres-0                    1/1     Running   0          2m38s   192.168.115.80   i-0123456789abcdef1   <none>           <none>
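
If the Pods are still starting, you can block until they report Ready instead of polling manually. This uses the standard kubectl wait subcommand; the timeout value here is arbitrary:

kubectl wait --for=condition=Ready pod --all -n my-cool-app --timeout=120s

The command exits successfully once every Pod in the namespace is Ready, or fails after the timeout.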

Next, use another terminal to launch eks-node-viewer:

eks-node-viewer --node-selector workload-type=developers-workshop

The expected output should look like this:

1 nodes (      200m/1780m) 11.2% cpu ████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ $0.042/hour | $30.740/month
4 pods (0 pending 4 running 4 bound)

i-0123456789abcdef0 cpu ████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 11% (1 pods) t3a.medium/$0.0421 On-Demand/Auto - Ready -

You can confirm that one worker node is running with one Pod running on it.

Now, let's scale up the ReplicaSet of Deployment fastapi-deployment:

kubectl scale deploy -n my-cool-app fastapi-deployment --replicas=9

The expected output should look like this:

deployment.apps/fastapi-deployment scaled

After waiting a moment, check eks-node-viewer. As shown below, because one worker node ran out of capacity to start Pods, Karpenter provisions a new worker node. You can confirm that the Pods that couldn't be scheduled on the existing worker node are now running on the new worker node.

note

Please note that the actual placement of Pods on worker nodes may differ from the results shown below. Additionally, the following results are for EKS Auto Mode. In the case of Managed Node Groups, be aware that Pods from DaemonSets such as aws-node, ebs-csi-node, and kube-proxy will be present on the worker nodes.

2 nodes (     1800m/3560m) 50.6% cpu ████████████████████░░░░░░░░░░░░░░░░░░░░ $0.084/hour | $61.481/month
12 pods (0 pending 12 running 12 bound)

i-0123456789abcdef0 cpu ███████████████████████████████░░░░ 90% (8 pods) t3a.medium/$0.0421 On-Demand/Auto - Ready -
i-0123456789abcdef2 cpu ████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 11% (1 pods) t3a.medium/$0.0421 On-Demand/Auto - Ready -
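
You can also observe this from the Karpenter side: each provisioned node is backed by a NodeClaim resource. Listing them should show one NodeClaim per Karpenter-launched node (exact names and values will differ in your cluster):

kubectl get nodeclaims

This is a useful cross-check that the new node was created by Karpenter rather than by some other mechanism.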

Conversely, let's scale down the Deployment's ReplicaSet with the following command:

kubectl scale deploy -n my-cool-app fastapi-deployment --replicas=1

The expected output should look like this:

deployment.apps/fastapi-deployment scaled

In eks-node-viewer, you can confirm that the Pods that were running on the old worker node have been removed. (The Deployment controller terminates the Pods; Karpenter then handles the now-empty node.)

2 nodes (      200m/3560m) 5.6% cpu ██░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ $0.084/hour | $61.481/month
4 pods (0 pending 4 running 4 bound)

i-0123456789abcdef0 cpu ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% (0 pods) t3a.medium/$0.0421 On-Demand/Auto - Ready -
i-0123456789abcdef2 cpu ████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 11% (1 pods) t3a.medium/$0.0421 On-Demand/Auto - Ready -

After a while, you can confirm that Karpenter removes the empty worker node. This is because the empty worker node was disrupted by Karpenter's Consolidation feature.

1 nodes (      200m/1780m) 11.2% cpu ████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ $0.042/hour | $30.740/month
4 pods (0 pending 4 running 4 bound)

i-0123456789abcdef2 cpu ████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 11% (1 pods) t3a.medium/$0.0421 On-Demand/Auto - Ready -
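
If you prefer to follow the node lifecycle from the command line instead of eks-node-viewer, watching the labeled nodes works too (this assumes the nodes carry the workload-type=developers-workshop label used by the node-selector above; press Ctrl+C to stop):

kubectl get nodes --selector workload-type=developers-workshop -w

You should see the empty node disappear from the list once Consolidation removes it.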

4. Trying Consolidation Feature

Consolidation is one of Karpenter's most useful features. This Lab uses Karpenter's default settings, under which Karpenter automatically deletes or replaces worker nodes when they are empty or underutilized. In this section, we'll verify the behavior when utilization is low.

disruption:
  consolidationPolicy: WhenEmptyOrUnderutilized
  consolidateAfter: 0s

Karpenter will delete a worker node if all Pods can run using the available capacity of other worker nodes. Additionally, it will replace nodes if all Pods can run using a combination of available capacity on other worker nodes and one lower-cost alternative worker node.
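
In your own clusters you may want Consolidation to act less aggressively than these defaults. Karpenter's v1 NodePool API supports a delay and disruption budgets for this; the values below are illustrative, not part of this Lab's manifest:

disruption:
  consolidationPolicy: WhenEmptyOrUnderutilized
  consolidateAfter: 30s      # wait before consolidating an eligible node
  budgets:
    - nodes: "10%"           # disrupt at most 10% of this pool's nodes at a time

A nonzero consolidateAfter gives short-lived load spikes time to return, while budgets limit how many nodes can be disrupted concurrently.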

First, let's scale up:

kubectl scale deploy -n my-cool-app fastapi-deployment --replicas=9

The expected output should look like this:

deployment.apps/fastapi-deployment scaled

In eks-node-viewer, you can confirm that Karpenter has provisioned new worker nodes as shown below.

2 nodes (     1800m/3560m) 50.6% cpu ████████████████████░░░░░░░░░░░░░░░░░░░░ $0.084/hour | $61.481/month
12 pods (0 pending 12 running 12 bound)

i-0123456789abcdef2 cpu ███████████████████████████████░░░░ 90% (8 pods) t3a.medium/$0.0421 On-Demand/Auto - Ready -
i-0123456789abcdef3 cpu ████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 11% (1 pods) t3a.medium/$0.0421 On-Demand/Auto - Ready -

Next, let's decrease the number of replicas by one:

kubectl scale deploy -n my-cool-app fastapi-deployment --replicas=8

The expected output should look like this:

deployment.apps/fastapi-deployment scaled

As a result, one of the worker nodes now has enough available capacity to run all 8 Pods:

2 nodes (     1600m/3560m) 44.9% cpu ██████████████████░░░░░░░░░░░░░░░░░░░░░░ $0.084/hour | $61.481/month
11 pods (0 pending 11 running 11 bound)

i-0123456789abcdef2 cpu ████████████████████████████░░░░░░░ 79% (7 pods) t3a.medium/$0.0421 On-Demand/Auto - Ready -
i-0123456789abcdef3 cpu ████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 11% (1 pods) t3a.medium/$0.0421 On-Demand/Auto - Ready -

Consequently, Karpenter determines that one worker node is unnecessary and disrupts it. First, the remaining Pods are moved and the node enters the Deleting state:

2 nodes (     1600m/3560m) 44.9% cpu ██████████████████░░░░░░░░░░░░░░░░░░░░░░ $0.084/hour | $61.481/month
11 pods (0 pending 11 running 11 bound)

i-0123456789abcdef2 cpu ███████████████████████████████░░░░ 90% (8 pods) t3a.medium/$0.0421 On-Demand/Auto - Ready -
i-0123456789abcdef3 cpu ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░ 0% (0 pods) t3a.medium/$0.0421 On-Demand/Auto Deleting Ready -

Once the node is removed, all Pods run on the single remaining worker node:

1 nodes (     1600m/1780m) 89.9% cpu ████████████████████████████████████░░░░ $0.042/hour | $30.740/month
11 pods (0 pending 11 running 11 bound)

i-0123456789abcdef2 cpu ███████████████████████████████░░░░ 90% (8 pods) t3a.medium/$0.0421 On-Demand/Auto - Ready -

To learn more, see Disruption | Karpenter in Karpenter documentation.

5. Clean Up Resources

To clean up all resources created in this lab exercise and the workshop up to this point, run the following commands.

cd /home/ec2-user/environment/python-fastapi-demo-docker
kubectl delete -f eks/deploy-db-python.yaml
kubectl delete -f eks/karpenter/deploy-app-python.yaml
kubectl delete -f eks/karpenter/nodepool.yaml

Conclusion

In this lab, we first covered the concepts of Karpenter, then tested scaling up and down in practice. We examined the scale-down behavior in more detail through Consolidation, one of Karpenter's key features. Note that Karpenter handles node-level autoscaling, which complements Pod-level autoscaling with the Horizontal Pod Autoscaler.