13 changes: 6 additions & 7 deletions latest/bpg/cost/cost_opt_compute.adoc
@@ -123,7 +123,8 @@ spec:

For workloads that might not be interruptible, e.g. long-running batch jobs without checkpointing, consider annotating pods with the `do-not-evict` annotation. By opting pods out of eviction, you are telling Karpenter that it should not voluntarily remove nodes containing this pod. However, if a `do-not-evict` pod is added to a node while the node is draining, the remaining pods will still be evicted, but that pod will block termination until it is removed. In either case, the node will be cordoned to prevent additional work from being scheduled on it. Below is an example showing how to set the annotation:

-```yaml hl_lines="8"
+[,yaml]
+----
apiVersion: v1
kind: Pod
metadata:
@@ -139,15 +140,16 @@ spec:
    image: nginx
    ports:
    - containerPort: 80
-```
+----
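
For reference, a complete manifest with the annotation in place might look like the following sketch. The pod name and container details are illustrative; `karpenter.sh/do-not-evict: "true"` is the Karpenter annotation key the paragraph above refers to:

[,yaml]
----
apiVersion: v1
kind: Pod
metadata:
  name: batch-worker  # illustrative name
  annotations:
    karpenter.sh/do-not-evict: "true"  # opts this pod out of voluntary eviction by Karpenter
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
----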

=== Remove under-utilized nodes by adjusting Cluster Autoscaler parameters

Node utilization is defined as the sum of requested resources divided by node capacity. By default, `scale-down-utilization-threshold` is set to 50%. This parameter can be used along with `scale-down-unneeded-time`, which determines how long a node should be unneeded before it is eligible for scale down; the default is 10 minutes. Pods still running on a node that is scaled down are rescheduled onto other nodes by kube-scheduler. Adjusting these settings can help remove underutilized nodes, but it's important to test these values first so you don't force the cluster to scale down prematurely.
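
As a sketch of where these parameters live, both are command-line flags on the Cluster Autoscaler container. The deployment excerpt below is illustrative (the image tag and the other flags are assumptions, not part of this guide); the two highlighted flags show the defaults described above:

[,yaml]
----
# Excerpt from a cluster-autoscaler Deployment spec (values are illustrative)
spec:
  containers:
  - name: cluster-autoscaler
    image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.28.2
    command:
    - ./cluster-autoscaler
    - --cloud-provider=aws
    - --scale-down-utilization-threshold=0.5  # default: nodes below 50% utilization are scale-down candidates
    - --scale-down-unneeded-time=10m          # default: node must be unneeded this long before scale down
----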

You can prevent scale down by ensuring that pods that are expensive to evict are protected by an annotation recognized by the Cluster Autoscaler. To do this, give such pods the annotation `cluster-autoscaler.kubernetes.io/safe-to-evict=false`. Below is an example YAML manifest that sets the annotation:

-```yaml hl_lines="8"
+[,yaml]
+----
apiVersion: v1
kind: Pod
metadata:
@@ -163,7 +165,7 @@ spec:
    image: nginx
    ports:
    - containerPort: 80
-```
+----
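
For reference, a complete sketch of a pod manifest carrying this annotation might look like the following (the pod name and container details are illustrative):

[,yaml]
----
apiVersion: v1
kind: Pod
metadata:
  name: expensive-to-evict-pod  # illustrative name
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"  # Cluster Autoscaler will not scale down this pod's node
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
----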

=== Tag nodes with Cluster Autoscaler and Karpenter

@@ -310,6 +312,3 @@ Some GPU hardware can be shared across multiple workloads so a single GPU can be

* https://aws.amazon.com/blogs/containers/gpu-sharing-on-amazon-eks-with-nvidia-time-slicing-and-accelerated-ec2-instances/[GPU sharing on Amazon EKS with NVIDIA time-slicing and accelerated EC2 instances]
* https://aws.amazon.com/blogs/containers/maximizing-gpu-utilization-with-nvidias-multi-instance-gpu-mig-on-amazon-eks-running-more-pods-per-gpu-for-enhanced-performance/[Maximizing GPU utilization with NVIDIA's Multi-Instance GPU (MIG) on Amazon EKS: Running more pods per GPU for enhanced performance]