
Feat: Use built-in priority classes for critical EKS-A components and limit the consumption of these classes for normal workload by default #8701

Cajga opened this issue Sep 3, 2024 · 0 comments

Cajga commented Sep 3, 2024

What would you like to be added:
By default, Kubernetes ships with two priority classes, and the same is true for EKS-A:

$ kubectl get priorityclasses.scheduling.k8s.io -A
NAME                      VALUE        GLOBAL-DEFAULT   AGE
system-cluster-critical   2000000000   false            29d
system-node-critical      2000001000   false            29d

These should be used for critical components/add-ons of the cluster to make sure that those components are not the first to be evicted in case of resource pressure.

Currently, EKS-A does not use these priority classes for all of its infrastructure components:

$ kubectl get pods -n kube-system kube-vip-kls107 -o yaml|grep priority
  priority: 0
$ kubectl get pods -n eksa-system eksa-controller-manager-6bb5cb4fb4-lx8q5 -o yaml|grep -i priori
  preemptionPolicy: PreemptLowerPriority
  priority: 0
$ kubectl get pods -n eksa-system tink-controller-5d876c7b44-z55h8 -o yaml|grep -i priori
  preemptionPolicy: PreemptLowerPriority
  priority: 0
$ kubectl get pods -n eksa-system kube-vip-pm5cx -o yaml|grep -i priori
  preemptionPolicy: PreemptLowerPriority
  priority: 0

IMO, it would be a stability improvement to ensure that all critical EKS-A components use one of these priority classes, as sketched below.
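As an illustration, a minimal sketch of what this could look like for one of the components listed above (which manifests EKS-A would need to patch is an assumption; the field itself is the standard Kubernetes priorityClassName):

# Hypothetical excerpt of the pod template of an EKS-A component such as
# eksa-controller-manager; the exact manifest to patch is an assumption.
spec:
  template:
    spec:
      # Marks the pod as cluster-critical so it is scheduled ahead of, and
      # evicted after, workloads running at the default priority 0.
      priorityClassName: system-cluster-critical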

On top of this, with the current configuration, any workload on EKS-A could start using these built-in priority classes, which could result in a situation where EKS-A components are no longer able to run, as the Kubernetes scheduler would favor those workloads in case of resource pressure. Kubernetes provides a way to prevent this through an AdmissionConfiguration that limits priority class consumption by default. It would be advisable to apply this limitation to normal workloads for the built-in classes.
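A rough sketch of such a setup, following the upstream "Limit Priority Class consumption by default" approach (the file path and the set of exempted namespaces are assumptions):

# AdmissionConfiguration file passed to kube-apiserver via
# --admission-control-config-file (the file path is deployment-specific).
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: "ResourceQuota"
  configuration:
    apiVersion: apiserver.config.k8s.io/v1
    kind: ResourceQuotaConfiguration
    limitedResources:
    - resource: pods
      matchScopes:
      # Pods requesting these classes are only admitted in namespaces that
      # have a ResourceQuota covering the same scopes.
      - scopeName: PriorityClass
        operator: In
        values: ["system-node-critical", "system-cluster-critical"]

Namespaces that must keep running critical pods (e.g. kube-system and eksa-system) would then need a matching ResourceQuota, applied as a separate manifest, for example:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: critical-priority-pods
  namespace: eksa-system
spec:
  # The pod count is an arbitrary example value.
  hard:
    pods: "100"
  scopeSelector:
    matchExpressions:
    - operator: In
      scopeName: PriorityClass
      values: ["system-node-critical", "system-cluster-critical"]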

Why is this needed:
To improve stability of the cluster in case of resource pressure.
