add options to tune concurrency, qps and burst #287

Open
ajaysundark wants to merge 2 commits into kubernetes-sigs:main from ajaysundark:runtime-tuning

@ajaysundark

Contributor

Description

Add flags to the controller to tune controller-runtime knobs for scalability

Related Issue

None

Type of Change

/kind feature

Testing

Manual test runs

Checklist

  • make test passes
  • make lint passes

Does this PR introduce a user-facing change?

Yes, it adds command-line flags to tune concurrency, QPS, and burst. See https://cluster-api.sigs.k8s.io/developer/core/tuning#runtime-tuning-options

Relevant for our ongoing scalability work.

Add new command-line flags for tuning concurrency, QPS, and burst limits.

@kubernetes-prow kubernetes-prow Bot added the kind/feature label (Categorizes issue or PR as related to a new feature.) Jul 3, 2026
@netlify

netlify Bot commented Jul 3, 2026

Deploy Preview for node-readiness-controller canceled.

Name Link
🔨 Latest commit ba22b7f
🔍 Latest deploy log https://app.netlify.com/projects/node-readiness-controller/deploys/6a476f597125d400070f2a25

@kubernetes-prow

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ajaysundark

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kubernetes-prow kubernetes-prow Bot added the approved (Indicates a PR has been approved by an approver from all required OWNERS files.) and cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.) labels Jul 3, 2026
@kubernetes-prow kubernetes-prow Bot added the size/M label (Denotes a PR that changes 30-99 lines, ignoring generated files.) Jul 3, 2026
Comment thread cmd/main.go
var metricsCertDir string
var leaderElectionNamespace string
var enableNodeStateMetrics bool
var kubeAPIQPS float64

Contributor

Given that the number of these vars keeps growing, could we move them into a var ( ... ) block outside main for better code organization?

Comment thread cmd/main.go
"Maximum queries per second to the API server from this client. "+
"Raise together with --kube-api-burst on large clusters.")
flag.IntVar(&kubeAPIBurst, "kube-api-burst", 30,
"Maximum burst for throttle between requests to the API server.")

Contributor

Wouldn't this wording be simpler: "Maximum number of queries that should be allowed in one burst"?
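
To make the QPS versus burst distinction in that help text concrete, here is a minimal token-bucket sketch in Go. It is an illustration only: client-go's client-side limiter is also a token bucket, but it waits for a token rather than rejecting a request, and the `bucket` type, `newBucket`, and `countAllowed` below are hypothetical names that belong to no library. Time is passed in explicitly so the behavior is deterministic.

```go
package main

import "fmt"

// bucket is a hypothetical, minimal token bucket driven by an explicit clock.
type bucket struct {
	qps    float64 // tokens added per second (the sustained rate)
	burst  float64 // bucket capacity (the largest burst allowed)
	tokens float64 // tokens currently available
	last   float64 // time of the last refill, in seconds
}

// newBucket returns a bucket that starts full.
func newBucket(qps float64, burst int) *bucket {
	return &bucket{qps: qps, burst: float64(burst), tokens: float64(burst)}
}

// allow reports whether a request arriving at time now (seconds) may proceed.
func (b *bucket) allow(now float64) bool {
	// Refill for the elapsed time, capped at the burst size.
	b.tokens += (now - b.last) * b.qps
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
	b.last = now
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// countAllowed fires n requests at the same instant and counts how many pass.
func countAllowed(b *bucket, now float64, n int) int {
	allowed := 0
	for i := 0; i < n; i++ {
		if b.allow(now) {
			allowed++
		}
	}
	return allowed
}

func main() {
	b := newBucket(20, 30) // the 20 QPS / 30 burst values discussed in this thread
	fmt.Println(countAllowed(b, 0, 100)) // prints 30: the full burst is spent at once
	fmt.Println(countAllowed(b, 1, 100)) // prints 20: one second of refill at 20 QPS
}
```

In this model, burst bounds how many requests can go out back to back, while QPS bounds the sustained rate once the bucket is drained.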

Contributor

Is there any specific reason for keeping 20 and 30, or is it just to get started?

@ajaysundark ajaysundark Jul 3, 2026

Contributor Author

These are the current defaults set in controller-runtime. I think we could adjust them after our experiments. Maybe I could add a comment here to explain these seemingly random numbers.

Contributor

Yeah, that would help. I looked at the defaults and saw they are 5 and 10: https://github.com/kubernetes/client-go/blob/f16383b964b3519812bac4daf8f48fc5a529ae0f/rest/config.go#L118-L127. Am I looking in the wrong place?

Contributor Author

Hmm, I was referring to the kube-controller-manager (kcm) default config here: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager.

It looks like controller-runtime had these hardcoded as defaults until recently: kubernetes-sigs/controller-runtime@ab40409
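
The two sets of numbers in this thread come from zero-value defaulting at different layers: client-go documents 5 and 10 as the defaults used when QPS and Burst are left at zero, while 20 and 30 are the values cited above for kube-controller-manager and for older controller-runtime. The sketch below shows only that fallback pattern. `clientLimits` and `withFallback` are hypothetical names, and whether a given controller-runtime release still applies 20 and 30 should be checked against that release.

```go
package main

import "fmt"

// clientLimits is a hypothetical stand-in for the QPS and Burst fields
// of client-go's rest.Config (float32 and int respectively).
type clientLimits struct {
	QPS   float32
	Burst int
}

// withFallback returns c with zero-valued fields replaced by the given
// fallbacks; explicitly set fields are left untouched.
func withFallback(c clientLimits, qps float32, burst int) clientLimits {
	if c.QPS == 0 {
		c.QPS = qps
	}
	if c.Burst == 0 {
		c.Burst = burst
	}
	return c
}

func main() {
	// Nothing set: the fallback layer decides both values.
	fmt.Printf("%+v\n", withFallback(clientLimits{}, 20, 30))
	// QPS set explicitly (for example from a flag): only Burst falls back.
	fmt.Printf("%+v\n", withFallback(clientLimits{QPS: 50}, 20, 30))
}
```

Setting both fields explicitly from flags, as this PR does, makes the effective values independent of whichever layer would otherwise fill them in.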

Contributor

Regarding the magic numbers, could we have them as constants?
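
One way the constants suggestion could look, combined with the earlier var-block suggestion, is sketched below. The variable names, flag names, help strings, and the values 20 and 30 come from the diff and this thread. The constant names `defaultKubeAPIQPS` and `defaultKubeAPIBurst` and the helper `parseTuningFlags` are hypothetical, and a dedicated `flag.FlagSet` is used only to keep the sketch self-contained, whereas the PR registers flags on the global flag set.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// Hypothetical named defaults replacing the literals 20 and 30.
const (
	defaultKubeAPIQPS   = 20.0
	defaultKubeAPIBurst = 30
)

// Flag-backed settings grouped in one package-level var block.
var (
	kubeAPIQPS   float64
	kubeAPIBurst int
)

// parseTuningFlags registers the tuning flags on a fresh FlagSet and
// parses args into the package-level variables above.
func parseTuningFlags(args []string) error {
	fs := flag.NewFlagSet("manager", flag.ContinueOnError)
	fs.Float64Var(&kubeAPIQPS, "kube-api-qps", defaultKubeAPIQPS,
		"Maximum queries per second to the API server from this client. "+
			"Raise together with --kube-api-burst on large clusters.")
	fs.IntVar(&kubeAPIBurst, "kube-api-burst", defaultKubeAPIBurst,
		"Maximum burst for throttle between requests to the API server.")
	return fs.Parse(args)
}

func main() {
	if err := parseTuningFlags(os.Args[1:]); err != nil {
		os.Exit(2)
	}
	fmt.Printf("qps=%v burst=%d\n", kubeAPIQPS, kubeAPIBurst)
}
```

With named constants, a comment on the const block is a natural place to record where 20 and 30 come from.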

Comment thread cmd/main.go

-	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
+	restConfig := ctrl.GetConfigOrDie()
+	restConfig.QPS = float32(kubeAPIQPS)

Contributor

why can't we declare it directly as float32?
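
One likely reason for the float64 variable: Go's standard flag package offers Float64Var but no float32 counterpart, while client-go's rest.Config declares QPS as float32, so a conversion has to happen somewhere unless a custom flag.Value is written. A minimal sketch of the convert-at-assignment approach follows, in which `clientConfig` and `applyTuning` are hypothetical stand-ins rather than code from this PR:

```go
package main

import (
	"flag"
	"fmt"
)

// clientConfig is a hypothetical stand-in for client-go's rest.Config,
// mirroring only the two fields relevant here.
type clientConfig struct {
	QPS   float32
	Burst int
}

// applyTuning parses the tuning flags from args and copies them into cfg,
// narrowing the float64 flag value to float32 at the point of assignment.
func applyTuning(cfg *clientConfig, args []string) error {
	fs := flag.NewFlagSet("manager", flag.ContinueOnError)
	qps := fs.Float64("kube-api-qps", 20, "maximum queries per second to the API server")
	burst := fs.Int("kube-api-burst", 30, "maximum burst of queries to the API server")
	if err := fs.Parse(args); err != nil {
		return err
	}
	cfg.QPS = float32(*qps) // the flag package has no float32 variant
	cfg.Burst = *burst
	return nil
}

func main() {
	cfg := &clientConfig{}
	if err := applyTuning(cfg, []string{"--kube-api-qps=50", "--kube-api-burst=1000"}); err != nil {
		fmt.Println("flag error:", err)
		return
	}
	fmt.Printf("%+v\n", *cfg)
}
```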

@vitorfloriano

vitorfloriano commented Jul 4, 2026

Contributor

I took this branch for a spin and verified that the tuning works.

I ran a manual test with a 1000-node kwok cluster without any parameters and NRC added all the taints in about 2m55s:

image

The same manual test with --node-concurrent-reconciles 20 --kube-api-burst 1000 --kube-api-qps 50** completed in 1m21s (about half the time it took without any tuning):

image

I'm still learning how to query Prometheus, but I think I got this right.

** I just picked these numbers randomly.
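
For reference, the ratio between the two timings above can be checked with a few lines of Go. The helper name `speedup` is hypothetical, and the inputs are the 2m55s and 1m21s figures from this comment.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// speedup returns how many times faster the tuned run was than the baseline,
// given both durations in Go duration syntax (for example "2m55s").
func speedup(baseline, tuned string) (float64, error) {
	b, err := time.ParseDuration(baseline)
	if err != nil {
		return 0, err
	}
	t, err := time.ParseDuration(tuned)
	if err != nil {
		return 0, err
	}
	if t <= 0 {
		return 0, errors.New("tuned duration must be positive")
	}
	return b.Seconds() / t.Seconds(), nil
}

func main() {
	r, err := speedup("2m55s", "1m21s") // 175s versus 81s
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%.2fx faster\n", r) // prints "2.16x faster"
}
```

That is, the tuned run took roughly 46% of the untuned time on this single manual test.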

3 participants