Add performance benchmark for the CA RunOnce control loop #9237
Choraden wants to merge 9 commits into kubernetes:master
Conversation
Hi @Choraden. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with the appropriate command. Once the patch is verified, the new status will be reflected by the corresponding label. I understand the commands that are listed here.

Details
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/uncc aleksandra-malinowska vadasambar

Keeping as draft until #9099 is merged.
@Choraden: GitHub didn't allow me to request PR reviews from the following users: pmendelski. Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

Details
In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Sharing results:
Force-pushed bd996c2 to d79c751
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Choraden
The full list of commands accepted by this bot can be found here.

Details
Needs approval from an approver in each of these files. Approvers can indicate their approval by writing
Force-pushed d79c751 to ce198a3
}

// WithMinSize sets the minimum size of the node group.
func WithMinSize(min int) NodeGroupOption {
nit: A single WithSizeConfig(min, max int) option makes more sense to me; I think you'd always want to set both ends of the limits if you're setting them at all (unless you want to rely on the default being 0, but it seems better to force callers to be explicit).
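The suggested combined option could look roughly like this. This is an illustrative sketch only: the `NodeGroup` fields and `NodeGroupOption` shape below are simplified stand-ins for the testprovider types, not the PR's actual code.

```go
package main

import "fmt"

// Simplified stand-ins for the testprovider types discussed in the review.
type NodeGroup struct {
	minSize int
	maxSize int
}

type NodeGroupOption func(*NodeGroup)

// WithSizeConfig sets both ends of the size limits at once, forcing callers
// to be explicit about min and max rather than relying on zero-value defaults.
func WithSizeConfig(min, max int) NodeGroupOption {
	return func(ng *NodeGroup) {
		ng.minSize = min
		ng.maxSize = max
	}
}

func main() {
	ng := &NodeGroup{}
	for _, opt := range []NodeGroupOption{WithSizeConfig(0, 100)} {
		opt(ng)
	}
	fmt.Println(ng.minSize, ng.maxSize) // 0 100
}
```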
}

// WithAutoscalingKubeClients allows injecting autoscaling kube clients.
func (b *AutoscalerBuilder) WithAutoscalingKubeClients(kubeClients *cacontext.AutoscalingKubeClients) *AutoscalerBuilder {
AutoscalingKubeClients is normally built based on the KubeClient and InformerFactory, both of which are already injectable (and as far as I can see they need to be injected). Why do we need to make AutoscalingKubeClients directly injectable too? It seems like we'd always want to have the AutoscalingKubeClients field in sync with the other two. Otherwise we'd get different behavior for components that use the client/informers directly than for components that use AutoscalingKubeClients.
Ah okay, after reading some more I see why: we want to change the log recorder. Could you add a warning to the method comment that this is not needed for most use-cases, and that if you're using it, it has to be in sync with the objects provided in WithKubeClient() and WithInformerFactory()?
Added a comment to the method.
klog.SetOutput(io.Discard)
ctrl.SetLogger(klog.Background())

// Disable automatic Garbage Collection during the timed portion of the benchmark
How did the results of the benchmark differ without this part? If we have a performance regression caused by frequent and quickly discarded memory allocations, it seems like GC would be a considerable portion of the regression. It seems that disabling GC would mask such regressions.
The hard part about benchmarking GC'ed apps is that you have little control over when the GC happens, which makes the results unstable and high-variance. As the comments say, I decided to disable the GC to stabilize the benchmark. Since the benchmark also measures allocations, it should be fine. It might underestimate GC pressure sometimes, but that's the price of result stability.
I also added a -gc flag to control that behavior, e.g. when someone wants to profile the benchmark with GC enabled.
// run executes the benchmark for a given scenario. It handles environment stabilization,
// profiling, and repeated execution of the RunOnce loop.
func run(b *testing.B, s scenario) {
nit: I think run() being a method on scenario would be more idiomatic in Go.
runtime.GC()

if f != nil && i == 0 {
	if err := pprof.StartCPUProfile(f); err != nil {
Have you evaluated having the profile capture the sum of all RunOnce executions instead of just the first loop? It'd be nice to have the reasoning for only capturing one loop in a comment.
The benchmark usually runs only 1 loop due to its heavy load. I decided to keep it this way for simplicity.
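The profile-first-iteration-only pattern under discussion can be shown in a self-contained form. The temp-file handling and iteration count below are illustrative assumptions; only the `i == 0` guard mirrors the diff context above.

```go
package main

import (
	"fmt"
	"os"
	"runtime/pprof"
)

func main() {
	// Illustrative: write the CPU profile to a temp file; the real
	// benchmark takes the path from a -profile-cpu flag instead.
	f, err := os.CreateTemp("", "cpu-*.prof")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	defer f.Close()

	const iterations = 3
	for i := 0; i < iterations; i++ {
		// Capture a CPU profile only during the first iteration; in
		// practice the benchmark usually runs a single heavy loop anyway.
		if f != nil && i == 0 {
			if err := pprof.StartCPUProfile(f); err != nil {
				panic(err)
			}
		}
		// ... RunOnce-equivalent work would happen here ...
		if i == 0 {
			pprof.StopCPUProfile()
		}
	}
	fmt.Println("profiled first iteration only")
}
```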
// fastTaintingKubeClient tracks nodes that were artificially tainted in-memory to bypass
// the overhead of full API server round-trips and complex object tracking in the fake client.
type fastTaintingKubeClient struct {
What was the difference in results between this and the normal tainting logic?
I'm surprised that fake K8s client logic (and the fake CloudProvider logic for that matter) would be significant enough to impact the overall results. There's a fair bit of complexity in this benchmark to deal with that, I'm wondering if we can avoid it somehow (e.g. grab the profile from all loops to address variance; increase the parameters of the scenario so that binpacking is more expensive and dominates the whole loop more).
The fake K8s client takes a suspiciously large share of the CPU profile.
I'm not sure it correctly represents the actual update logic, which would be just an API call; instead, it tries to simulate the outer world.
Moreover, it introduces some non-deterministic behaviour in the allocations:

                     │ bench_no_fast_tainting.txt │
                     │            B/op            │
RunOnceScaleDown-24          1.239Gi ± 19%

                     │ bench_no_fast_tainting.txt │
                     │          allocs/op         │
RunOnceScaleDown-24          11.15M ± 8%
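The core idea of fastTaintingKubeClient can be reduced to a small in-memory tracker. This sketch omits the client-go reactor wiring entirely; the type name, methods, and structure below are simplified stand-ins, not the PR's implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// fastTainter records which nodes received the ToBeDeleted taint without
// persisting anything through a fake API server. Illustrative stand-in
// for the fastTaintingKubeClient discussed above.
type fastTainter struct {
	mu      sync.Mutex
	tainted map[string]bool
}

func newFastTainter() *fastTainter {
	return &fastTainter{tainted: make(map[string]bool)}
}

// TaintNode mimics intercepting a node-update call and recording the taint
// in memory, bypassing fake-client object tracking.
func (t *fastTainter) TaintNode(name string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.tainted[name] = true
}

// IsTainted lets scale-down logic check for nodes marked for deletion.
func (t *fastTainter) IsTainted(name string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.tainted[name]
}

func main() {
	ft := newFastTainter()
	ft.TaintNode("node-1")
	fmt.Println(ft.IsTainted("node-1"), ft.IsTainted("node-2")) // true false
}
```

In the real benchmark, reactors registered on the fake clientset intercept the node updates and feed a structure like this instead of the fake client's object store.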
	testprovider.WithMinSize(0),
)

if _, err := cluster.KubeClient.CoreV1().Namespaces().Create(context.Background(), &corev1.Namespace{
nit: Which part of RunOnce() depends on this? You normally get the default namespace automatically, so IMO it'd make sense to have this centralized as part of the BUT framework somehow.
It must have been my mistake. It's not needed; I removed that part completely.
cpu := int64(nodeCPU / podsPerNode)
mem := int64(nodeMem / podsPerNode)
pod := BuildTestPod(podName, cpu, mem, MarkUnschedulable())
if _, err := cluster.KubeClient.CoreV1().Pods("default").Create(context.Background(), pod, metav1.CreateOptions{}); err != nil {
I think the BUT framework gives you a more convenient way to manipulate the Pods and the Nodes at least, see https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/test/integration/synctest/template_test.go
Replaced with K8s.AddPod from the framework.
func BenchmarkRunOnceScaleUp(b *testing.B) {
	s := scenario{
		setup: setupScaleUp(200),
200 Nodes * 50 PodsPerNode is not a very complex scenario. Maybe this is why the actual bottleneck logic isn't dominating the whole result and we need to make other parts more synthetic?
Have you evaluated higher values here? How did the results differ?
I have run the benchmark with 1000 Nodes * 100 Pods (as nodes are usually limited to running ~100 pods) and other scenarios, but the results and profiles did not surface any new findings. I decided to keep the scenario minimal enough to enable a benchmark presubmit job.
// This is for benchmarks that will only evaluate target size and
// want to avoid the overhead (and CPU profile noise) of adding
// nodes to the internal state of the fake cloud provider.
func (n *NodeGroup) NoOpIncreaseSize(delta int) error {
Ugh, it's pretty ugly to have an exported method specific to just a single benchmark here in the common CloudProvider fake. I don't see a good way around it though... If we're keeping it, WDYT about changing it slightly to SetTargetSize()/SetTargetSizeOnly()? In this form it seems like it could be useful for other tests more than in the current one.
@mtrqq @GaetanoMar96 @kawych Maybe you have a better idea here?
I found the existing DecreaseTargetSize method, which only modifies the target size. It can be used as NoOpIncreaseSize by calling it with a negative delta. We should be fine with it for now, and it eliminates the need to extend the test package interface.
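The negative-delta trick described above is easy to show on a toy type. This is a deliberately minimal sketch: the struct and method body are illustrative, not the fake cloud provider's actual NodeGroup.

```go
package main

import "fmt"

// Toy NodeGroup illustrating the trick: a DecreaseTargetSize that only
// touches the recorded target size emulates a no-op increase when called
// with a negative delta. Simplified stand-in, not the real fake provider.
type NodeGroup struct {
	targetSize int
}

// DecreaseTargetSize adjusts only the target size; no fake node objects
// are created or deleted, so nothing extra shows up in the CPU profile.
func (n *NodeGroup) DecreaseTargetSize(delta int) {
	n.targetSize -= delta
}

func main() {
	ng := &NodeGroup{targetSize: 10}
	// "Increase" by 5 without touching node state: pass -5.
	ng.DecreaseTargetSize(-5)
	fmt.Println(ng.targetSize) // 15
}
```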
Updated MustCreateManager in the integration test package to accept testing.TB instead of *testing.T. This allows the helper to be used within both standard tests and performance benchmarks (which use *testing.B). This change is a prerequisite for introducing performance benchmarking for the RunOnce control loop.
Force-pushed ce198a3 to abe3907
Rebased on master.
This commit adds a new benchmarking suite in core/bench to evaluate the performance of the primary Cluster Autoscaler control loop (RunOnce). These benchmarks simulate large-scale cluster operations using a mock Kubernetes API and cloud provider, allowing for comparative analysis and detection of performance regressions.
Introduced a -profile-cpu flag to the RunOnce benchmarking suite. When specified, the benchmark will capture a CPU profile during the first execution of the RunOnce loop and write it to the provided file path.
Disable Garbage Collection during RunOnce benchmarks to ensure stable and reproducible results. This prioritizes consistency over absolute performance metrics, allowing for a generic way to calculate performance differences between patches and providing a clean CPU profile for the RunOnce loop.
Introduce a no-op event recorder in RunOnce benchmarks to prevent event dropping and potential performance side-effects. This change also extends AutoscalerBuilder to support injecting custom AutoscalingKubeClients, allowing for better control over the environment in performance-sensitive tests.
Introduce fastScaleUpCloudProvider and fastScaleUpNodeGroup in benchmarks to avoid the overhead of simulating real node creation in the fake cloud provider. This significantly reduces noise in CPU profiles when benchmarking the core autoscaling logic, as it avoids unnecessary node object management in the fake provider. Added NoOpIncreaseSize to the fake NodeGroup to support this faster scale-up simulation.
This change introduces fastTaintingKubeClient, which uses reactors to track and inject ToBeDeleted taints on nodes during the benchmark. This allows the scale-down logic to correctly identify nodes that have been marked for deletion by the autoscaler without relying on standard fake client persistence for these taints. This simply removes the fake client from the CPU profile.
Force-pushed abe3907 to 2c85ed6
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
While working on #9022, it became clear that a standardized benchmark is necessary to quantify performance gains and prevent potential regressions in the core logic.
Leveraging the ongoing refactor of the autoscaler building logic in #9099, this PR introduces an initial draft of a benchmark specifically for the RunOnce function. This provides a controlled environment to measure the impact of architectural changes on the main execution loop.
The initial benchmark version at #9199 was difficult to stabilize and reason about, so we decided to simplify it to a single RunOnce call, simulating a "cold start" of the CA.
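The "cold start" shape of the benchmark can be sketched with the standard testing package. This is an assumed structure, not the PR's code: `runOnce` is a placeholder for the real CA loop, and the per-iteration setup is only hinted at in comments.

```go
package main

import (
	"fmt"
	"testing"
	"time"
)

// runOnce stands in for the real Cluster Autoscaler RunOnce loop.
func runOnce() {
	time.Sleep(time.Millisecond) // placeholder for real work
}

func main() {
	// testing.Benchmark lets us run a benchmark outside `go test`.
	res := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			b.StopTimer()
			// Per-iteration setup would rebuild fresh cluster state here,
			// so every RunOnce sees a cold cache (the "cold start").
			b.StartTimer()
			runOnce()
		}
	})
	fmt.Println(res.N > 0)
}
```

Pausing the timer around setup keeps the measured time focused on the RunOnce call itself.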
Which issue(s) this PR fixes:
Relates to #9022
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: