
Add performance benchmark for the CA RunOnce control loop#9237

Open
Choraden wants to merge 9 commits into kubernetes:master from Choraden:run_once_bench_v2

Conversation

@Choraden
Contributor

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

While working on #9022, it became clear that a standardized benchmark is necessary to quantify performance gains and prevent potential regressions in the core logic.

Leveraging the ongoing refactor of the autoscaler building logic in #9099, this PR introduces an initial draft of a benchmark specifically for the RunOnce function. This provides a controlled environment to measure the impact of architectural changes on the main execution loop.

The initial benchmark version in #9199 was difficult to stabilize and reason about, so we decided to simplify it to a single RunOnce call, simulating a "cold start" of the CA.

Which issue(s) this PR fixes:

Relates to #9022

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-area needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Feb 16, 2026
@k8s-ci-robot
Contributor

Hi @Choraden. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added area/cluster-autoscaler size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed do-not-merge/needs-area labels Feb 16, 2026
@Choraden
Contributor Author

/uncc aleksandra-malinowska vadasambar
/cc @x13n @pmendelski
/assign @towca @mtrqq

Keeping as draft until #9099 is merged.

@k8s-ci-robot k8s-ci-robot requested review from x13n and removed request for aleksandra-malinowska and vadasambar February 16, 2026 09:48
@k8s-ci-robot
Contributor

@Choraden: GitHub didn't allow me to request PR reviews from the following users: pmendelski.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.


In response to this:

/uncc aleksandra-malinowska vadasambar
/cc @x13n @pmendelski
/assign @towca @mtrqq

Keeping as draft until #9099 is merged.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@Choraden
Contributor Author

Sharing results:

goos: linux
goarch: amd64
pkg: k8s.io/autoscaler/cluster-autoscaler/core/bench
cpu: Intel(R) Xeon(R) CPU @ 2.20GHz
BenchmarkRunOnceScaleUp
BenchmarkRunOnceScaleUp-8              1        1553364377 ns/op        762335776 B/op   8386725 allocs/op
BenchmarkRunOnceScaleUp-8              1        1585365778 ns/op        761671552 B/op   8379717 allocs/op
BenchmarkRunOnceScaleUp-8              1        1679504327 ns/op        765269968 B/op   8417926 allocs/op
BenchmarkRunOnceScaleUp-8              1        1747838191 ns/op        768367568 B/op   8450747 allocs/op
BenchmarkRunOnceScaleUp-8              1        1807256894 ns/op        762046144 B/op   8383723 allocs/op
BenchmarkRunOnceScaleUp-8              1        1894646431 ns/op        763635504 B/op   8400150 allocs/op
BenchmarkRunOnceScaleUp-8              1        2029809964 ns/op        763436992 B/op   8397823 allocs/op
BenchmarkRunOnceScaleUp-8              1        1647624633 ns/op        763700192 B/op   8401286 allocs/op
BenchmarkRunOnceScaleUp-8              1        1779511222 ns/op        763037232 B/op   8393860 allocs/op
BenchmarkRunOnceScaleUp-8              1        1600256004 ns/op        761429888 B/op   8376822 allocs/op
BenchmarkRunOnceScaleUp-8              1        1640637519 ns/op        765901664 B/op   8424453 allocs/op
BenchmarkRunOnceScaleUp-8              1        1718404806 ns/op        760736208 B/op   8369882 allocs/op
BenchmarkRunOnceScaleUp-8              1        1602827127 ns/op        761467056 B/op   8377073 allocs/op
BenchmarkRunOnceScaleUp-8              1        1621891375 ns/op        766729728 B/op   8433466 allocs/op
BenchmarkRunOnceScaleUp-8              1        1669716609 ns/op        764976512 B/op   8415161 allocs/op
BenchmarkRunOnceScaleUp-8              1        1692255431 ns/op        762403168 B/op   8387177 allocs/op
BenchmarkRunOnceScaleUp-8              1        1556904841 ns/op        764092048 B/op   8405116 allocs/op
BenchmarkRunOnceScaleUp-8              1        1754510304 ns/op        765413872 B/op   8419820 allocs/op
BenchmarkRunOnceScaleUp-8              1        1618392504 ns/op        764394560 B/op   8408599 allocs/op
BenchmarkRunOnceScaleUp-8              1        1727217439 ns/op        763309312 B/op   8397060 allocs/op
BenchmarkRunOnceScaleDown
BenchmarkRunOnceScaleDown-8            1        1929963694 ns/op        1007442848 B/op 10141579 allocs/op
BenchmarkRunOnceScaleDown-8            1        2037815272 ns/op        1007731760 B/op 10144702 allocs/op
BenchmarkRunOnceScaleDown-8            1        1955088108 ns/op        1008686160 B/op 10154822 allocs/op
BenchmarkRunOnceScaleDown-8            1        2007510803 ns/op        1007923488 B/op 10147363 allocs/op
BenchmarkRunOnceScaleDown-8            1        2071893121 ns/op        1007467424 B/op 10142303 allocs/op
BenchmarkRunOnceScaleDown-8            1        1989004058 ns/op        1007225888 B/op 10139238 allocs/op
BenchmarkRunOnceScaleDown-8            1        2158107429 ns/op        1007981264 B/op 10147202 allocs/op
BenchmarkRunOnceScaleDown-8            1        2175420048 ns/op        1007327520 B/op 10140612 allocs/op
BenchmarkRunOnceScaleDown-8            1        2256805385 ns/op        1006870784 B/op 10135439 allocs/op
BenchmarkRunOnceScaleDown-8            1        2101632209 ns/op        1008600128 B/op 10154222 allocs/op
BenchmarkRunOnceScaleDown-8            1        2121799323 ns/op        1005974256 B/op 10125445 allocs/op
BenchmarkRunOnceScaleDown-8            1        2370849015 ns/op        1006922272 B/op 10138672 allocs/op
BenchmarkRunOnceScaleDown-8            1        2243496561 ns/op        1007922880 B/op 10147037 allocs/op
BenchmarkRunOnceScaleDown-8            1        2117320600 ns/op        1007739056 B/op 10144560 allocs/op
BenchmarkRunOnceScaleDown-8            1        2239399297 ns/op        1007212624 B/op 10139170 allocs/op
BenchmarkRunOnceScaleDown-8            1        2125659572 ns/op        1007527936 B/op 10142351 allocs/op
BenchmarkRunOnceScaleDown-8            1        2091775344 ns/op        1007627408 B/op 10143524 allocs/op
BenchmarkRunOnceScaleDown-8            1        2129201106 ns/op        1007008608 B/op 10136800 allocs/op
BenchmarkRunOnceScaleDown-8            1        2048429965 ns/op        1006513392 B/op 10131373 allocs/op
BenchmarkRunOnceScaleDown-8            1        2462875021 ns/op        1008418272 B/op 10152107 allocs/op
---
                   │ master.txt │
                   │   sec/op   │
RunOnceScaleUp-8     1.675 ± 4%
RunOnceScaleDown-8   2.120 ± 3%
geomean              1.884

                   │  master.txt  │
                   │     B/op     │
RunOnceScaleUp-8     728.2Mi ± 0%
RunOnceScaleDown-8   960.8Mi ± 0%
geomean              836.4Mi

                   │ master.txt  │
                   │  allocs/op  │
RunOnceScaleUp-8     8.399M ± 0%
RunOnceScaleDown-8   10.14M ± 0%
geomean              9.230M

Scale Up: [chart screenshot omitted]

Scale Down: [chart screenshot omitted]

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 17, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Choraden
Once this PR has been reviewed and has the lgtm label, please ask for approval from towca. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 23, 2026
@Choraden Choraden marked this pull request as ready for review February 23, 2026 08:01
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 23, 2026
}

// WithMinSize sets the minimum size of the node group.
func WithMinSize(min int) NodeGroupOption {
Collaborator

nit: A single WithSizeConfig(min, max int) option makes more sense to me, I think you'd always want to set both ends of the limits if you're setting them at all (unless you want to rely on the default being 0, but it seems better to force callers to be explicit).

Contributor Author

Done.
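The functional-options change discussed above can be sketched as follows; `nodeGroup`, `newNodeGroup`, and `WithSizeConfig` here are illustrative names standing in for the test package's actual API, which this PR touches but is not reproduced here.

```go
package main

import "fmt"

// nodeGroup is a toy stand-in for the fake provider's node group config.
type nodeGroup struct {
	minSize, maxSize int
}

// NodeGroupOption mutates a nodeGroup during construction.
type NodeGroupOption func(*nodeGroup)

// WithSizeConfig sets both ends of the size limits at once, forcing callers
// to be explicit about min and max, as suggested in the review.
func WithSizeConfig(min, max int) NodeGroupOption {
	return func(ng *nodeGroup) {
		ng.minSize = min
		ng.maxSize = max
	}
}

func newNodeGroup(opts ...NodeGroupOption) *nodeGroup {
	ng := &nodeGroup{}
	for _, opt := range opts {
		opt(ng)
	}
	return ng
}

func main() {
	ng := newNodeGroup(WithSizeConfig(0, 100))
	fmt.Println(ng.minSize, ng.maxSize)
}
```

Bundling min and max into one option avoids node groups that silently fall back to a zero minimum when only the maximum is set.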

}

// WithAutoscalingKubeClients allows injecting autoscaling kube clients.
func (b *AutoscalerBuilder) WithAutoscalingKubeClients(kubeClients *cacontext.AutoscalingKubeClients) *AutoscalerBuilder {
Collaborator

AutoscalingKubeClients is normally built from the KubeClient and InformerFactory, both of which are already injectable (and as far as I can see they need to be injected). Why do we need to make AutoscalingKubeClients directly injectable too? It seems like we'd always want the AutoscalingKubeClients field to stay in sync with the other two; otherwise components that use the client/informers directly would behave differently from components that use AutoscalingKubeClients.

Collaborator

Ah okay, after reading some more I see why - we want to change the log recorder. Could you add a warning to the method comment that this is not needed for most use-cases, and if you're using it it has to be in sync with the objects provided in WithKubeClient() and WithInformerFactory()?

Contributor Author

Added a comment to the method.

klog.SetOutput(io.Discard)
ctrl.SetLogger(klog.Background())

// Disable automatic Garbage Collection during the timed portion of the benchmark
Collaborator

How did the results of the benchmark differ without this part? If we have a performance regression caused by frequent and quickly discarded memory allocations, it seems like GC would be a considerable portion of the regression. It seems that disabling GC would mask such regressions.

Contributor Author

The hard part about benchmarking GC'ed apps is that you have little control over when the GC happens, which makes the results unstable and highly variable. As the comments say, I decided to disable the GC to stabilize the benchmark. Since the benchmark also measures allocations, this should be fine. It may underestimate GC pressure at times, but that's the price of stable results.

I also added a -gc flag to control this behavior, e.g. when someone wants to profile the benchmark with the GC enabled.


// run executes the benchmark for a given scenario. It handles environment stabilization,
// profiling, and repeated execution of the RunOnce loop.
func run(b *testing.B, s scenario) {
Collaborator

nit: I think run() being a method on scenario would be more idiomatic in Go.

Contributor Author

Done

runtime.GC()

if f != nil && i == 0 {
if err := pprof.StartCPUProfile(f); err != nil {
Collaborator

Have you evaluated having the profile capture the sum of all RunOnce executions instead of just the first loop? It'd be nice to have the reasoning for only capturing one loop in a comment.

Contributor Author

The benchmark usually runs only one loop due to its heavy workload, so I decided to keep it this way for simplicity.
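The first-iteration-only profiling shown in the diff can be sketched as a small helper; the function name and signature here are illustrative, since the real benchmark inlines this logic in its loop.

```go
package main

import (
	"fmt"
	"os"
	"runtime/pprof"
)

// profileFirstIteration captures a CPU profile only around iteration 0,
// leaving the remaining iterations unprofiled. Passing a nil file disables
// profiling entirely.
func profileFirstIteration(f *os.File, i int, work func()) error {
	if f != nil && i == 0 {
		if err := pprof.StartCPUProfile(f); err != nil {
			return err
		}
		defer pprof.StopCPUProfile()
	}
	work()
	return nil
}

func main() {
	f, err := os.CreateTemp("", "cpu-*.pprof")
	if err != nil {
		panic(err)
	}
	defer os.Remove(f.Name())
	if err := profileFirstIteration(f, 0, func() { fmt.Println("iteration 0 profiled") }); err != nil {
		panic(err)
	}
}
```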


// fastTaintingKubeClient tracks nodes that were artificially tainted in-memory to bypass
// the overhead of full API server round-trips and complex object tracking in the fake client.
type fastTaintingKubeClient struct {
Collaborator

What was the difference in results between this and the normal tainting logic?

I'm surprised that fake K8s client logic (and the fake CloudProvider logic for that matter) would be significant enough to impact the overall results. There's a fair bit of complexity in this benchmark to deal with that, I'm wondering if we can avoid it somehow (e.g. grab the profile from all loops to address variance; increase the parameters of the scenario so that binpacking is more expensive and dominates the whole loop more).

Contributor Author

The fake k8s client takes up a suspiciously large share of the CPU profile:

[CPU profile screenshot omitted]

I'm not sure it correctly represents the actual update logic, which in production would be just an API call; instead, it tries to simulate the outside world.

Moreover, it introduces some non-deterministic behaviour in the allocations:

                    │ bench_no_fast_tainting.txt │
                    │            B/op            │
RunOnceScaleDown-24                1.239Gi ± 19%

                    │ bench_no_fast_tainting.txt │
                    │         allocs/op          │
RunOnceScaleDown-24                  11.15M ± 8%
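The in-memory bookkeeping behind a client like fastTaintingKubeClient can be sketched without client-go as a mutex-guarded set of tainted node names; the type and method names below are illustrative, and the real implementation wires this into fake-client reactors.

```go
package main

import (
	"fmt"
	"sync"
)

// taintTracker records which nodes have received a ToBeDeleted-style taint,
// bypassing full fake-API-server round-trips for node updates.
type taintTracker struct {
	mu      sync.Mutex
	tainted map[string]bool
}

func newTaintTracker() *taintTracker {
	return &taintTracker{tainted: make(map[string]bool)}
}

// Taint marks a node as tainted in memory only.
func (t *taintTracker) Taint(node string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.tainted[node] = true
}

// IsTainted reports whether a node was previously tainted.
func (t *taintTracker) IsTainted(node string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	return t.tainted[node]
}

func main() {
	tr := newTaintTracker()
	tr.Taint("node-1")
	fmt.Println(tr.IsTainted("node-1"), tr.IsTainted("node-2"))
}
```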

testprovider.WithMinSize(0),
)

if _, err := cluster.KubeClient.CoreV1().Namespaces().Create(context.Background(), &corev1.Namespace{
Collaborator

nit: Which part of RunOnce() depends on this? You normally get the default namespace automatically, so IMO it'd make sense to have this centralized as part of the BUT framework somehow.

Contributor Author

It must have been my mistake. It's not needed; I removed that part completely.

cpu := int64(nodeCPU / podsPerNode)
mem := int64(nodeMem / podsPerNode)
pod := BuildTestPod(podName, cpu, mem, MarkUnschedulable())
if _, err := cluster.KubeClient.CoreV1().Pods("default").Create(context.Background(), pod, metav1.CreateOptions{}); err != nil {
Collaborator

I think the BUT framework gives you a more convenient way to manipulate the Pods and the Nodes at least, see https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/test/integration/synctest/template_test.go

Contributor Author

Replaced with K8s.AddPod from the framework.


func BenchmarkRunOnceScaleUp(b *testing.B) {
s := scenario{
setup: setupScaleUp(200),
Collaborator

200 Nodes * 50 PodsPerNode is not a very complex scenario. Maybe this is why the actual bottleneck logic isn't dominating the whole result and we need to make other parts more synthetic?

Have you evaluated higher values here? How did the results differ?

Contributor Author

I have run the benchmark for 1000 Nodes * 100 Pods (as nodes are usually limited to ~100 pods) and other scenarios, but the results and profiles did not yield any new findings. I decided to keep the scenario minimal enough to enable a benchmark presubmit job.

// This is for benchmarks that will only evaluate target size and
// want to avoid the overhead (and CPU profile noise) of adding
// nodes to the internal state of the fake cloud provider.
func (n *NodeGroup) NoOpIncreaseSize(delta int) error {
Collaborator

Ugh, it's pretty ugly to have an exported method specific to just a single benchmark here in the common CloudProvider fake. I don't see a good way around it though... If we're keeping it, WDYT about changing it slightly to SetTargetSize()/SetTargetSizeOnly()? In this form it seems like it could be useful for other tests more than in the current one.

@mtrqq @GaetanoMar96 @kawych Maybe you have a better idea here?

Contributor Author

I found the existing DecreaseTargetSize method, which only modifies the target size. It can be used as NoOpIncreaseSize by calling it with a negative delta. We should be fine with it for now, and it eliminates the need to extend the test package interface.
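The negative-delta trick can be illustrated with a toy node group; this is a sketch of the idea, not the fake cloud provider's actual type, which carries more state than a single integer.

```go
package main

import "fmt"

// nodeGroup is a toy stand-in for the fake cloud provider's NodeGroup; only
// the target size is modeled, matching the discussion above.
type nodeGroup struct {
	targetSize int
}

// DecreaseTargetSize adjusts only the recorded target size and never touches
// node objects, so calling it with a negative delta acts as a no-op-style
// size increase, keeping fake-provider bookkeeping out of the CPU profile.
func (n *nodeGroup) DecreaseTargetSize(delta int) {
	n.targetSize -= delta
}

func main() {
	ng := &nodeGroup{targetSize: 3}
	ng.DecreaseTargetSize(-2) // effectively "increase by 2"
	fmt.Println(ng.targetSize) // prints 5
}
```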

Updated MustCreateManager in the integration test package to accept testing.TB
instead of *testing.T. This allows the helper to be used within both standard
tests and performance benchmarks (which use *testing.B).

This change is a prerequisite for introducing performance benchmarking for
the RunOnce control loop.
@Choraden Choraden force-pushed the run_once_bench_v2 branch from ce198a3 to abe3907 Compare March 13, 2026 13:02
@Choraden
Contributor Author

Rebased on master.

This commit adds a new benchmarking suite in core/bench to evaluate the
performance of the primary Cluster Autoscaler control loop (RunOnce). These
benchmarks simulate large-scale cluster operations using a mock Kubernetes API
and cloud provider, allowing for comparative analysis and detection of
performance regressions.
Introduced a -profile-cpu flag to the RunOnce benchmarking suite. When
specified, the benchmark will capture a CPU profile during the first execution
of the RunOnce loop and write it to the provided file path.
Disable Garbage Collection during RunOnce benchmarks to ensure stable and
reproducible results. This prioritizes consistency over absolute performance
metrics, allowing for a generic way to calculate performance differences
between patches and providing a clean CPU profile for the RunOnce loop.
Introduce a no-op event recorder in RunOnce benchmarks to prevent event
dropping and potential performance side-effects. This change also
extends AutoscalerBuilder to support injecting custom AutoscalingKubeClients,
allowing for better control over the environment in performance-sensitive tests.
Introduce fastScaleUpCloudProvider and fastScaleUpNodeGroup in benchmarks
to avoid the overhead of simulating real node creation in the fake cloud
provider. This significantly reduces noise in CPU profiles when benchmarking
the core autoscaling logic, as it avoids unnecessary node object management
in the fake provider.

Added NoOpIncreaseSize to the fake NodeGroup to support this faster
scale-up simulation.
This change introduces fastTaintingKubeClient which uses reactors to
track and inject ToBeDeleted taints on nodes during the benchmark. This
allows the scale-down logic to correctly identify nodes that have been
marked for deletion by the autoscaler without relying on standard fake
client persistence for these taints.
This simply removes the fake client from the CPU profile.
@Choraden Choraden force-pushed the run_once_bench_v2 branch from abe3907 to 2c85ed6 Compare March 13, 2026 18:24