
cli: add SNP ID block annotations to Pods based on CPU requirements#2214

Open
daniel-weisse wants to merge 6 commits into main from
dw/cli-id-block-generation

Conversation

@daniel-weisse (Member):

  • Update reference value generation to create SNP reference values for up to 8 vCPUs
    • This can be adjusted at will, but since each CPU variation results in one more entry in the generated manifest, larger numbers will immensely blow up the size of the manifest
  • Update ID block generation to calculate ID blocks for up to 8 vCPUs
  • Embed ID block mappings in the CLI and annotate Pods during `contrast generate` with the ID blocks required for the requested CPU count

@daniel-weisse daniel-weisse added the changelog PRs that should be part of the release notes label Feb 26, 2026
@daniel-weisse daniel-weisse force-pushed the dw/cli-id-block-generation branch 7 times, most recently from 2287b59 to 6de728d on March 3, 2026 14:23
@daniel-weisse daniel-weisse marked this pull request as ready for review March 4, 2026 08:48
@daniel-weisse daniel-weisse force-pushed the dw/cli-id-block-generation branch 3 times, most recently from 3acb557 to 6fd606e on March 5, 2026 14:21
@daniel-weisse daniel-weisse requested a review from charludo March 9, 2026 14:35
@daniel-weisse daniel-weisse force-pushed the dw/cli-id-block-generation branch 2 times, most recently from 0304e90 to f63b942 on March 9, 2026 14:48
@burgerdev burgerdev self-assigned this Mar 10, 2026
@daniel-weisse daniel-weisse force-pushed the dw/cli-id-block-generation branch from f63b942 to 59f9a72 on March 10, 2026 10:11
Signed-off-by: Daniel Weiße <dw@edgeless.systems>
@daniel-weisse daniel-weisse force-pushed the dw/cli-id-block-generation branch from 59f9a72 to cf6cec9 on March 16, 2026 14:17
@msanft (Member) left a comment:

I think this will work, but I'm a little unsure about the current interface we expose to the user.

podLevelCPU := getCPUCount(spec.Resources)

// Convert milliCPUs to number of CPUs (rounding up), and add 1 for hypervisor overhead
totalMilliCPUs := max(regularContainersCPU, initContainersCPU, podLevelCPU)
Member:

I wonder if this matches the user's expectations, or what's done by non-Kata Kubernetes here.

Member:

What do you think may be unexpected about this formula? I pointed @daniel-weisse to #2272 for where it comes from.

Member:

The thing I was wary about is the round-up. With cgroups and CPU slices, this isn't something to worry about. But when a user moves some YAML that worked in their non-Contrast deployment to Contrast, we may try to use more CPUs than physically available due to this. I don't think this is a realistic scenario, though. LMK

Member:

Understood, thanks. We'll need to document this in https://docs.edgeless.systems/contrast/howto/workload-deployment/deployment-file-preparation#pod-resources before we consider this feature done, yes. I don't see what we could do to not round up, though, since fractional CPUs don't make sense for VMs.

Member:

Scheduler considerations might become interesting, though: I don't think there's a way to tell k8s via runtimeClass to round up the limits.

Member:

Do you have a concrete idea on how to proceed with this? I don't see what we could do either.

Member:

Just document it, recommending only integral CPU counts. If rounding does not change the number, there are no problems with unexpected counts or scheduling. But if the user decides to go against that recommendation, this code still does the right thing.

]
++ [
"panic=1"
"nr_cpus=1"
Member:

Should we just set this to the maximum value of 8 statically?
As per the docs, this is:

Maximum number of processors that an SMP kernel could support

We could also just omit this, as this number can also be resolved dynamically: https://elixir.bootlin.com/linux/v6.19.8/source/kernel/cpu.c#L3153-L3166

Collaborator:

Yes, there's a ticket for removing the param / Markus mentioned this as well yesterday. Thomas at one point suggested setting this to 32.

8 seems a bit low. I'm not super sure if the memory usage increase from removing the limit is a noticeable problem. I'm voting "remove".

Member:

From what I read in the docs, this setting is only relevant for VMs that might have CPUs hot-plugged during runtime, which is not the case for us.

@charludo (Collaborator), Mar 20, 2026:

No, but apparently setting this to a constant < the Kernel max will save some memory 🤷🏼‍♀️

Member:

Yeah, I think this is only relevant for memory pre-reservation. Perhaps also for hot-plugging, which we don't need to support anyway.

Member:

        nr_cpus=        [SMP] Maximum number of processors that an SMP kernel
                        could support.  nr_cpus=n : n >= 1 limits the kernel to
                        support 'n' processors. It could be larger than the
                        number of already plugged CPU during bootup, later in
                        runtime you can physically add extra cpu until it reaches
                        n. So during boot up some boot time memory for per-cpu
                        variables need be pre-allocated for later physical cpu
                        hot plugging.

I read this as "the kernel does the right thing for the number of CPUs plugged at boot; if you plan on plugging more later, set this to the number you're aiming for."

Collaborator:

Right, that is also how I understood this, with the caveat that not setting this defaults to the kconfig value of CONFIG_NR_CPUS, which in our case is currently 240. Hence the increased memory pre-reservation if the line is removed altogether.

Not arguing against removing this though, as I said above, I'm for it.

Member:

> I read this as "the kernel does the right thing for the number of CPUs plugged at boot; if you plan on plugging more later, set this to the number you're aiming for."

"n >= 1 limits the kernel to support 'n' processors". If you don't set it, you will be able to hot-plug up to CONFIG_NR_CPUS as Charlotte said. And this requires reservation of some memory.

Since we already limit this to 240 (I didn't know that when I commented 32 in the ticket), it should be okay.

msanft added 2 commits March 24, 2026 14:49
Not specifying `nr_cpus` on the command line
costs us marginal amounts of memory while saving
complexity in the TDX RTMR pre-calculation.
By dropping this from the command line, we make the
kernel fall back to the `CONFIG_NR_CPUS=240`
kconfig variable.
@msanft (Member) commented Mar 24, 2026:

@burgerdev, @charludo; Addressed my own feedback, PTAL.

@msanft msanft requested review from burgerdev and charludo March 24, 2026 13:57
@charludo (Collaborator) left a comment:

Fixup changes LGTM; I have not looked into the still-open conversation.


Labels

changelog PRs that should be part of the release notes


5 participants