feat(install): defer LUKS encryption to first boot #2096
andrewdunndev wants to merge 2 commits into bootc-dev:main
Replace the install-time cryptsetup/cryptenroll calls in the Tpm2Luks path with a first-boot encryption approach. At install time, the root filesystem is created 32MB smaller than the partition and a rd.bootc.luks.encrypt=tpm2 karg is added. On first boot, a dracut module runs cryptsetup reencrypt --encrypt to encrypt the root in-place, then enrolls TPM2 on real hardware with correct firmware state.

This eliminates both the IPC namespace semaphore deadlock (bootc-dev#2089) and the shim/PCR mismatch problem (bootc-dev#421).

Prior art: openSUSE disk-encryption-tool, which ships a production implementation of first-boot encryption using the same cryptsetup reencrypt --encrypt mechanism.

Fixes: bootc-dev#2089
Related: bootc-dev#421, bootc-dev#476, bootc-dev#477
Signed-off-by: Andrew Dunn <andrew@dunn.dev>
AI-Assisted: yes
AI-Tools: GitLab Duo, OpenCode
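The install-time space reservation described in this commit message comes down to a small piece of arithmetic. The following is a minimal sketch, not the code from the PR: the function names fs_size_bytes and ext4_block_count are hypothetical, the 4096-byte default block size is an assumption, and the reserve is taken to be 32 MiB.

```shell
#!/usr/bin/env bash
# Sketch only: compute a filesystem size that leaves room for the LUKS2 header.
# RESERVE_BYTES mirrors the 32M reserve from the PR; treating it as 32 MiB
# is an assumption of this sketch.
RESERVE_BYTES=$((32 * 1024 * 1024))

# fs_size_bytes PARTITION_BYTES
# Prints the filesystem size in bytes, or fails if the partition is too small.
fs_size_bytes() {
    local part_bytes=$1
    if [ "$part_bytes" -le "$RESERVE_BYTES" ]; then
        echo "partition too small for the ${RESERVE_BYTES}-byte reserve" >&2
        return 1
    fi
    echo $((part_bytes - RESERVE_BYTES))
}

# ext4_block_count PARTITION_BYTES [BLOCK_SIZE]
# Prints the block count for an ext4 filesystem of the reduced size.
# BLOCK_SIZE defaults to 4096 (an assumption).
ext4_block_count() {
    local size block_size=${2:-4096}
    size=$(fs_size_bytes "$1") || return 1
    echo $((size / block_size))
}
```

On a real system the partition size could be read with blockdev --getsize64 on the device. The byte value would feed the XFS -d size= and btrfs -b options named in the PR, and the block count would feed the ext4 block count argument.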
Code Review
This pull request introduces a new first-boot LUKS encryption mechanism for bootc. It defers the actual encryption and TPM2 enrollment of the root partition from installation time to the first boot, addressing issues with TPM2 enrollment in the install environment. This is achieved by reserving space for the LUKS header during installation, adding a new bootc-luks-firstboot.sh script and systemd service to the initrd, and updating the dracut module to include necessary components. Review comments highlight a critical flaw in the idempotency logic of the first-boot encryption script, which could leave the system unbootable if interrupted, and suggest a more robust method for parsing kernel command-line arguments.
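The more robust kernel command-line parsing suggested in this review could look roughly like the sketch below. It is illustrative only: get_karg is a hypothetical helper name, and the optional file argument exists so the function can be exercised without touching /proc/cmdline. It requires bash, since read -a is not POSIX.

```shell
#!/usr/bin/env bash
# get_karg KEY [CMDLINE_FILE]
# Prints the value of the last KEY=VALUE argument and returns 0,
# or returns 1 if KEY is not present. Hypothetical helper, not from the PR.
get_karg() {
    local key=$1 file=${2:-/proc/cmdline}
    local -a args=()
    local arg value="" found=1
    # read -r -a splits on whitespace into an array without glob expansion
    # or backslash processing. "|| true" tolerates a missing final newline.
    read -r -a args < "$file" || true
    for arg in "${args[@]}"; do
        case $arg in
            "$key"=*)
                value=${arg#*=}   # strip everything up to the first "="
                found=0
                ;;
        esac
    done
    [ "$found" -eq 0 ] && printf '%s\n' "$value"
    return "$found"
}
```

Reading into an array avoids the pathname expansion that unquoted word-splitting can trigger. A quoted argument that contains spaces would still be split at the spaces, so this approach is a partial improvement rather than a full command-line parser.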
Fix critical idempotency issue: if the script was interrupted after encrypt_root but before configure_system, the next boot would see the device as already LUKS and exit without updating BLS entries, leaving the system unbootable. Now configure_system always runs when the karg is present, regardless of encryption state.

Also switch cmdline parsing from word-splitting to array-based approach (read -r -a) for robustness against arguments with spaces.

Signed-off-by: Andrew Dunn <andrew@dunn.dev>
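The control flow this fix describes can be sketched as follows. Only the names encrypt_root and configure_system come from the commit message. firstboot_flow, is_luks, and FAKE_IS_LUKS are hypothetical, and the three helper bodies are dry-run stand-ins, not the script's real logic.

```shell
#!/usr/bin/env bash
# Dry-run stand-ins (assumptions of this sketch). In a real script they would
# wrap cryptsetup isLuks, cryptsetup reencrypt --encrypt, and the
# crypttab/BLS updates respectively.
is_luks()          { [ "${FAKE_IS_LUKS:-no}" = yes ]; }
encrypt_root()     { echo "would encrypt $1 in place"; }
configure_system() { echo "would write crypttab and BLS kargs for $1"; }

# firstboot_flow DEVICE KARG_PRESENT(yes|no)
firstboot_flow() {
    local dev=$1 karg_present=$2
    # Without the karg there is nothing to do.
    [ "$karg_present" = yes ] || return 0
    # Encrypt only if the device is not already LUKS.
    if ! is_luks "$dev"; then
        encrypt_root "$dev" || return 1
    fi
    # Always (re)configure, so an interruption between the two steps
    # is repaired on the next boot.
    configure_system "$dev"
}
```

The key design choice is that configuration is not nested inside the "not yet encrypted" branch, so a boot that finds an already-encrypted device still finishes the remaining setup.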
cgwalters left a comment
Right now bootc is generally more of a mechanism that tries to be flexible and not hardcode much policy. There are a lot of opinions in this code, and things that people would want to configure, that I think are just out of scope. For example, waiting to allow the user to capture the recovery key.
Some use cases will want a lot more than just --tpm2-device=auto like PCR configuration.
Almost all load-bearing code in bootc is in Rust, and I have had very painful experiences debugging complex bash code in the initramfs and would like to not ever do that again.
I think ultimately this type of logic really does need to live on the operating system side. systemd has done a great job in shipping configurable tools for this LUKS setup. It was my mistake in wrapping one small subset of this in tpm2-luks for bootc.
Also, this whole topic very strongly intersects with composefs DPS support, and I would like to try to preserve that. I have some PoC patches in that respect.
We will try to work on this problem domain, but it's security related and needs some care. At least for me, a near term focus will be on using Ignition for this, because it's already supported there. If your use case can e.g. derive from Fedora CoreOS and use Ignition, that would likely help.
That said, AFAICS, if you are using to-disk, the main thing that would be helpful to do here in bootc is leave an opinionated gap at install time. Nothing blocks you from shipping this bash code in an initramfs in your derived image today, right?
Thanks for the detailed response. This is helpful context. A few things:

On mechanism vs policy: Understood. The script is opinionated and the right place for this logic is in the image, not in bootc. We'll ship the firstboot encryption as a custom dracut module in our derived image.

On composefs + DPS: We're actively evaluating composefs mode 2 for our estate (Fedora 42 + systemd-boot + ZFS + NVIDIA). We'd be interested in the DPS/repart direction for disk encryption -- is the idea that

On Ignition: Our setup is bare metal homelab, not CoreOS-derived. We use pyinfra for provisioning. But the pattern of "install plain, encrypt on first boot" is what we're after regardless of the provisioning tool.

On shipping it ourselves: Yes, nothing blocks us from including the dracut module in our image today. We'll do that. If the repart/DPS direction matures to where bootc can hand off disk encryption to systemd-repart, we'd happily switch to that.

We'll close this PR. The DM_DISABLE_UDEV fix in #2090 is separate and still relevant for the existing tpm2-luks path.
Summary
Defer LUKS encryption to first boot instead of running cryptsetup inside the install container. This eliminates the IPC namespace semaphore deadlock (#2089) and the shim/PCR mismatch problem (#421) in one change, since TPM2 binding happens on real hardware with real firmware state.
Prior art: openSUSE disk-encryption-tool, which ships a production implementation of first-boot encryption using the same cryptsetup reencrypt --encrypt mechanism.
Problem
bootc install to-disk --block-setup tpm2-luks runs cryptsetup luksFormat, systemd-cryptenroll, and cryptsetup luksOpen inside the install container. This has two problems:
1. IPC namespace deadlock (#2089, "install to-disk --block-setup tpm2-luks hangs: libdevmapper udev cookie semaphore deadlock in container IPC namespace"): libdevmapper uses SysV semaphores to coordinate with udevd. Inside a container with an isolated IPC namespace, udevd on the host cannot see the container's semaphores, causing luksOpen and luksClose to hang on semop().
2. Shim/PCR mismatch (#421, "install to-disk with LUKS + TPM broken"): TPM2 enrollment during install binds to the container's firmware state, not the installed system's firmware. On first real boot, the PCR values differ and auto-unlock fails.
Approach
Install time: Write an unencrypted root partition with the filesystem created 32MB smaller than the partition. Add rd.bootc.luks.encrypt=tpm2 to the kernel command line. No cryptsetup calls, no devmapper, no TPM2.
First boot: A dracut module (51bootc) installs a systemd service that runs before sysroot.mount. The service calls cryptsetup reencrypt --encrypt --reduce-device-size 32M to encrypt the root partition in-place using the reserved 32MB for the LUKS2 header. It then enrolls TPM2 via systemd-cryptenroll, writes /etc/crypttab, and updates the BLS entry with rd.luks.uuid/rd.luks.name kargs.
The root=UUID=<ext4-uuid> karg does not change. Once systemd-cryptsetup unlocks LUKS on subsequent boots, the ext4 UUID inside becomes visible and root= resolves normally.
Changes
crates/lib/src/install/baseline.rs
- Tpm2Luks arm: remove all cryptsetup/cryptenroll calls
- mkfs_with_reserve() that creates the filesystem smaller than the partition (ext4: block count arg, XFS: -d size=, btrfs: -b)
- rd.bootc.luks.encrypt=tpm2 karg instead of luks.uuid
- luks_device = None (no luksClose needed)
crates/initramfs/dracut/module-setup.sh
- bootc-luks-firstboot.sh and .service into the initramfs
- dm_crypt kernel module
- sysroot.mount.requires symlink for service ordering
crates/initramfs/luks-firstboot/bootc-luks-firstboot.sh (new)
- rd.bootc.luks.encrypt karg from /proc/cmdline
- cryptsetup isLuks before encrypting
- cryptsetup reencrypt --encrypt --reduce-device-size 32M
- systemd-cryptenroll --tpm2-device=auto
- systemd-cryptenroll --recovery-key
- /etc/crypttab and BLS entries
crates/initramfs/bootc-luks-firstboot.service (new)
- Before=sysroot.mount in the initrd
- ConditionKernelCommandLine=rd.bootc.luks.encrypt
- OnFailure=emergency.target (drops to shell on failure)
Makefile
- /usr/lib/bootc/
Testing
Full end-to-end validation on GCP n2-standard-8 with nested KVM, Fedora 42 (cryptsetup 2.8.4, systemd 257.11, QEMU 9.2.4 + OVMF + swtpm).
Build verification
- cargo check
- cargo build --release
- cargo clippy -p bootc-lib
Install verification
Installed with patched bootc binary via bootc install to-disk --block-setup tpm2-luks --filesystem ext4 to a 20GB disk.
- rd.bootc.luks.encrypt=tpm2 in BLS entry
Encryption verification
Manually encrypted the installed root partition using the same cryptsetup reencrypt --encrypt --reduce-device-size 32M command the first-boot script uses.
- e2fsck clean after encryption
- crypttab written to ostree deploy
- rd.luks.uuid
Boot verification
Booted the encrypted system in QEMU with swtpm (vTPM 2.0). The initramfs was patched to include the first-boot dracut module with systemd-cryptsetup support.
- systemd-cryptsetup@cr_root.service started
- /dev/mapper/cr_root device created
- sysroot.mount succeeded (ostree root)
Serial console output (key lines):
Encryption mechanism validation (independent)
Tested cryptsetup reencrypt --encrypt --reduce-device-size 32M independently across multiple filesystem types and sizes.
Additional mechanism tests:
- cryptsetup isLuks detects existing LUKS, reencrypt --encrypt rejects already-encrypted devices ("Device is already LUKS device. Aborting.")
Not tested (requires project CI)
Fixes: #2089
Related: #421, #476, #477
Signed-off-by: Andrew Dunn <andrew@dunn.dev>
AI-Assisted: yes
AI-Tools: GitLab Duo, OpenCode
AI-Generated Content Disclosure: This PR contains code generated with assistance from GitLab Duo and OpenCode. The output has been reviewed for correctness, tested, and validated against project requirements per GitLab's AI contribution guidelines.