Measured boot attestation with UKI policy by phaer · Pull Request #35 · phaer/nixos-android-builder

phaer · 2026-03-25T23:17:09Z

Switch from raw PCR digest comparison to UEFI event log validation. A custom UKI keylime policy parses the TPM's event log and checks individual events instead of comparing PCR hashes. This means firmware config changes (boot order, BIOS settings) no longer break attestation, because they can be ignored, and we have a better chance of understanding why something changed if it does.

Keylime patches:

keylime/keylime#1878 (elparsing: check tpm2_eventlog exit code instead of stderr): fix the tpm2_eventlog exit-code check (UKI EV_IPL warnings were breaking attestation)
keylime/keylime#1879 (tpm: use only policy's relevant PCRs for event log verification): use get_relevant_pcrs() for event-log replay (avoids spurious PCR 9/11 failures)
keylime/keylime#1880 (verifier: dual ORM mapping on mbpolicies/verifiermain causes stale policy reads and silent accept_attestations no-op): bypass the SQLAlchemy dual-mapping cache for uefi_ref_state and accept_attestations (two patches)
keylime/rust-keylime#1223 (fix push-mode: cache UEFI event log bytes at startup to avoid post-privilege-drop failures): agent caches the UEFI event log bytes at startup (push model)

Other:

nixpkgs bump to nixos-unstable + systemd 259 fixes (TPM2 LUKS, PCR 9 NvPCR)
Restore explicit PCR 7 binding for LUKS partitions
Removed old pcr-policy package (superseded)

Trust model

The agent self-reports its event log on first boot (TOFU). After that, the verifier replays the log against the refstate and validates the TPM quote every cycle. PCRs 9 and 11 are NOT in the event log replay (because systemd-tpm2-setup and systemd-pcrphase extend them at runtime), but PCR 11 is still in the TPM quote, while PCR 9 should be redundant in our case (UKI boot). Value consistency is enforced across attestations.
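The replay step in the trust model can be illustrated with a minimal sketch. It is not keylime's implementation: the function names are hypothetical, events are reduced to `(pcr_index, digest)` pairs, and every PCR is assumed to start at all zeros (which does not hold for every PCR or locality on real hardware). The only part taken as given is the TPM extend rule, `new = H(old || digest)`.

```python
import hashlib


def extend(pcr: bytes, digest: bytes, alg: str = "sha256") -> bytes:
    """TPM extend operation: new PCR value = H(old PCR value || event digest)."""
    return hashlib.new(alg, pcr + digest).digest()


def replay(events, relevant_pcrs, alg: str = "sha256") -> dict:
    """Replay (pcr_index, digest) events, but only for the PCRs the policy cares about."""
    size = hashlib.new(alg).digest_size
    # Assumption: all replayed PCRs start zero-initialised.
    pcrs = {index: bytes(size) for index in relevant_pcrs}
    for pcr_index, digest in events:
        if pcr_index in pcrs:
            pcrs[pcr_index] = extend(pcrs[pcr_index], digest, alg)
    return pcrs


def mismatches(replayed: dict, quoted: dict) -> list:
    """Return the sorted PCR indices whose replayed value differs from the quoted one."""
    return sorted(i for i, value in replayed.items() if quoted.get(i) != value)
```

With PCRs 9 and 11 left out of `relevant_pcrs`, runtime extensions to those registers cannot cause a replay mismatch, which is the effect the description attributes to `get_relevant_pcrs()`.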

Add logLevelOverrides option to suppress noisy per-request INFO logging from keylime.web and keylime.authorization.manager.

Add libefivar for event log enrichment, enable the measured boot policy, send refstate from agent to auto-enrollment server, and update tests for measured boot enrollment.

Custom MBA policy for UKI boot chains (systemd-boot + UKI). Validates SCRTM/firmware (PCR 0), Secure Boot keys (PCR 7), UKI application digest (PCR 4), and UKI PE section measurements (PCR 11). Accepts expected variability in PCR 1. Includes create-uki-refstate tool to generate the reference state from a binary UEFI event log.

Add measuredBootPolicyPath option to specify the directory containing the MBA policy Python module.

Extract event log parsing into a shared Python library.

Rename packages for consistency:
- keylime-uki-policy -> keylime-measured-boot-policy
- pcr-policy -> measured-boot-state

Add libefivar to measure-boot-state wrapper for device path decoding.

Replays the UEFI event log, compares PCRs against the TPM, and diffs refstates to diagnose attestation mismatches.

18 policy tests + 21 library tests, run as nix flake checks.

Apply 4 upstream patches (3 keylime, 1 rust-keylime):
- Check tpm2_eventlog exit code instead of stderr
- Use policy's get_relevant_pcrs() for PCR replay
- Bypass ORM cache for uefi_ref_state
- Cache UEFI event log bytes at agent startup

Set measured_boot_evaluate=always and add negative attestation test (tampered UKI digest is rejected).

Drop cached measured boot reports for agents that are no longer registered. Without this, removing an agent and rebooting with a new image can race: the daemon re-enrolls with the old refstate before the fresh report arrives, leaving the agent stuck with a wrong UKI digest.
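A minimal sketch of the pruning idea, assuming the cache is a mapping from agent UUID to its last measured boot report. The function name and data shapes are hypothetical and are not taken from the daemon's code.

```python
def prune_reports(cached_reports: dict, registered_uuids) -> dict:
    """Keep only cached reports whose agent is still registered.

    Dropping the rest means a re-registered agent cannot be enrolled
    with a refstate left over from a previous image.
    """
    registered = set(registered_uuids)
    return {uuid: report for uuid, report in cached_reports.items() if uuid in registered}
```

Running this before each enrollment pass would force the daemon to wait for a fresh report from a re-registered agent, under the assumption that reports are only ever added for currently registered agents.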

Commands:
- status: list agents with registrar/verifier enrollment state
- inspect: show detailed agent info including refstate summary
- remove: delete agent from verifier and registrar (or 'all')

Reads server address from attestation-server.json, client certs from keys/keylime/. Available as 'nix run .#attestation-ctl' and in the dev shell.

systemd services (systemd-tpm2-setup, systemd-pcrphase) extend PCRs from userspace via the TSS2 library. These extensions are not in the UEFI event log but are recorded in /run/log/systemd/tpm2-measure.log. Add parse_userspace_log() to the measured boot library and include those events in replay_pcrs() so that PCR 9 and 11 replays match the actual TPM state.

tpm2_eventlog 5.7 warns about EV_IPL events in PCR 11 because its verify_digests() has no case for PCR 11 (only 8, 9, 12, 14). Upstream master (b25c9220) adds the missing case. No release since 5.7, so we build from git with the bootstrap step inlined.

Define all custom packages once in packages/default.nix and thread them through _module.args.customPackages instead of repeating callPackage in every module and test. Ensures a single tpm2-tools in the system closure without using a nixpkgs overlay.

Show PASS/FAIL/TIMEOUT/PENDING based on consecutive_attestation_failures, last_successful_attestation, and attestation_count directly. The verifier's operational_state can be stale after agent reboots due to ORM commit issues. Also add LAST OK and attestation count columns to the status table.
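One way such a status could be derived from the three fields is sketched below. The commit message names the inputs but not the exact rules, so the ordering of the checks, the `timeout_s` threshold, and the function name are assumptions made for illustration only.

```python
import time
from typing import Optional


def attestation_status(
    attestation_count: int,
    consecutive_attestation_failures: int,
    last_successful_attestation: Optional[float],
    now: Optional[float] = None,
    timeout_s: float = 10.0,  # assumed threshold, not taken from the source
) -> str:
    """Derive a display status without consulting operational_state."""
    now = time.time() if now is None else now
    if attestation_count == 0:
        return "PENDING"  # enrolled, but nothing attested yet
    if consecutive_attestation_failures > 0:
        return "FAIL"
    if last_successful_attestation is None or now - last_successful_attestation > timeout_s:
        return "TIMEOUT"  # the last success is too old to trust
    return "PASS"
```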

Add 'save' subcommand that snapshots the current refstate to /var/lib/keylime/saved-refstate.json (the persistent encrypted partition). 'diagnose' now auto-detects a saved refstate when run without --refstate. Remove the 'diff' subcommand — its functionality is now available via 'diagnose old.json new.json' with two positional arguments. diagnose already handled the offline case (no TPM sysfs) gracefully.

Show the save-then-diagnose workflow, document auto-detection of saved refstates, and replace the removed 'diff' subcommand with 'diagnose old.json new.json'.

Use db_manager.session() directly instead of session_context() in the uefi_ref_state property. session_context() calls commit() on exit, which flushes pending EvidenceItemMapping objects on the shared scoped session, causing SQLAlchemy identity map conflicts.

The verifier maps both `verifiermain` and `mbpolicies` via two independent SQLAlchemy ORM classes:
- keylime.db.verifier_db (declarative_base, own MetaData), used by push_agent_monitor and cloud_verifier_tornado
- keylime.models.verifier (model framework, own registry), used by tpm_engine and the push-mode attestation flow

Each has its own identity map and change tracking; SQLAlchemy has no way to synchronize them. Writes through one mapping are invisible to the other's cached instances, breaking both reads and writes that cross the boundary.

Replace the previous narrowly-scoped 0003 with two patches that issue raw SELECT/UPDATE bypassing both ORM layers:
- 0003: read mb_policy via raw SELECT (fixes stale policy after tenant DELETE+CREATE re-enrollment).
- 0004: write accept_attestations via raw UPDATE (fixes push-mode timeout recovery; previously the column was silently dropped from the ORM UPDATE because the model framework's loaded state still showed True after push_agent_monitor flipped it to False via the legacy mapping).

The proper upstream fix is to consolidate to a single mapping per table; until then this is the only mechanism that crosses the mapping boundary.
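The failure mode can be reproduced in miniature without SQLAlchemy. The sketch below is a toy model, not keylime code: `CachingMapping` is a hypothetical stand-in for one ORM mapping with a private identity map and naive change tracking, and the one-column table layout is an assumption chosen to mirror the names in the commit message.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory table with one agent that accepts attestations."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE verifiermain (agent_id TEXT PRIMARY KEY, accept_attestations INTEGER)"
    )
    conn.execute("INSERT INTO verifiermain VALUES ('a1', 1)")
    return conn


def raw_select(conn: sqlite3.Connection, agent_id: str) -> bool:
    """Read the column straight from the database, bypassing any cache."""
    row = conn.execute(
        "SELECT accept_attestations FROM verifiermain WHERE agent_id = ?", (agent_id,)
    ).fetchone()
    return bool(row[0])


def raw_update(conn: sqlite3.Connection, agent_id: str, value: bool) -> None:
    """Write the column unconditionally, bypassing any change tracking."""
    conn.execute(
        "UPDATE verifiermain SET accept_attestations = ? WHERE agent_id = ?",
        (int(value), agent_id),
    )


class CachingMapping:
    """Toy mapping: caches loaded values and skips writes it believes are no-ops."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.identity_map = {}

    def load(self, agent_id: str) -> bool:
        if agent_id not in self.identity_map:
            self.identity_map[agent_id] = raw_select(self.conn, agent_id)
        return self.identity_map[agent_id]

    def save(self, agent_id: str, value: bool) -> bool:
        """Return True if an UPDATE was issued, False if it was skipped as unchanged."""
        if self.load(agent_id) == value:
            return False
        self.identity_map[agent_id] = value
        raw_update(self.conn, agent_id, value)
        return True
```

If one `CachingMapping` has already loaded `True` and a second one then saves `False`, a later `save(..., True)` on the first is skipped because its cached state still shows `True`, so the row stays `False`. Calling `raw_update` directly is the only call in this toy that changes the row in that situation, which is the shape of the workaround the two patches take.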

Stop the agent long enough for push_agent_monitor to fire its timeout (quote_interval x 5 = 10s with the test config), verify the verifier marks the agent FAIL via attestation_status, then restart the agent and verify it recovers to PASS within the recovery window. Asserts attestation_count > baseline so a stale PASS reading cannot satisfy the check. This guards against the dual-mapping write bug fixed by the keylime 0004 patch (push-mode self-healing was silently broken upstream — count grew but accept_attestations stayed False forever). Without the fix the test fails at the recovery assertion; with the fix it passes in ~17 seconds.

The existing reboot loop only checked that the verifier still had an enrollment record for agent_uuid, which is a weak signal — the record persists regardless of what the agent does after reboot. Query the registrar after each reboot and assert that results.uuids equals exactly [agent_uuid]. If the swtpm's EPS ever regenerates (state directory lost, TPM2_Clear, etc.) the agent would re-register under a new EK-derived UUID and the registrar would contain two entries, failing the assertion.

The upstream PR branches for the elparsing and tpm-relevant-pcrs fixes have been rebased onto master commits newer than v7.14.1 and have grown test coverage since we first packaged them. Pin keylime to master commit 4c2a0c6ca84c ("Switch from CA organization of MITLL to Keylime", 30 commits past v7.14.1) — the common base of all current branches — and regenerate all four local patches against it. The regenerated patches now include the test additions upstream grew in the meantime (test/test_mba_parsing.py, test/test_tpm_check_pcrs.py, test/test_tpm_engine.py) and the expanded scope of the tpm-relevant-pcrs fix that now threads mb_policy_name through cloud_verifier_common, cloud_verifier_tornado, da/attest and verification/tpm_engine. Both VM tests (keylime and keylime-auto-enroll) still pass on the new base, including the push-mode timeout recovery subtest.

fix/uefi-log-privileged-fd was rebased upstream and the resulting commit now touches 6 files (adding plumbing through attestation.rs, main.rs, state_machine.rs) instead of the 3 in our original snapshot. Regenerate against commit 0d63e3b from the current branch tip. The fetched source tag (v0.2.9) is unchanged; only the patch itself expands in scope.

Link each local patch to its upstream PR (or tracking issue) so the provenance of each is obvious from the package files alone:
- 0001 elparsing → keylime/keylime#1878
- 0002 tpm-relevant-pcrs → keylime/keylime#1879
- 0003/0004 dual-mapping → keylime/keylime#1880 (issue, no PR yet)
- keylime-agent 0001 → keylime/rust-keylime#1223

systemd 259's new NvPCR support adds a runtime PCR 9 extension in systemd-tpm2-setup.service (an anchoring measurement for the bundled hardware.nvpcr and cryptsetup.nvpcr definitions) that is not captured in the UEFI event log. This makes the event-log-vs-live-PCR replay check fail on every boot and breaks measured boot attestation.

Exclude PCR 9 from relevant_pcr_indices, the set used by keylime's mb_pcrs_to_check() to decide which PCRs to replay-check. The event level policy checks still run against whatever PCR 9 events are in the UEFI event log, and the security critical content of PCR 9 (the UKI image) is already pinned via uki_digest in PCR 4.

This mirrors how we already handle PCR 11, which is runtime extended by systemd-pcrphase with boot phase strings. Both PCRs share the same problem (userspace extensions invisible to the UEFI event log) and now share the same fix.

Alternative fixes considered and rejected:
A. Masking the systemd shipped .nvpcr files via /etc/nvpcr/ dead symlinks. Works, but brittle: any future .nvpcr file added by upstream systemd silently re-breaks attestation.
B. Capturing runtime extensions in the refstate via userspace_digests and patching the verifier to account for them during replay. Most principled, but the NvPCR anchor measurement is per-host (derived from a local secret), which would defeat image-wide attestation and force per-host refstates.
C. Pinning the PCR 9 value via tpm_policy. Same per-host problem.
D. Disabling systemd-tpm2-setup.service entirely. Would also break TPM2 bound LUKS unlock, which we rely on for /var/lib/keylime and /var/lib/credentials.

systemd 259 changed how tpm2_get_best_pcr_bank() selects the PCR hash algorithm: it now reads the LoaderTpm2ActivePcrBanks EFI variable (written by the UEFI boot manager via GetActivePcrBanks()). Without TPM2 support compiled into OVMF, GetActivePcrBanks() returns 0, causing systemd to log 'Firmware reports neither SHA1 nor SHA256 PCR banks, cannot operate.' and fail every TPM2 unseal with EOPNOTSUPP.

Fix the installer VM test by switching from the default OVMF (no TPM support) to OVMF.override { tpmSupport = true; }. This enables the -D TPM2_ENABLE OVMF build flag so GetActivePcrBanks() correctly reports the active PCR banks to userspace. The integration VM test already used OVMFFull, which has tpmSupport = true.

With OVMF TPM support enabled, systemd-tpm2-setup-early (gated on ConditionSecurity=measured-uki, satisfied because the stub can now extend PCRs) creates the ECC SRK at 0x81000001, and tpm2_get_best_pcr_bank() succeeds, so no manual SRK provisioning is needed.

systemd 258 bound tpm2-encrypted repart partitions to PCR 7 by default. systemd 259 changed the default to an empty policy (no PCR restrictions), silently dropping the secure-boot binding and allowing any holder of the TPM to unseal the partitions. Make the policy explicit with TPM2PCRs=7, consistent with how individual credentials inside the partition are encrypted (--tpm2-pcrs=7 in credential-storage.nix).

mainly to test a newer kernel on flaky test hardware

…ate flake check

Replace the separate libraryTests flake check with pytestCheckHook in the package's checkPhase. This is more idiomatic for nixpkgs Python packages and ensures tests run on every build, not just when explicitly checked.
- Enable doCheck (was false) and add pytestCheckHook to nativeCheckInputs
- Remove libraryTests from unit-tests.nix and tests/default.nix
- Update README to reflect the new testing approach

… form

Replace the four pairs of hardcoded UEFI GUID strings with a helper that computes the mixed-endian form by byte-reversing the first three fields of the standard GUID. Each GUID is now defined once in the standard UEFI form; the mixed-endian variant (as seen in some tpm2_eventlog output) is derived automatically. This eliminates the risk of copy-paste errors between the two forms and makes it easier to add new GUIDs in the future. No Python efivar binding exists in nixpkgs, so a lightweight helper is preferable to adding a C library dependency.
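A helper of this kind could look like the sketch below. The function name is hypothetical and the actual helper in the library may differ; the example GUID is the well-known EFI global variable GUID.

```python
import uuid


def mixed_endian_guid(guid: str) -> str:
    """Byte-reverse the first three fields of a GUID given in standard text form."""
    # uuid.UUID validates the input and normalises it to lowercase 8-4-4-4-12 form.
    fields = str(uuid.UUID(guid)).split("-")

    def reverse_bytes(hex_field: str) -> str:
        return bytes.fromhex(hex_field)[::-1].hex()

    swapped = [reverse_bytes(f) for f in fields[:3]]
    # The last two fields are plain byte arrays and keep their order.
    return "-".join(swapped + fields[3:])


EFI_GLOBAL_VARIABLE = "8be4df61-93ca-11d2-aa0d-00e098032b8c"
EFI_GLOBAL_VARIABLE_MIXED = mixed_endian_guid(EFI_GLOBAL_VARIABLE)
# EFI_GLOBAL_VARIABLE_MIXED == "61dfe48b-ca93-d211-aa0d-00e098032b8c"
```

Because the transformation is its own inverse, the same helper converts in both directions, so a lookup table only needs the standard form.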

Replace the runCommand that copies a single .py file with a buildPythonPackage using pyproject.toml. This enables:
- Proper dependency management via setuptools
- Tests via pytestCheckHook (replaces the manual PYTHONPATH wiring in the policyTests flake check)
- Standard Python packaging conventions (pyproject.toml, etc.)

The policyPath output still provides a directory suitable for the verifier's PYTHONPATH, now pointing at the package's site-packages. Pass the custom keylime package through keylime-shared.nix so the policy tests can import from keylime.mba.elchecking.

…tations

- user-guide.md: enrollment diagram said 'PCRs 0,1,2,3,7,11' but the daemon passes --mb_refstate with no explicit PCR list; the uki policy determines replay via get_relevant_pcrs() = {0,1,2,3,4,5,7}. PCR 11 is excluded from replay; PCRs 4 and 5 were missing. Replace with '--mb_refstate, uki policy' which is accurate and stable.
- keylime-auto-enroll.nix: same fix in the module header comment.
- docs.md: remove the 'measured boot & attestation' bullet from Limitations and Further Work; it described the current working implementation, not an outstanding gap. The feature is fully covered in the 'Remote Attestation (Keylime)' section.

Add dispatcher entries for event types seen on real hardware that were missing from the policy, causing attestation to fail with 'unexpected (PCRIndex, EventType) combination':
- EV_POST_CODE in PCR 0 and PCR 2: older TCG type used by some firmware for POST code and option ROM measurements. Both PCRs are in relevant_pcr_indices so the quote comparison covers integrity.
- EV_EFI_PLATFORM_FIRMWARE_BLOB{,2} in PCR 2: some firmware measures UEFI drivers here rather than PCR 0. Routed to the same platform_firmware_blobs collector as PCR 0 events, consistent with measure-boot-state which collects from all PCRs.
- EV_EFI_VARIABLE_BOOT2 in PCR 1: newer UEFI spec variant of EV_EFI_VARIABLE_BOOT; treated the same as existing PCR 1 handlers.
- EV_EFI_ACTION in PCR 6: PCR 6 is not in relevant_pcr_indices and is absent from tpm_policy, so the handler prevents the dispatcher from rejecting events without providing an end-to-end integrity guarantee.
- EV_SEPARATOR extended from range(8) to range(16): some firmware emits separators for PCRs beyond 7 to mark the end of each measurement phase.
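The dispatch-by-key pattern behind these entries can be sketched as follows. This is a simplified stand-in written for illustration, not keylime's dispatcher class or the policy's real handlers: `Dispatcher`, `build_dispatcher`, and the accept-all placeholder handler are hypothetical, and events are assumed to be dicts carrying `PCRIndex` and `EventType` keys.

```python
from typing import Callable, Dict, List, Tuple

Event = dict
Handler = Callable[[Event], None]


class Dispatcher:
    """Route each event to the handler registered for its (PCRIndex, EventType)."""

    def __init__(self) -> None:
        self._handlers: Dict[Tuple[int, str], Handler] = {}

    def register(self, pcr_index: int, event_type: str, handler: Handler) -> None:
        self._handlers[(pcr_index, event_type)] = handler

    def dispatch(self, event: Event) -> None:
        key = (event["PCRIndex"], event["EventType"])
        handler = self._handlers.get(key)
        if handler is None:
            # Unknown combinations are rejected rather than silently ignored.
            raise ValueError(f"unexpected (PCRIndex, EventType) combination: {key}")
        handler(event)


def accept(event: Event) -> None:
    """Placeholder handler: tolerate the event without checking its content."""


def build_dispatcher(platform_firmware_blobs: List[Event]) -> Dispatcher:
    d = Dispatcher()
    for pcr in (0, 2):
        d.register(pcr, "EV_POST_CODE", accept)
        # Firmware blobs from PCR 0 and PCR 2 go to the same collector.
        for blob_type in ("EV_EFI_PLATFORM_FIRMWARE_BLOB", "EV_EFI_PLATFORM_FIRMWARE_BLOB2"):
            d.register(pcr, blob_type, platform_firmware_blobs.append)
    d.register(1, "EV_EFI_VARIABLE_BOOT2", accept)
    d.register(6, "EV_EFI_ACTION", accept)
    for pcr in range(16):
        d.register(pcr, "EV_SEPARATOR", accept)
    return d
```

Keeping the table strict (anything unregistered raises) is what makes new firmware behaviour show up as an explicit policy failure instead of passing unnoticed.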

systemd puts the console in UTF-8 mode at boot; the kernel then maps Unicode code points through the font's Unicode table. The default VGA ROM font has no entries for U+2500+, so those glyphs render as '?'. ter-v16n (Terminus) includes the full box-drawing range and is a standard VGA-compatible bitmap font suitable for the Linux console.

phaer changed the title ~~Mb policy~~ Measured boot attestation with UKI policy Mar 25, 2026

phaer added 29 commits April 13, 2026 14:28

keylime: add per-logger log level overrides

bf409b4


keylime: enable measured boot attestation with uki policy

0b70ed7


docs: document measured boot attestation and uki policy

61dcec0

keylime: make measured boot policy path configurable

994acad


add debug-measured-boot-state tool

c2cf867


tests: add unit tests for uki policy and measured boot library

225f3c5


measured-boot: include userspace TPM events in refstate reporting

80f3e44

docs: update debug-measured-boot-state usage and examples

4690c85


update to nixos-unstable

3219abe


phaer added 5 commits April 13, 2026 14:28

tests: inline policyDir let binding in unit-tests.nix

0d3b699

phaer force-pushed the mb-policy branch from 50d54a3 to 0d3b699 Compare April 13, 2026 12:30

chore: run treefmt

b27f5c5

phaer marked this pull request as ready for review April 13, 2026 12:57

phaer added 4 commits April 15, 2026 12:39

registrar: stop exposing non-TLS port 8890

fabf0c3

docs: add keylime server setup guide

3214294

phaer merged commit a3c0079 into main Apr 15, 2026
2 checks passed

phaer deleted the mb-policy branch April 15, 2026 15:09

phaer merged 39 commits into main from mb-policy
