Add logLevelOverrides option to suppress noisy per-request INFO logging from keylime.web and keylime.authorization.manager.
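In Python terms, an override of this kind amounts to raising the level on the named loggers. A minimal sketch, assuming the option maps logger names to level names; the `apply_overrides` helper and the mapping shape are hypothetical and not the module's actual implementation:

```python
import logging


def apply_overrides(overrides):
    """Set the level of each named logger.

    `overrides` maps a logger name to a level name such as "WARNING".
    """
    for name, level in overrides.items():
        logging.getLogger(name).setLevel(getattr(logging, level.upper()))


# Silence per-request INFO records from the two noisy loggers.
apply_overrides({
    "keylime.web": "WARNING",
    "keylime.authorization.manager": "WARNING",
})
```

Other loggers keep their configured level, so only the per-request noise is dropped.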
Add libefivar for event log enrichment, enable the measured boot policy, send refstate from agent to auto-enrollment server, and update tests for measured boot enrollment.
Custom MBA policy for UKI boot chains (systemd-boot + UKI). Validates SCRTM/firmware (PCR 0), Secure Boot keys (PCR 7), UKI application digest (PCR 4), and UKI PE section measurements (PCR 11). Accepts expected variability in PCR 1. Includes create-uki-refstate tool to generate the reference state from a binary UEFI event log.
Add measuredBootPolicyPath option to specify the directory containing the MBA policy Python module.
Extract event log parsing into a shared Python library. Rename packages for consistency:
- keylime-uki-policy -> keylime-measured-boot-policy
- pcr-policy -> measured-boot-state
Add libefivar to measure-boot-state wrapper for device path decoding.
Replays the UEFI event log, compares PCRs against the TPM, and diffs refstates to diagnose attestation mismatches.
18 policy tests + 21 library tests, run as nix flake checks.
Apply 4 upstream patches (3 keylime, 1 rust-keylime):
- Check tpm2_eventlog exit code instead of stderr
- Use policy's get_relevant_pcrs() for PCR replay
- Bypass ORM cache for uefi_ref_state
- Cache UEFI event log bytes at agent startup
Set measured_boot_evaluate=always and add negative attestation test (tampered UKI digest is rejected).
Drop cached measured boot reports for agents that are no longer registered. Without this, removing an agent and rebooting with a new image can race: the daemon re-enrolls with the old refstate before the fresh report arrives, leaving the agent stuck with a wrong UKI digest.
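The pruning step can be pictured as a filter over the report cache. A minimal sketch, assuming the cache is a dict keyed by agent UUID; `prune_reports` and both argument names are hypothetical, not the daemon's real data structures:

```python
def prune_reports(cached_reports, registered_uuids):
    """Return only the cached reports whose agent is still registered."""
    registered = set(registered_uuids)
    return {
        uuid: report
        for uuid, report in cached_reports.items()
        if uuid in registered
    }


cache = {"agent-a": {"refstate": "old"}, "agent-b": {"refstate": "current"}}
# agent-a was removed from the registrar, so its stale report is dropped.
cache = prune_reports(cache, ["agent-b"])
```

With the stale entry gone, a re-registering agent has to wait for a fresh report instead of being enrolled with the old refstate.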
Commands:
- status: list agents with registrar/verifier enrollment state
- inspect: show detailed agent info including refstate summary
- remove: delete agent from verifier and registrar (or 'all')
Reads server address from attestation-server.json, client certs from keys/keylime/. Available as 'nix run .#attestation-ctl' and in the dev shell.
systemd services (systemd-tpm2-setup, systemd-pcrphase) extend PCRs from userspace via the TSS2 library. These extensions are not in the UEFI event log but are recorded in /run/log/systemd/tpm2-measure.log. Add parse_userspace_log() to the measured boot library and include those events in replay_pcrs() so that PCR 9 and 11 replays match the actual TPM state.
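For reference, a PCR replay is a fold of the extend operation, new = H(old || digest), over the events in order. The sketch below is a simplified stand-in for the library's replay_pcrs(): the signature, the `(pcr_index, digest)` tuple shape, and the all-zero initial value are assumptions made for illustration.

```python
import hashlib


def replay_pcrs(events, algorithm="sha256"):
    """Replay (pcr_index, digest_bytes) events into final PCR values.

    Each extend computes new = H(old || digest); every PCR is assumed
    to start as all zeros.
    """
    size = hashlib.new(algorithm).digest_size
    pcrs = {}
    for index, digest in events:
        old = pcrs.get(index, bytes(size))
        pcrs[index] = hashlib.new(algorithm, old + digest).digest()
    return pcrs


# Firmware events come first, then the userspace extensions, in log order.
firmware_events = [(9, hashlib.sha256(b"kernel").digest())]
userspace_events = [(11, hashlib.sha256(b"sysinit").digest())]
replayed = replay_pcrs(firmware_events + userspace_events)
```

Order matters: appending the userspace events after the firmware events mirrors the order in which the TPM actually received the extensions.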
tpm2_eventlog 5.7 warns about EV_IPL events in PCR 11 because its verify_digests() has no case for PCR 11 (only 8, 9, 12, 14). Upstream master (b25c9220) adds the missing case. No release since 5.7, so we build from git with the bootstrap step inlined.
Define all custom packages once in packages/default.nix and thread them through _module.args.customPackages instead of repeating callPackage in every module and test. Ensures a single tpm2-tools in the system closure without using a nixpkgs overlay.
Show PASS/FAIL/TIMEOUT/PENDING based on consecutive_attestation_failures, last_successful_attestation, and attestation_count directly. The verifier's operational_state can be stale after agent reboots due to ORM commit issues. Also add LAST OK and attestation count columns to the status table.
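One way to derive the four states from the three fields is sketched below. The precedence rules, the `timeout_s` parameter, and the assumption that last_successful_attestation is an epoch timestamp in seconds are all illustrative guesses, not the tool's actual logic:

```python
import time


def attestation_status(agent, timeout_s, now=None):
    """Classify an agent record as PENDING, FAIL, TIMEOUT, or PASS."""
    now = time.time() if now is None else now
    if agent.get("attestation_count", 0) == 0:
        return "PENDING"  # nothing has been attested yet
    if agent.get("consecutive_attestation_failures", 0) > 0:
        return "FAIL"  # the most recent attestation(s) failed
    last_ok = agent.get("last_successful_attestation")
    if last_ok is None or now - last_ok > timeout_s:
        return "TIMEOUT"  # no recent success within the window
    return "PASS"
```

Because the result depends only on counters and timestamps that every attestation updates, it does not go stale the way operational_state can.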
Add 'save' subcommand that snapshots the current refstate to /var/lib/keylime/saved-refstate.json (the persistent encrypted partition). 'diagnose' now auto-detects a saved refstate when run without --refstate. Remove the 'diff' subcommand — its functionality is now available via 'diagnose old.json new.json' with two positional arguments. diagnose already handled the offline case (no TPM sysfs) gracefully.
Show the save-then-diagnose workflow, document auto-detection of saved refstates, and replace the removed 'diff' subcommand with 'diagnose old.json new.json'.
Use db_manager.session() directly instead of session_context() in the uefi_ref_state property. session_context() calls commit() on exit, which flushes pending EvidenceItemMapping objects on the shared scoped session, causing SQLAlchemy identity map conflicts.
The verifier maps both `verifiermain` and `mbpolicies` via two
independent SQLAlchemy ORM classes:
- keylime.db.verifier_db (declarative_base, own MetaData)
used by push_agent_monitor and cloud_verifier_tornado
- keylime.models.verifier (model framework, own registry)
used by tpm_engine and the push-mode attestation flow
Each has its own identity map and change tracking; SQLAlchemy
has no way to synchronize them. Writes through one mapping are
invisible to the other's cached instances, breaking both reads
and writes that cross the boundary.
Replace the previous narrowly-scoped 0003 with two patches that
issue raw SELECT/UPDATE bypassing both ORM layers:
- 0003: read mb_policy via raw SELECT (fixes stale policy after
tenant DELETE+CREATE re-enrollment).
- 0004: write accept_attestations via raw UPDATE (fixes
push-mode timeout recovery — previously the column was
silently dropped from the ORM UPDATE because the model
framework's loaded state still showed True after
push_agent_monitor flipped it to False via the legacy mapping).
The proper upstream fix is to consolidate to a single mapping per
table; until then this is the only mechanism that crosses the
mapping boundary.
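The idea behind the two patches, reading and writing the column with plain SQL so that no mapping's cached state is involved, can be sketched as follows. This uses the standard library's sqlite3 as a stand-in; the real patches go through keylime's SQLAlchemy sessions, and the reduced two-column schema and helper names here are assumptions for illustration only.

```python
import sqlite3


def set_accept_attestations(conn, agent_id, value):
    """Write the column with a raw UPDATE; no ORM change tracking is consulted."""
    conn.execute(
        "UPDATE verifiermain SET accept_attestations = ? WHERE agent_id = ?",
        (int(value), agent_id),
    )
    conn.commit()


def read_accept_attestations(conn, agent_id):
    """Read the column with a raw SELECT; returns None for an unknown agent."""
    row = conn.execute(
        "SELECT accept_attestations FROM verifiermain WHERE agent_id = ?",
        (agent_id,),
    ).fetchone()
    return None if row is None else bool(row[0])


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE verifiermain (agent_id TEXT PRIMARY KEY, accept_attestations INTEGER)"
)
# Simulate the state left behind by the timeout monitor.
conn.execute("INSERT INTO verifiermain VALUES ('agent-1', 0)")
set_accept_attestations(conn, "agent-1", True)
```

Since the statement names the column explicitly, it is always written, regardless of what either mapping believes the loaded value to be.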
Stop the agent long enough for push_agent_monitor to fire its timeout (quote_interval x 5 = 10s with the test config), verify the verifier marks the agent FAIL via attestation_status, then restart the agent and verify it recovers to PASS within the recovery window. Asserts attestation_count > baseline so a stale PASS reading cannot satisfy the check. This guards against the dual-mapping write bug fixed by the keylime 0004 patch (push-mode self-healing was silently broken upstream — count grew but accept_attestations stayed False forever). Without the fix the test fails at the recovery assertion; with the fix it passes in ~17 seconds.
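The two numeric conditions in this test can be written down directly. A minimal sketch; the helper names are hypothetical, and the quote_interval of 2 seconds is inferred from the stated 10 s timeout:

```python
def push_timeout_seconds(quote_interval, multiplier=5):
    """Time without a quote after which push_agent_monitor fires."""
    return quote_interval * multiplier


def has_recovered(status, attestation_count, baseline_count):
    """A PASS only counts if new attestations arrived since the baseline."""
    return status == "PASS" and attestation_count > baseline_count


timeout = push_timeout_seconds(quote_interval=2)
```

The baseline comparison is what rules out a stale PASS: a status read from before the outage has the same attestation_count as the baseline and is rejected.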
The existing reboot loop only checked that the verifier still had an enrollment record for agent_uuid, which is a weak signal — the record persists regardless of what the agent does after reboot. Query the registrar after each reboot and assert that results.uuids equals exactly [agent_uuid]. If the swtpm's EPS ever regenerates (state directory lost, TPM2_Clear, etc.) the agent would re-register under a new EK-derived UUID and the registrar would contain two entries, failing the assertion.
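The stronger check might look like the sketch below. It assumes the registrar's list response is a JSON body with the UUIDs under results.uuids, as described above; the helper names and the sample bodies are hypothetical.

```python
import json


def registered_uuids(response_body):
    """Extract the list of agent UUIDs from a registrar list response."""
    return json.loads(response_body)["results"]["uuids"]


def assert_single_identity(response_body, agent_uuid):
    """Fail unless the registrar knows exactly one agent, the expected one."""
    uuids = registered_uuids(response_body)
    assert uuids == [agent_uuid], f"unexpected registrar entries: {uuids}"
```

Comparing against the exact list, rather than testing membership, is what catches a second registration under a new EK-derived UUID.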
The upstream PR branches for the elparsing and tpm-relevant-pcrs fixes
have been rebased onto master commits newer than v7.14.1 and have grown
test coverage since we first packaged them. Pin keylime to master
commit 4c2a0c6ca84c ("Switch from CA organization of MITLL to Keylime",
30 commits past v7.14.1) — the common base of all current branches —
and regenerate all four local patches against it.
The regenerated patches now include the test additions upstream grew
in the meantime (test/test_mba_parsing.py, test/test_tpm_check_pcrs.py,
test/test_tpm_engine.py) and the expanded scope of the tpm-relevant-pcrs
fix that now threads mb_policy_name through cloud_verifier_common,
cloud_verifier_tornado, da/attest and verification/tpm_engine.
Both VM tests (keylime and keylime-auto-enroll) still pass on the new
base, including the push-mode timeout recovery subtest.
fix/uefi-log-privileged-fd was rebased upstream and the resulting commit now touches 6 files (adding plumbing through attestation.rs, main.rs, state_machine.rs) instead of the 3 in our original snapshot. Regenerate against commit 0d63e3b from the current branch tip. The fetched source tag (v0.2.9) is unchanged; only the patch itself expands in scope.
Link each local patch to its upstream PR (or tracking issue) so the provenance of each is obvious from the package files alone:
- 0001 elparsing → keylime/keylime#1878
- 0002 tpm-relevant-pcrs → keylime/keylime#1879
- 0003/0004 dual-mapping → keylime/keylime#1880 (issue, no PR yet)
- keylime-agent 0001 → keylime/rust-keylime#1223
systemd 259's new NvPCR support adds a runtime PCR 9 extension in
systemd-tpm2-setup.service (an anchoring measurement for the bundled
hardware.nvpcr and cryptsetup.nvpcr definitions) that is not captured
in the UEFI event log. This makes the event-log-vs-live-PCR replay
check fail on every boot and breaks measured boot attestation.
Exclude PCR 9 from relevant_pcr_indices, the set used by keylime's
mb_pcrs_to_check() to decide which PCRs to replay-check. The event
level policy checks still run against whatever PCR 9 events are in
the UEFI event log, and the security critical content of PCR 9 (the
UKI image) is already pinned via uki_digest in PCR 4.
This mirrors how we already handle PCR 11, which is runtime extended
by systemd-pcrphase with boot phase strings. Both PCRs share the
same problem (userspace extensions invisible to the UEFI event log)
and now share the same fix.
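Expressed as a set operation, the exclusion looks like this. The two input sets are named here for illustration only; the resulting value matches the {0,1,2,3,4,5,7} that get_relevant_pcrs() is documented to return elsewhere in this series.

```python
# PCRs that carry boot-chain measurements the policy looks at (illustrative name).
BOOT_CHAIN_PCRS = frozenset({0, 1, 2, 3, 4, 5, 7, 9, 11})

# PCRs that userspace extends after boot, so the UEFI event log alone
# cannot reproduce their live values (illustrative name).
RUNTIME_EXTENDED_PCRS = frozenset({9, 11})

# Only these PCRs are replay-checked against the quote.
relevant_pcr_indices = BOOT_CHAIN_PCRS - RUNTIME_EXTENDED_PCRS
```

Events logged for PCR 9 and 11 are still evaluated by the event-level policy; they are only left out of the replay comparison.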
Alternative fixes considered and rejected:
A. Masking the systemd shipped .nvpcr files via /etc/nvpcr/ dead
symlinks. Works, but brittle: any future .nvpcr file added by
upstream systemd silently re-breaks attestation.
B. Capturing runtime extensions in the refstate via userspace_digests
and patching the verifier to account for them during replay.
Most principled, but the NvPCR anchor measurement is per-host
(derived from a local secret), which would defeat image-wide
attestation and force per-host refstates.
C. Pinning the PCR 9 value via tpm_policy. Same per-host problem.
D. Disabling systemd-tpm2-setup.service entirely. Would also break
TPM2 bound LUKS unlock, which we rely on for /var/lib/keylime
and /var/lib/credentials.
systemd 259 changed how tpm2_get_best_pcr_bank() selects the PCR hash
algorithm: it now reads the LoaderTpm2ActivePcrBanks EFI variable
(written by the UEFI boot manager via GetActivePcrBanks()). Without TPM2
support compiled into OVMF, GetActivePcrBanks() returns 0, causing
systemd to log 'Firmware reports neither SHA1 nor SHA256 PCR banks,
cannot operate.' and fail every TPM2 unseal with EOPNOTSUPP.
Fix the installer VM test by switching from the default OVMF (no TPM
support) to OVMF.override { tpmSupport = true; }. This enables the
-D TPM2_ENABLE OVMF build flag so GetActivePcrBanks() correctly reports
the active PCR banks to userspace. The integration VM test already used
OVMFFull which already has tpmSupport = true.
With OVMF TPM support enabled, systemd-tpm2-setup-early (gated on
ConditionSecurity=measured-uki, satisfied because the stub can now
extend PCRs) creates the ECC SRK at 0x81000001, and tpm2_get_best_pcr_bank()
succeeds, so no manual SRK provisioning is needed.
systemd 258 bound tpm2-encrypted repart partitions to PCR 7 by default. systemd 259 changed the default to an empty policy (no PCR restrictions), silently dropping the secure-boot binding and allowing any holder of the TPM to unseal the partitions. Make the policy explicit with TPM2PCRs=7, consistent with how individual credentials inside the partition are encrypted (--tpm2-pcrs=7 in credential-storage.nix).
mainly to test a newer kernel on flaky test hardware
…ate flake check
Replace the separate libraryTests flake check with pytestCheckHook in the package's checkPhase. This is more idiomatic for nixpkgs Python packages and ensures tests run on every build, not just when explicitly checked.
- Enable doCheck (was false) and add pytestCheckHook to nativeCheckInputs
- Remove libraryTests from unit-tests.nix and tests/default.nix
- Update README to reflect the new testing approach
… form
Replace the four pairs of hardcoded UEFI GUID strings with a helper that computes the mixed-endian form by byte-reversing the first three fields of the standard GUID. Each GUID is now defined once in the standard UEFI form; the mixed-endian variant (as seen in some tpm2_eventlog output) is derived automatically. This eliminates the risk of copy-paste errors between the two forms and makes it easier to add new GUIDs in the future. No Python efivar binding exists in nixpkgs, so a lightweight helper is preferable to adding a C library dependency.
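A helper of this kind fits in a few lines using the standard library's uuid module, whose bytes_le attribute stores the first three fields little-endian. This is a sketch of the technique, not necessarily the helper as committed; the function name is hypothetical, and EFI_GLOBAL_VARIABLE is used only as a well-known example GUID.

```python
import uuid


def mixed_endian(guid):
    """Byte-reverse the first three GUID fields; leave the last two untouched."""
    return str(uuid.UUID(bytes=uuid.UUID(guid).bytes_le))


# Standard textual form of the EFI global variable GUID.
EFI_GLOBAL_VARIABLE = "8be4df61-93ca-11d2-aa0d-00e098032b8c"
EFI_GLOBAL_VARIABLE_MIXED = mixed_endian(EFI_GLOBAL_VARIABLE)
```

Applying the helper twice returns the original string, so either form can serve as the single definition.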
Replace the runCommand that copies a single .py file with a buildPythonPackage using pyproject.toml. This enables:
- Proper dependency management via setuptools
- Tests via pytestCheckHook (replaces the manual PYTHONPATH wiring in the policyTests flake check)
- Standard Python packaging conventions (pyproject.toml, etc.)
The policyPath output still provides a directory suitable for the verifier's PYTHONPATH, now pointing at the package's site-packages. Pass the custom keylime package through keylime-shared.nix so the policy tests can import from keylime.mba.elchecking.
…tations
- user-guide.md: enrollment diagram said 'PCRs 0,1,2,3,7,11' but the
daemon passes --mb_refstate with no explicit PCR list; the uki policy
determines replay via get_relevant_pcrs() = {0,1,2,3,4,5,7}.
PCR 11 is excluded from replay; PCRs 4 and 5 were missing. Replace
with '--mb_refstate, uki policy' which is accurate and stable.
- keylime-auto-enroll.nix: same fix in the module header comment.
- docs.md: remove the 'measured boot & attestation' bullet from
Limitations and Further Work — it described the current working
implementation, not an outstanding gap. The feature is fully covered
in the 'Remote Attestation (Keylime)' section.
Add dispatcher entries for event types seen on real hardware that were
missing from the policy, causing attestation to fail with 'unexpected
(PCRIndex, EventType) combination':
- EV_POST_CODE in PCR 0 and PCR 2: older TCG type used by some firmware
for POST code and option ROM measurements. Both PCRs are in
relevant_pcr_indices so the quote comparison covers integrity.
- EV_EFI_PLATFORM_FIRMWARE_BLOB{,2} in PCR 2: some firmware measures
UEFI drivers here rather than PCR 0. Routed to the same
platform_firmware_blobs collector as PCR 0 events, consistent with
measure-boot-state which collects from all PCRs.
- EV_EFI_VARIABLE_BOOT2 in PCR 1: newer UEFI spec variant of
EV_EFI_VARIABLE_BOOT; treated the same as existing PCR 1 handlers.
- EV_EFI_ACTION in PCR 6: PCR 6 is not in relevant_pcr_indices and is
absent from tpm_policy, so the handler prevents the dispatcher from
rejecting events without providing an end-to-end integrity guarantee.
- EV_SEPARATOR extended from range(8) to range(16): some firmware emits
separators for PCRs beyond 7 to mark the end of each measurement phase.
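The shape of these additions can be illustrated with a plain dictionary keyed by (PCRIndex, EventType). This is a stand-in for the policy's real dispatcher in keylime.mba.elchecking, whose API is not reproduced here; the handler names and the event dict keys are assumptions made for the sketch.

```python
platform_firmware_blobs = []


def collect_firmware_blob(event):
    """Record the blob digest for later comparison against the refstate."""
    platform_firmware_blobs.append(event["Digest"])


def accept(event):
    """Accept the event without further checks."""


DISPATCH = {
    (0, "EV_POST_CODE"): accept,
    (2, "EV_POST_CODE"): accept,
    (0, "EV_EFI_PLATFORM_FIRMWARE_BLOB"): collect_firmware_blob,
    (2, "EV_EFI_PLATFORM_FIRMWARE_BLOB"): collect_firmware_blob,
    (2, "EV_EFI_PLATFORM_FIRMWARE_BLOB2"): collect_firmware_blob,
    (1, "EV_EFI_VARIABLE_BOOT2"): accept,
    (6, "EV_EFI_ACTION"): accept,
}
# Separators are accepted for PCRs 0 through 15 instead of 0 through 7.
DISPATCH.update({(pcr, "EV_SEPARATOR"): accept for pcr in range(16)})


def dispatch(event):
    """Route an event to its handler, rejecting unknown combinations."""
    key = (event["PCRIndex"], event["EventType"])
    handler = DISPATCH.get(key)
    if handler is None:
        raise ValueError(f"unexpected (PCRIndex, EventType) combination: {key}")
    handler(event)
```

An unlisted pair still fails closed, so each new entry is an explicit decision about one event type in one PCR.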
systemd puts the console in UTF-8 mode at boot; the kernel then maps Unicode code points through the font's Unicode table. The default VGA ROM font has no entries for U+2500+, so those glyphs render as '?'. ter-v16n (Terminus) includes the full box-drawing range and is a standard VGA-compatible bitmap font suitable for the Linux console.
Switch from raw PCR digest comparison to UEFI event log validation. A custom UKI keylime policy parses the TPM's event log and checks individual events instead of comparing PCR hashes. Firmware config changes (boot order, BIOS settings) therefore no longer break attestation, because they can be ignored, and when something does change we have a better chance of understanding why.
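Keylime patches:
- Check the tpm2_eventlog exit code instead of stderr (warnings were breaking attestation)
- Use the policy's get_relevant_pcrs() for PCR replay (avoids spurious PCR 9/11 failures)
- Bypass the ORM for uefi_ref_state and accept_attestations (two patches)
- Cache UEFI event log bytes at startup (push model)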
Other:
Trust model
Agent self-reports its event log on first boot (TOFU). After that, the verifier replays the log against the refstate and validates the TPM quote every cycle. PCR 9 and 11 are NOT in the event log replay (because systemd-pcrphase extends them at runtime), but 11 is still in the TPM quote, while 9 should be redundant in our case (UKI boot); value consistency is enforced across attestations.