Draft PR - "Margo Identity and Authorization Framework" (MIAF) SUP#38

Draft
matlec wants to merge 39 commits into main from feat/miaf-sup

Conversation

@matlec matlec commented Feb 5, 2026

First draft of the "Margo Identity and Authorization Framework" (MIAF) SUP for community feedback.

Background: Thoughts on Identity and Interoperability in Margo - From PR1 to GA

phil-abb commented Feb 5, 2026

@margo/technical-wg - A new SUP has been started. Please take a look and talk with @matlec if you're interested in helping develop this SUP.

@phil-abb

@matlec I'm not too familiar with SPIFFE yet. How much of this will work out of the box with an existing SPIFFE implementation, and how much of this is a custom Margo implementation?

matlec added 3 commits March 19, 2026 15:46
This change also relates to margo/specification#146

Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
- **Onboarding:** PR1's WFM-centric onboarding flow (`POST /api/v1/onboarding`) is replaced by MIAF's bootstrap and enrollment mechanism (`POST /api/v1/identities`), which binds a device's Physical Device Identity to a Logical Device Identity within the Trust Domain.
- **Trust anchor distribution:** PR1's per-WFM root CA endpoint (`GET /api/v1/onboarding/certificate`) is replaced by the SPIFFE Trust Bundle, retrieved from a standardized Trust Bundle endpoint and distributed in the SPIFFE Bundle Map format.
- **Cryptographic requirements:** PR1's permitted signature algorithms are superseded by MIAF's cryptographic requirements.
- **Device security requirements:** PR1's informational references to hardware key protection (TPM, secure boot, attestation) become normative requirements under MIAF's [Device Key Protection](#device-key-protection) section.

By making this normative, i.e. mandatory, we're ruling out devices that cannot provide a hardware trust anchor, such as constrained devices. Likewise, there is no "standard" TPM (interface) on ARM systems as there is for x86. Furthermore, this may cause problems in certain markets.

So, what about, e.g., introducing a flag trusted || untrusted for devices managed in Margo? With this, you immediately see the trust status of devices but are still able to manage them. Making this mandatory may also prevent Margo adoption / transformation of brownfield scenarios in which field devices simply cannot be upgraded or even (re-)configured to meet this requirement.

Contributor Author

@matlec matlec Mar 25, 2026

Yeah, that's a valid concern. The current "Device Key Protection" section already allows storage of private keys without hardware protection:

- Keys **MUST** be generated and stored in secure hardware (TPM, Secure Element, or TEE) where available and **MUST NOT** be exportable.
- Where only software storage is possible, implementations **MUST** provide at-rest encryption, integrity protection, and OS/process isolation (e.g., dedicated key service with strict ACLs).

...but: even the (admittedly lax) requirements for software storage might be too much for constrained devices.

I think the right direction is something similar to what you're proposing: what do you think about introducing key protection classes that the device reports during enrollment? Something like:

| Class | Description | Examples |
|---|---|---|
| `hardware` | Key in dedicated secure hardware, non-exportable | TPM, Secure Element, ARM TrustZone, ... |
| `software-isolated` | Key in software with OS-level process isolation and at-rest encryption | Linux with keyring, ... |
| `software-basic` | Key in software without isolation | Bare-metal MCU, ... |

This would give the MIS and other components an indication of the security level supported by the device and allow components to apply different policies for different protection levels.

What do you think? Are those the "right" classes? Would a device simply report its class, or do we need some kind of attestation evidence? And should those classes drive policies (e.g., SVID lifetime, ...) or should this be deployment-specific? Open to any feedback...
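
To make the policy question concrete, here is a minimal Python sketch of how an MIS might map a reported key protection class to an issuance policy. Only the three class names come from the proposal above. `IssuancePolicy`, `POLICY_BY_CLASS`, `policy_for`, the lifetimes, and the attestation flag are hypothetical placeholders, not values from the SUP.

```python
from dataclasses import dataclass
from datetime import timedelta
from enum import Enum


class KeyProtectionClass(Enum):
    """The three classes proposed in this comment."""
    HARDWARE = "hardware"
    SOFTWARE_ISOLATED = "software-isolated"
    SOFTWARE_BASIC = "software-basic"


@dataclass(frozen=True)
class IssuancePolicy:
    # Both fields are illustrative; the SUP does not define them.
    max_svid_lifetime: timedelta
    require_attestation_evidence: bool


# Hypothetical, deployment-specific policy table; the numbers are placeholders.
POLICY_BY_CLASS = {
    KeyProtectionClass.HARDWARE: IssuancePolicy(timedelta(hours=24), True),
    KeyProtectionClass.SOFTWARE_ISOLATED: IssuancePolicy(timedelta(hours=4), False),
    KeyProtectionClass.SOFTWARE_BASIC: IssuancePolicy(timedelta(hours=1), False),
}


def policy_for(reported_class: str) -> IssuancePolicy:
    """Map the class a device reports during enrollment to an issuance policy.

    Unknown values fall back to the most restrictive class instead of failing,
    so such a device can still be managed.
    """
    try:
        cls = KeyProtectionClass(reported_class)
    except ValueError:
        cls = KeyProtectionClass.SOFTWARE_BASIC
    return POLICY_BY_CLASS[cls]
```

Keeping the table outside the function reflects the "deployment-specific" option: a spec could fix the class names while leaving the concrete limits to each Trust Domain.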

Comment on lines +277 to +278
Authentication and authorization decisions are performed directly using these identities (for example, mTLS with an **X.509 SVID**).
Where mTLS is not feasible, a short-lived **JWT SVID** may be issued as a bearer credential. Optional mappings to OAuth 2.0 or enterprise token infrastructures are provided in an [informative appendix](#appendix-c-oauth2-and-api-gateway-interoperability-informative) and are **not required for compliance**.

Hm, what is now normative and what is optional? In other words, which technologies / standards do we build on and mark as mandatory?

Contributor Author

Does my comment to @phil-abb's comment answer your question? In response to another comment from @phil-abb, I defined the "Factory Certificate Method (mTLS)" as the interoperability baseline in the SUP.

Address @phil-abb's review comments about the relationship between MIAF and
SPIFFE, the use of Margo in terminology, and the boundary between
adopted standards and Margo-specific content.

Framing and structure:
- Add implementation-neutrality statement (MIS is a deployment role,
  not a Margo-provided service)
- Add "Relationship to SPIFFE" section with classification table
  distinguishing adopted, constrained, and Margo-specific content
- Explicitly state that MIS APIs are Margo-specific lifecycle
  interfaces, not the SPIFFE Workload API model

Terminology:
- Rename "Margo Trust Domain" to "Trust Domain" (SPIFFE term)
- Remove "Margo Identity (MI)" as a standalone defined term
- Remove "Margo Trust Bundle" and "SPIFFE-conformant" where imprecise
- Strengthen MIS definition as a deployment-local conformance role
- Split terminology into "adopted from SPIFFE" and "introduced by
  this SUP" categories

SPIFFE alignment:
- Rewrite SVID profile sections as adopt-by-reference + Margo deltas
- Add Source column to profile tables showing SPIFFE vs MIAF origin
- Align X.509 EKU with SPIFFE (defer to SPIFFE rules, not override)
- Align KeyUsage with SPIFFE (allow keyEncipherment/keyAgreement)
- Drop JWT iss and Lifetime as MIAF profile constraints
- Relabel SPIFFE-baseline validation rules appropriately
- Tighten federation wording to not overstate SPIFFE Federation adoption
- Fix interoperability statement to reference libraries/tooling, not
  a reusable SPIFFE control plane

Editorial:
- Drop redundant closing paragraphs from profile sections
- Simplify JWT SVID validation to reference exchange endpoint for
  lifetime guidance
- Add informative labels to framework overview sequence and
  alternatives section

Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
matlec commented Mar 24, 2026

> @matlec I'm not too familiar with SPIFFE yet. How much of this will work out of the box with an existing SPIFFE implementation, and how much of this is a custom Margo implementation?

The SUP reuses SPIFFE for identity naming (SPIFFE IDs), credential formats (X.509-SVID, JWT-SVID), and trust distribution (Trust Bundles). Existing SPIFFE libraries and tooling can be used for SVID validation, Trust Bundle handling, and SPIFFE ID processing. Everything else - device bootstrap methods, the discovery document, enrollment / renewal / revocation APIs, the LDI / PDI / ESI binding model, and device replacement / lifecycle rules - is Margo-specific and would need to be implemented on top. The revised draft now includes a "Relationship to SPIFFE" section with a classification table that makes this boundary explicit.

- The **pluggable bootstrap mechanism** supports multiple Physical Device Identity proofs (FIDO Device Onboard, IEEE 802.1AR DevID, factory certificates), ensuring wide hardware and supply-chain coverage.
- All supported bootstrap methods converge to the same Logical Device Identity, allowing operators to:

- start with existing factory credentials, and

Do factory certificates / credentials mean pre-shared authentication "material" such as pre-shared keys or pre-shared secrets?

Contributor Author

I think the confusion here is partly caused by my wording. In the current SUP, "factory credentials" was intended to refer to X.509 certificates used by the "Factory Certificate" bootstrap methods, not to PSKs or symmetric secrets. But I agree that "credentials" is broader - I will fix that in the SUP.

Your comment also made me reflect on the term "factory certificate" itself. Right now that wording suggests manufacturer-provisioned bootstrap material. But the mechanism currently described is closer to "device presents a pre-provisioned certificate that the MIS trusts". This raises the broader question whether this method should be limited to manufacturer-issued certificates, or should we also cover operator-provisioned certificates? In the latter case, we might want to rename the term to e.g. "pre-provisioned bootstrap certificate". What do you think @stormc?

matlec added 7 commits March 25, 2026 11:05
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Model 2 reduces interoperability, model 3 is trivial

Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
matlec commented Mar 26, 2026

@nilanjan-samajdar nilanjan-samajdar left a comment

High-level comments for now on the overall strategy for adopting this SUP and the demarcation between specification and implementation. I will review from an API perspective on the latest version of the SUP.


#### Workload Fleet Management (WFM) Client (relationship to MIAF) <!-- omit from toc -->

A **Margo client component** as defined in the [Technical Lexicon](https://specification.margo.org/personas-and-definitions/technical-lexicon/). While a WFM Client runs on an Edge Compute Device, its identity represents the deployed **client instance**, not the device itself. The **Logical Device Identity** defined in this SUP provides the stable, hardware-bound identity of the device. A planned **WFM Client Identity Profile** will define how WFM Clients obtain their own distinct Margo Identities, building on the device identity as their authentication foundation. This separation is necessary because device identity and WFM Client identity have different lifecycles, authorization scopes, and cardinalities across topologies (standalone devices, Kubernetes clusters, device gateways).

Contributor

In this statement, it is not clear whether "WFM Client Identity" is within the scope of the MIS in this SUP.

Contributor Author

Hey @nilanjan-samajdar! Please review the Future Work section that I added to the SUP since you left your comment. Does that resolve your question?

*Example (device profile):* from a verified PDI, the ESI may be the certificate fingerprint, or a hash derived from a device certificate contained in an FDO Ownership Voucher.
ESIs **MUST** be stable and unique within the Trust Domain and **MUST NOT** be reversible to the original credential material.

#### SPIFFE Verifiable Identity Document (SVID) <!-- omit from toc -->

Contributor

Maybe this definition is needed for the 'Margo Identity' to function with the SPIFFE framework ?

Contributor Author

I'm not exactly sure I understood your question. The ESI serves multiple purposes in the framework:

  1. It enables idempotent enrollment of devices / identities (it makes enrollment safe to retry across network failures without creating duplicate identities).
  2. It gives the MIS a single consistent key to look up whether it has seen this device before regardless of how the device is authenticated (per bootstrap method).
  3. It enables swapping the hardware while preserving the logical identity. A replacement device brings a new PDI (which will produce a different ESI), which can then be bound to an existing LDI.

I think of an ESI as a stable, bootstrap method-independent, privacy-preserving key that lets the MIS answer the question: "is this the same physical device?" without being coupled to any particular bootstrap method / protocol.
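
The three purposes above can be illustrated with a small Python sketch. It assumes, for illustration only, that the ESI is a SHA-256 digest over the bootstrap method URN and the credential bytes. The SUP's actual derivation is method-specific, and `derive_esi`, `InMemoryMis`, and the path layout after `/margo/device/` are hypothetical.

```python
import hashlib
from typing import Dict


def derive_esi(method: str, credential_der: bytes) -> str:
    """Derive an ESI as a SHA-256 digest over bootstrap method and credential.

    A hash is deterministic (same credential -> same ESI) and cannot be
    reversed to the credential material. The exact inputs are an assumption.
    """
    h = hashlib.sha256()
    h.update(method.encode("utf-8"))
    h.update(b"\x00")  # separator so method and credential cannot run together
    h.update(credential_der)
    return h.hexdigest()


class InMemoryMis:
    """Toy stand-in for an MIS that keeps ESI -> LDI bindings in a dict."""

    def __init__(self, trust_domain: str) -> None:
        self.trust_domain = trust_domain
        self.ldi_by_esi: Dict[str, str] = {}

    def enroll(self, method: str, credential_der: bytes, device_name: str) -> str:
        """Idempotent enrollment: a retry with the same credential returns the same LDI."""
        esi = derive_esi(method, credential_der)
        if esi not in self.ldi_by_esi:
            # Hypothetical LDI layout; only the /margo/device/ prefix is from the thread.
            self.ldi_by_esi[esi] = (
                f"spiffe://{self.trust_domain}/margo/device/{device_name}"
            )
        return self.ldi_by_esi[esi]

    def replace_device(self, old_esi: str, method: str, new_credential_der: bytes) -> str:
        """Bind a replacement device's new ESI to the existing LDI."""
        ldi = self.ldi_by_esi.pop(old_esi)
        self.ldi_by_esi[derive_esi(method, new_credential_der)] = ldi
        return ldi
```

A retried `enroll` call with the same credential hits the existing dictionary entry (purposes 1 and 2), while `replace_device` moves the LDI to the ESI of the new hardware (purpose 3).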


#### Enrollment Subject Identifier (ESI) <!-- omit from toc -->

A deterministic, globally unique identifier derived by the MIS from the presented **Bootstrap Credential** during enrollment.

Contributor

It seems that the ESI is internal to the MIS, which uses it to determine whether the device's bootstrap credential is already known. Should the ESI therefore be kept internal to the MIS?
However, Margo functions might need the ESI for some use cases, for example:

- If a device's ESI 'expires', the WFM may need to be informed and the applications 'drained' from the device.
- If the ESI is re-assigned, the WFM should do a full state-sync.

Contributor Author

The ESI should remain internal to the MIS as it is a concept of the enrollment layer, and exposing it would couple other components to bootstrap internals. That said, I agree that the use cases you describe are real operational needs. But I'd rather address them via LDI lifecycle events (revocation, re-issuance). The WFM cares that spiffe://.../margo/device/ was revoked or re-issued, not which ESI was involved.

For now, I'd treat this as a gap that can be addressed by future work (e.g., by defining a lifecycle event/notification mechanism). Or do you think this is critical to include in a first iteration?


1. **Discovery:** A Margo component locates MIS endpoints and Trust Bundle locations via the `.well-known/margo` discovery document defined in this SUP.
2. **Enrollment:** The component presents a **Bootstrap Credential** (per its Bootstrap Method) to the MIS and receives an **SVID** for its identity.
3. **Renewal:** Before expiry, the component renews its SVID via an authenticated request (for example, mTLS using the current SVID).

Contributor

WFM actions on 'Expiry' and 'Renewal' should be defined.

Contributor Author

See my comment in this thread.


Components such as DFMs, WFMs, their clients, and telemetry agents act as:

- **SVID holders**, presenting their SVIDs during mTLS authentication or when requesting a short-lived **JWT SVID**; and

Contributor

It seems that SPIFFE and its SVIDs are being adopted as the standard for Margo, but the text earlier pointed to a more generic specification for MIAF in which SPIFFE is one of the possible implementations.

Contributor Author

I have updated the SUP since you left your comment to clarify the relationship with SPIFFE and make it more explicit. Please check if this updated section is sufficient / addresses your comment.

- root and intermediate certificates used to validate X.509 SVID chains; and
- public keys used to verify JWT SVIDs (if used).

Trust Bundles are identified by their Trust Domain name and distributed using the SPIFFE **Bundle Map mechanism**, as defined in the [SPIFFE Trust Domain and Bundle specification](https://github.com/spiffe/spiffe/blob/main/standards/SPIFFE_Trust_Domain_and_Bundle.md).

Contributor

Is there one Trust Domain for WFM services accessing the MIAF and a separate one for the devices? Is there a suggested mapping to topology?

Contributor Author

No, there is a single trust domain for an entire deployment, which would include WFM-related identities (once we have established a MIAF-compatible concept for them, see Future Work).


During enrollment, the component authenticates using its **Bootstrap Credential** and requests issuance of a new identity, represented by an SVID.
For Edge Compute Devices, this operation establishes the authoritative binding between the device's **Physical Device Identity** and **Logical Device Identity** within the Trust Domain.

Contributor

Do you plan to provide a LinkML version of the data model?

Contributor Author

I don't have a plan for now, but if we agree on Silvano's SUP, I won't object to updating any LinkML specs while integrating this SUP into the spec once/if approved :)

| `bootstrapCredential.method` | string | Y | URN uniquely identifying the bootstrap method (e.g., `urn:margo:bootstrap:factory-cert-jwt:v1`). |
| `bootstrapCredential.proof` | object | N | Method-specific proof of possession (e.g., a signed JWT assertion or an mTLS client certificate chain). Present only if the bootstrap method requires explicit proof material. |
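
As a rough illustration of the two rows above, the following Python sketch serializes the `bootstrapCredential` part of an enrollment request body. It assumes the dotted field names map to a nested JSON object and ignores any other request fields the SUP may define. `build_enrollment_request` and the shape of the `proof` object are hypothetical.

```python
import json
from typing import Any, Dict, Optional


def build_enrollment_request(
    method: str, proof: Optional[Dict[str, Any]] = None
) -> str:
    """Serialize the bootstrapCredential part of an enrollment request body.

    `method` is required and must be a URN. `proof` is optional and
    method-specific, so it is only included when the caller supplies it.
    """
    if not method.startswith("urn:"):
        raise ValueError("bootstrapCredential.method must be a URN")
    credential: Dict[str, Any] = {"method": method}
    if proof is not None:
        credential["proof"] = proof
    return json.dumps({"bootstrapCredential": credential}, sort_keys=True)
```

For a method that authenticates through the mTLS handshake itself, the caller would presumably omit `proof`, which matches the "present only if the bootstrap method requires explicit proof material" wording.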

**Response body schema (`201 Created` or `200 OK`, `application/json`):**

Contributor

Do you plan to provide a Swagger version for the data model and OpenAPI?

Contributor Author

See my answer in this thread

matlec added 3 commits April 15, 2026 15:45
…on FDO method

Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
IEEE 802.1AR defines a credential format (IDevID), not an onboarding
protocol. At the wire level, an IDevID is a manufacturer-issued X.509
certificate - mechanically identical to what the factory-cert-mtls and
factory-cert-jwt methods already accept. IDevID-backed certificate
chains can also appear in FDO Ownership Vouchers.

Rather than defining a separate bootstrap method that duplicates the
existing factory certificate flows, replace the TODO placeholder with
an informative section explaining how IDevIDs work with each existing
method and note the cryptographic algorithm compatibility constraints
between 802.1AR-2018 signature suites and MIAF requirements.

Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Add a bootstrap method based on one-time enrollment tokens for brownfield,
constrained, and low-cost devices without manufacturer-issued certificates.

Align the core normative model around Bootstrap Credential and
method-derived ESI rather than universal PDI binding, so the framework
accommodates both PDI-based and non-PDI bootstrap methods consistently.

Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
matlec commented Apr 16, 2026

Hey @margo/technical-wg!

I made a couple of changes to the SUP over the last days and would love to hear your feedback.

  1. I completed the integration of the FDO bootstrap method in the framework to support onboarding edge devices based on FIDO Device Onboard.
  2. I removed the IEEE 802.1AR bootstrap method from the SUP. The 802.1AR standard only defines requirements for the format (and secure storage) of a "birth certificate", not an onboarding protocol. Hence, the existing certificate-based bootstrap methods already support the use of IDevID certificates, and standardizing on e.g. IDevIDs can be a policy decision within a Trust Domain. I added an informative note to the SUP that reflects this decision.
  3. I added another bootstrap method based on a single-use enrollment token to the SUP to address concerns on how to integrate brownfield, low-cost, and constrained devices. @stormc This is related to your request for bootstrapping devices based on a pre-shared key. Please review the enrollment token method I added and provide feedback whether this addresses the concerns you expressed.

Thanks everyone for the feedback you provided so far!

matlec added 4 commits April 17, 2026 08:40
In the current SUP, the API only defines endpoints for
clients to retrieve revocation state, not manage revocation
lists. Removed those over-claims from the SUP.

Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
…load API

Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
...to align with the FDO profile

Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
This commit introduces a normative method for
replacing a device using an operator-issued ticket.
Future SUPs may define additional profiled methods
to support different approaches to device replacement.

Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
matlec added 9 commits April 17, 2026 15:03
...as per current FDO bootstrap profile

Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
…try window is a secure operation

Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
…S bootstrap

Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
@chrisgclayton

I am personally in alignment with the concept of SPIFFE. The questions I would have on this are around when doing the outside in interrogation to determine identities do we have any additional that we require for our purposes?

@chrisgclayton

If we are going to use the SPIFFE id which is generally used for workloads vs. device ids I would recommend we do not recommend it is the same trust domain (device and workload) to ensure we do not have conflicts. An alternative is to use the same format with different scheme (i.e., instead of SPIFFE://trustdomain/path we use MARGO://trustdomain/path).

matlec commented Apr 23, 2026

> I am personally in alignment with the concept of SPIFFE. The questions I would have on this are around when doing the outside in interrogation to determine identities do we have any additional that we require for our purposes?

First of all, thanks for reviewing the SUP! If by

> outside in interrogation to determine identities

you mean the attestation step used to determine what identity a caller should receive, then I think that is the right question. My understanding is that SPIFFE gives us the identity primitives, SVID formats, and trust-domain model, but it does not standardize the local interrogation or attestation mechanism itself. This is implementation-specific and in the case of SPIRE addressed by node and workload attestation. For the SUP, the device profile is intentionally operating one layer above that: it defines additional Margo-specific requirements for remote device bootstrap and lifecycle (including discovery, standardized bootstrap methods, binding validated bootstrap evidence to a persistent device identity, renewal, replacement, and revocation). So, yes, we do have additional requirements for our purposes, but they are mainly around device bootstrap and lifecycle rather than changing the SPIFFE identity model itself.

If we later define workload- or WFM Client-style identities (WFM Client identities rather sooner than later ;)), we should be explicit about whether local attestation remains implementation-specific or whether Margo wants to standardize that layer as well.

I hope this addresses your concern...

matlec commented Apr 23, 2026

> If we are going to use the SPIFFE id which is generally used for workloads vs. device ids I would recommend we do not recommend it is the same trust domain (device and workload) to ensure we do not have conflicts. An alternative is to use the same format with different scheme (i.e., instead of SPIFFE://trustdomain/path we use MARGO://trustdomain/path).

I would prefer to keep the spiffe:// scheme rather than introducing a margo:// scheme (actually, I introduced and dropped a margo: scheme in an earlier version of this SUP). The main value of this approach is that we are reusing the existing SPIFFE standard, its concepts, validation semantics, ... so that we benefit from existing tooling and implementations.
Regarding shared trust domains, I actually think devices and workloads being in the same trust domain can be the right model for a deployment. The trust domain should represent the shared security boundary, while different principal types (devices, workloads, WFM clients, ...) are separated by distinct SPIFFE ID path namespaces (e.g. /margo/device/... for devices) and authorization policy. That lets a device use its SVID to authenticate as a device to the MIS or to other workloads, and makes patterns like issuing a WFM Client identity bound to an authenticated device identity straightforward, without requiring separate trust domains or a non-SPIFFE URI scheme. So I wouldn't enforce separation of trust domains per spec, but rather treat that as a deployment decision. But happy to further discuss this and learn more about your concerns and thoughts.
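
A minimal Python sketch of separating principal types by path namespace within one trust domain might look as follows. Only the `spiffe://` scheme and the `/margo/device/` prefix come from this thread. The `/margo/wfm-client/` prefix and the `principal_type` and `authorize` helpers are hypothetical, and a real implementation would rely on a SPIFFE library for full ID validation.

```python
from typing import Optional
from urllib.parse import urlsplit

# Only /margo/device/ appears in this thread; the other prefix is hypothetical.
PRINCIPAL_PREFIXES = {
    "device": "/margo/device/",
    "wfm-client": "/margo/wfm-client/",
}


def principal_type(spiffe_id: str, trust_domain: str) -> Optional[str]:
    """Return the principal type encoded in the SPIFFE ID path, or None.

    IDs from another scheme or trust domain, or with an unknown or empty
    path namespace, are rejected.
    """
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe" or parts.netloc != trust_domain:
        return None
    if parts.query or parts.fragment:
        return None
    for kind, prefix in PRINCIPAL_PREFIXES.items():
        if parts.path.startswith(prefix) and len(parts.path) > len(prefix):
            return kind
    return None


def authorize(spiffe_id: str, trust_domain: str, required_kind: str) -> bool:
    """Allow a request only if the caller is the required principal type."""
    return principal_type(spiffe_id, trust_domain) == required_kind
```

With this shape, a policy written for devices never matches an identity from another namespace, even though both share the same trust domain and Trust Bundle.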

matlec added 4 commits April 24, 2026 10:44
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Relax discovery from a fixed /.well-known/margo endpoint
to a Trust-Domain-specific discovery URL model.

Keep /.well-known/margo as the default convention when
one HTTPS origin serves one Trust Domain, while allowing
other absolute discovery URLs for shared-origin or
tenant-specific deployments.

Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
@chrisgclayton

> > I am personally in alignment with the concept of SPIFFE. The questions I would have on this are around when doing the outside in interrogation to determine identities do we have any additional that we require for our purposes?
>
> First of all, thanks for reviewing the SUP! If by
>
> > outside in interrogation to determine identities
>
> you mean the attestation step used to determine what identity a caller should receive, then I think that is the right question. My understanding is that SPIFFE gives us the identity primitives, SVID formats, and trust-domain model, but it does not standardize the local interrogation or attestation mechanism itself. This is implementation-specific and in the case of SPIRE addressed by node and workload attestation. For the SUP, the device profile is intentionally operating one layer above that: it defines additional Margo-specific requirements for remote device bootstrap and lifecycle (including discovery, standardized bootstrap methods, binding validated bootstrap evidence to a persistent device identity, renewal, replacement, and revocation). So, yes, we do have additional requirements for our purposes, but they are mainly around device bootstrap and lifecycle rather than changing the SPIFFE identity model itself.
>
> If we later define workload- or WFM Client-style identities (WFM Client identities rather sooner than later ;)), we should be explicit about whether local attestation remains implementation-specific or whether Margo wants to standardize that layer as well.
>
> I hope this addresses your concern...

Yes, I mean the attestation step, which, to your point, is addressed in SPIRE.

I believe we will need to get to workload identities in the future, and I agree we should be explicit about it.

@chrisgclayton

> > If we are going to use the SPIFFE id which is generally used for workloads vs. device ids I would recommend we do not recommend it is the same trust domain (device and workload) to ensure we do not have conflicts. An alternative is to use the same format with different scheme (i.e., instead of SPIFFE://trustdomain/path we use MARGO://trustdomain/path).
>
> I would prefer to keep the spiffe:// scheme rather than introducing a margo:// scheme (actually, I introduced and dropped a margo: scheme in an earlier version of this SUP). The main value of this approach is that we are reusing the existing SPIFFE standard, its concepts, validation semantics, ... so that we benefit from existing tooling and implementations. Regarding shared trust domains, I actually think devices and workloads being in the same trust domain can be the right model for a deployment. The trust domain should represent the shared security boundary, while different principal types (devices, workloads, WFM clients, ...) are separated by distinct SPIFFE ID path namespaces (e.g. /margo/device/... for devices) and authorization policy. That lets a device use its SVID to authenticate as a device to the MIS or to other workloads, and makes patterns like issuing a WFM Client identity bound to an authenticated device identity straightforward, without requiring separate trust domains or a non-SPIFFE URI scheme. So I wouldn't enforce separation of trust domains per spec, but rather treat that as a deployment decision. But happy to further discuss this and learn more about your concerns and thoughts.

I am ok with it being the same trust domain as long as the pathing is clearly different. I want to be sure there is never an authorization decision that is workload related and based on the device identity, or vice versa.
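The separation discussed above (one trust domain, distinct path namespaces per principal type) can be sketched as follows. This is a minimal illustration, not part of the SUP: only the `/margo/device/` prefix appears in the thread, while the `/margo/workload/` and `/margo/wfm-client/` prefixes, the function names, and the example IDs are assumptions made for the sketch.

```python
from urllib.parse import urlsplit

# Path namespace -> principal class.
# Only "/margo/device/" comes from the discussion; the rest are hypothetical.
PRINCIPAL_PREFIXES = {
    "/margo/device/": "device",
    "/margo/workload/": "workload",      # hypothetical namespace
    "/margo/wfm-client/": "wfm-client",  # hypothetical namespace
}


def principal_class(spiffe_id: str, trust_domain: str) -> str:
    """Return the principal class encoded in the SPIFFE ID path.

    Raises ValueError for a non-SPIFFE scheme, a foreign trust domain,
    or a path outside the known namespaces.
    """
    parts = urlsplit(spiffe_id)
    if parts.scheme != "spiffe":
        raise ValueError("not a SPIFFE ID")
    if parts.netloc != trust_domain:
        raise ValueError("foreign trust domain")
    if parts.query or parts.fragment:
        raise ValueError("SPIFFE IDs carry no query or fragment")
    for prefix, cls in PRINCIPAL_PREFIXES.items():
        # Require a non-empty name after the namespace prefix.
        if parts.path.startswith(prefix) and len(parts.path) > len(prefix):
            return cls
    raise ValueError("unknown principal namespace")


def authorize(spiffe_id: str, trust_domain: str, required_class: str) -> bool:
    """Allow only when the caller's principal class matches exactly."""
    try:
        return principal_class(spiffe_id, trust_domain) == required_class
    except ValueError:
        return False
```

With a check like this, a device SVID such as `spiffe://example.org/margo/device/dev-1` is never accepted where a workload identity is required, and vice versa, even though both live in the same trust domain.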

Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Comment thread proposals/margo-identity-and-authorization-framework.md Outdated

Authorization based on verified **SPIFFE IDs** and associated attributes, evaluated locally within the Trust Domain - not on external token scopes.

##### Margo Identity Service (MIS) <!-- omit from toc -->
Contributor

I have a concern about ownership here. The MIS is an implementation that has to be running within the trust domain, but there is no clear expectation that anyone is going to "own" this service, since it's meant to be a general service. WFM and DFM vendors might each assume someone else will provide it, in which case neither ships it and the customer has to provide it. Or they both assume they need to provide it, and you end up with two implementations.

Its purpose makes sense in the context of the SUP, but I think we need to figure out the expectations on this. I don't think it makes sense for this to be an official Margo-supported implementation (I guess that's an SC decision). I think it probably makes sense to include a minimal implementation of this with the reference implementation, but that is different from Margo providing something that will be supported in production scenarios.

Contributor Author

I share your concerns. The SUP (deliberately) defines MIS as a role, but that doesn't really resolve the deployment gap you're describing :) The WFM/DFM layer seems the right place for this to live: MIS is the identity plane that devices need before they can talk to anything, and the WFM/DFMs are the management planes devices and clients register with. But nothing in the SUP mandates this - and I also don't think this decision should be buried in a SUP.

Whether Margo should provide a production-quality open-source reference implementation of MIS - similar to how CNCF provides SPIRE for SPIFFE - is a question that belongs to the SC. My personal view: SPIRE is arguably why SPIFFE succeeded in practice, and a prod-quality Margo-maintained implementation would have the same effect on adoption.

I agree with @matlec's point of view on adoption! The comparison with SPIRE is spot on. I can offer to test an early solution in an enterprise-grade environment with a WFM and DFM combination.

matlec added 4 commits May 5, 2026 18:52
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
...and update the PoC to conform to the updated spec

Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
matlec added 2 commits May 7, 2026 17:33
Rework the MIAF SUP to deliver a deliberately narrow v0 in time for PR2,
split out a sibling SUP that defines WFM Client identity and updates the
Margo Management Interface, and park the remaining capabilities as
deferred SUPs.

MIAF v0 scope (in this SUP):
- One bootstrap method — Factory Certificate (mTLS) — accepting any
  X.509 client cert chained to a Trust Domain-trusted CA, including
  IEEE 802.1AR IDevIDs and operator-issued certificates (the latter
  supports brownfield deployments without a manufacturer PKI).
- One SVID profile — X.509-SVID.
- Lifecycle phases: enrollment, active, renewal, revocation.
  Revocation and termination are collapsed into a single terminal state.
- APIs: discovery document, Trust Bundle retrieval, enrollment
  (POST /api/v1/identities), SVID renewal.
- Mandatory unknown-fields rule so future SUPs can extend payloads
  without breaking v0 implementations.

Conceptual changes carried into v0:
- Introduce "Principal" and "Identity Profile" as first-class terms;
  MIAF profiles are the extension point for adding new principal classes.
- Remove the Physical Device Identity (PDI) concept; operator-provisioned
  CAs are a first-class bootstrap trust source.
- Drop the mediated vs. direct bootstrap-method distinction.

API field naming:
- All JSON field names converted to camelCase across discovery,
  enrollment, renewal, and error payloads (e.g., svidProfileUri,
  bootstrapCredential, trustDomain, trustBundleUri).
- Discovery document: drop recommendedSvidProfileUri; simplify
  svidProfilesSupported to an array of profile URI strings.

New sibling SUP — WFM Client Identity Profile and Margo Management
Interface Update (proposals/wfm-client-identity-profile.md):
- Defines the WFM Server Identity, Logical WFM Client Identity, WFM
  Client Binding Assertion, wfm-id, client-handle, and Binding Subject.
- WFM Client bootstrap: WFM issues a binding assertion; the client
  enrolls with MIS via the MIAF POST /api/v1/identities endpoint.
- Standalone topology only (cluster topology deferred).
- Margo Management Interface: removes the PR1 onboarding and CA
  distribution endpoints, drops {clientId} from retained paths, and
  replaces RFC 9421 HTTP Message Signatures with mTLS using the WFM
  Client X.509-SVID. RFC 9421 is explicitly listed as rejected.

Deferred SUPs (proposals/deferred/) — additive on top of v0:
- miaf-non-mtls-environments.md — JWT-SVID profile, JWT SVID Exchange
  endpoint, JWT-Bearer auth for renewal, Factory Certificate (JWT
  Assertion) method, for environments with TLS-terminating proxies.
- miaf-fdo-bootstrap-method.md — FIDO Device Onboard bootstrap method.
- miaf-enrollment-token-bootstrap-method.md — operator-issued
  enrollment token bootstrap method.
- miaf-revocation-list.md — MIS revocation-list endpoint and revocation
  model (v0 relies on short SVID lifetimes for revocation-by-expiry).
- miaf-device-replacement.md — LDI rebinding semantics and the
  replacementAuthorization request field.
- miaf-oauth2-bridge.md — OAuth 2.0 Token Exchange bridge mapping
  SVIDs onto access tokens for API-gateway interop (informative).
- miaf-multi-holder-identities-and-cluster-topology.md — multi-holder
  LDI primitive and the WFM Client cluster topology profile.

Reference implementation
(proposals/margo-identity-and-authorization-framework-poc/):
- Update DTOs, discovery/enrollment/renewal/revocation handlers, JWT
  SVID handler, golden response fixtures, and tests to match the
  camelCase API.

Signed-off-by: Matthias Lechner <matthiasl@zededa.com>
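
The unknown-fields rule and the camelCase discovery document described in the commit message above could be consumed along these lines. This is a rough sketch under stated assumptions: it assumes the rule means a receiver ignores fields it does not recognize, and that `trustDomain`, `trustBundleUri`, and `svidProfilesSupported` are all members of the discovery document. The class name, function name, and all example values are hypothetical.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DiscoveryDocument:
    # Field names follow the camelCase names listed in the commit message;
    # their grouping into one document is an assumption of this sketch.
    trustDomain: str
    trustBundleUri: str
    svidProfilesSupported: tuple  # profile URI strings


def parse_discovery(raw: str) -> DiscoveryDocument:
    """Parse a discovery document, silently ignoring unknown fields."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("discovery document must be a JSON object")

    known = {f.name for f in fields(DiscoveryDocument)}
    missing = known - data.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")

    profiles = data["svidProfilesSupported"]
    if not isinstance(profiles, list) or not all(isinstance(p, str) for p in profiles):
        raise ValueError("svidProfilesSupported must be an array of strings")

    # Only known fields are copied; anything else in `data` is dropped,
    # so payloads extended by a later SUP still parse.
    return DiscoveryDocument(
        trustDomain=data["trustDomain"],
        trustBundleUri=data["trustBundleUri"],
        svidProfilesSupported=tuple(profiles),
    )
```

A payload that carries an extra key, for example `"futureField": 1`, parses to the same `DiscoveryDocument` as one without it, which is the behavior a v0 client would need in order to tolerate later extensions.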