4 changes: 2 additions & 2 deletions mesh-llm/build.rs
@@ -27,6 +27,6 @@ fn compile_node_proto() {
std::env::set_var("PROTOC", protoc);

prost_build::Config::new()
-.compile_protos(&["proto/node.proto"], &["proto"])
-.expect("compile node proto");
+.compile_protos(&["proto/node.proto", "proto/config.proto"], &["proto"])
+.expect("compile mesh and config proto");
}
4 changes: 2 additions & 2 deletions mesh-llm/docs/DESIGN.md
@@ -49,13 +49,13 @@ enum NodeRole {
}
```

-Roles are exchanged via gossip. Preferred peers use `meshllm.node.v1` protobuf on QUIC ALPN `mesh-llm/1`; legacy peers may still negotiate `mesh-llm/0` and use the older JSON gossip payloads. A node transitions Worker → Host when elected.
+Roles are exchanged via gossip. Preferred peers use protobuf on QUIC ALPN `mesh-llm/1`, with `meshllm.node.v1` for mesh state and `meshllm.config.v1` for owner-gated config sync; legacy peers may still negotiate `mesh-llm/0` and use the older JSON gossip payloads. A node transitions Worker → Host when elected.

A newly connected peer is quarantined until it sends a valid `GossipFrame` with `gen = 1` (quarantine-until-gossip admission model). Only streams 0x01 (GOSSIP) and 0x05 (ROUTE_REQUEST) are accepted before admission. All other streams are rejected until the peer is admitted.

## Control-Plane Protocol

-The control plane prefers QUIC ALPN `mesh-llm/1` using the `meshllm.node.v1` protobuf schema. Scoped control-plane streams on `/1` use 4-byte LE framing followed by protobuf bytes. For backward compatibility, peers may also negotiate `mesh-llm/0`, which preserves the legacy JSON/raw payloads on those same streams.
+The control plane prefers QUIC ALPN `mesh-llm/1` using split protobuf schemas: `meshllm.node.v1` for mesh state and `meshllm.config.v1` for config sync. Scoped control-plane streams on `/1` use 4-byte LE framing followed by protobuf bytes. For backward compatibility, peers may also negotiate `mesh-llm/0`, which preserves the legacy JSON/raw payloads on those same streams.

Mixed meshes containing both `/0` and `/1` nodes are supported. `/0` links are compatibility mode only, so they do not carry protobuf-only fields.

2 changes: 1 addition & 1 deletion mesh-llm/docs/TESTING.md
@@ -479,7 +479,7 @@ mesh-llm --model Qwen3-Coder-Next-Q4_K_M --auto --no-self-update --split --join

## Control-Plane Protocol (Protobuf v1)

-The control plane prefers QUIC ALPN `mesh-llm/1` using the `meshllm.node.v1` protobuf schema. On `/1`, all five scoped control-plane streams use 4-byte LE framing followed by protobuf bytes. For backward compatibility, nodes may also negotiate `mesh-llm/0`, which keeps the legacy JSON/raw payloads on those same streams.
+The control plane prefers QUIC ALPN `mesh-llm/1` using split protobuf schemas: `meshllm.node.v1` for mesh state and `meshllm.config.v1` for config sync. On `/1`, all scoped protobuf control-plane streams use 4-byte LE framing followed by protobuf bytes. For backward compatibility, nodes may also negotiate `mesh-llm/0`, which keeps the legacy JSON/raw payloads on those same streams.

| Stream | Type | Format |
|--------|------|--------|
13 changes: 10 additions & 3 deletions mesh-llm/docs/message_protocol.md
@@ -1,6 +1,6 @@
# mesh-llm Message Protocol

-This document describes the wire protocol for control-plane communication between mesh-llm nodes. Control-plane traffic prefers the `meshllm.node.v1` protobuf schema on QUIC ALPN `mesh-llm/1`, with backward-compatible support for the legacy `mesh-llm/0` JSON/raw payloads.
+This document describes the wire protocol for control-plane communication between mesh-llm nodes. Control-plane traffic on QUIC ALPN `mesh-llm/1` uses protobuf, split between `meshllm.node.v1` for mesh state and `meshllm.config.v1` for owner-gated config sync, with backward-compatible support for the legacy `mesh-llm/0` JSON/raw payloads.

## ALPN

@@ -24,12 +24,14 @@ Each QUIC connection carries multiple logical streams, distinguished by a 1-byte
| 0x08 | BLACKBOARD | bidirectional | admission-gated auxiliary channel |
| 0x09 | PLUGIN_CHANNEL | bidirectional | plugin protocol (see Out-of-Scope) |
| 0x0a | PLUGIN_BULK_TRANSFER | send | plugin protocol bulk data (see Out-of-Scope) |
+| 0x0b | CONFIG_SUBSCRIBE | bidirectional | protobuf `ConfigSubscribe` / `ConfigSnapshotResponse` / `ConfigUpdateNotification` |
+| 0x0c | CONFIG_PUSH | bidirectional | protobuf `ConfigPush` / `ConfigPushResponse` |

Streams 0x02 and 0x04 are raw TCP relay tunnels. They carry llama.cpp RPC and HTTP traffic respectively and are not subject to protobuf framing or generation validation.

## Framing

-All protobuf control-plane streams (0x01, 0x03, 0x05, 0x06, 0x07) use the same framing:
+All protobuf control-plane streams (0x01, 0x03, 0x05, 0x06, 0x07, 0x0b, 0x0c) use the same framing:

```
[1 byte stream type][4 bytes LE length][N bytes protobuf body]
@@ -48,13 +50,18 @@ Every protobuf message that carries a `gen` field must have `gen == 1`. Frames w
- `RouteTable.gen`
- `PeerDown.gen`
- `PeerLeaving.gen`
+- `ConfigSubscribe.gen`
+- `ConfigSnapshotResponse.gen`
+- `ConfigUpdateNotification.gen`
+- `ConfigPush.gen`
+- `ConfigPushResponse.gen`
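
Reviewer sketch (not part of the diff): the framing layout and the `gen == 1` rule described in this file could be exercised roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation. `FrameError`, `encode_frame`, `decode_frame`, `check_gen`, and the `MAX_BODY_LEN` cap are hypothetical names and values; only the byte layout and the required generation value come from the document, and protobuf decoding of the body is left out.

```rust
/// Generation value required by the protocol text (`gen == 1`).
const NODE_PROTOCOL_GENERATION: u32 = 1;

/// Hypothetical cap on a frame body; the real limit is not stated in the docs.
const MAX_BODY_LEN: usize = 8 * 1024 * 1024;

#[derive(Debug, PartialEq)]
enum FrameError {
    Truncated,
    TooLarge(usize),
    BadGeneration(u32),
}

/// Build `[1 byte stream type][4 bytes LE length][N bytes body]`.
fn encode_frame(stream_type: u8, body: &[u8]) -> Result<Vec<u8>, FrameError> {
    if body.len() > MAX_BODY_LEN {
        return Err(FrameError::TooLarge(body.len()));
    }
    let mut out = Vec::with_capacity(5 + body.len());
    out.push(stream_type);
    // The cast is lossless because the length was bounded above.
    out.extend_from_slice(&(body.len() as u32).to_le_bytes());
    out.extend_from_slice(body);
    Ok(out)
}

/// Split a buffer into (stream type, body), validating the length prefix.
fn decode_frame(buf: &[u8]) -> Result<(u8, &[u8]), FrameError> {
    if buf.len() < 5 {
        return Err(FrameError::Truncated);
    }
    let mut len_bytes = [0u8; 4];
    len_bytes.copy_from_slice(&buf[1..5]);
    let len = u32::from_le_bytes(len_bytes) as usize;
    if len > MAX_BODY_LEN {
        return Err(FrameError::TooLarge(len));
    }
    // `get` returns None when the buffer is shorter than the declared length.
    let body = buf.get(5..5 + len).ok_or(FrameError::Truncated)?;
    Ok((buf[0], body))
}

/// Reject any decoded message whose `gen` field is not the expected generation.
fn check_gen(gen: u32) -> Result<(), FrameError> {
    if gen == NODE_PROTOCOL_GENERATION {
        Ok(())
    } else {
        Err(FrameError::BadGeneration(gen))
    }
}
```

In this sketch a receiver would call `decode_frame`, decode the body with the generated protobuf type for that stream, and then pass the message's `gen` field to `check_gen` before acting on it.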

## Admission (Quarantine-Until-Gossip)

A newly connected peer is quarantined until it sends a valid `GossipFrame` with `gen = 1`. Until admission:

- Only stream 0x01 (GOSSIP) and 0x05 (ROUTE_REQUEST) are accepted.
-- All other streams (0x02, 0x03, 0x04, 0x06, 0x07, 0x08, 0x09, 0x0a) are rejected and the stream is closed.
+- All other streams (0x02, 0x03, 0x04, 0x06, 0x07, 0x08, 0x09, 0x0a, 0x0b, 0x0c) are rejected and the stream is closed.
- The QUIC connection itself stays open so gossip can complete.

A peer is admitted when its negotiated gossip payload decodes successfully and passes validation checks. On `/1` this is a protobuf `GossipFrame`; on `/0` this is the legacy JSON gossip payload.
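
Reviewer sketch (not part of the diff): the quarantine rule above amounts to a small gate on the stream-type byte. The version below is an illustration only. `PeerState`, `StreamDecision`, `gate_stream`, and the constant names are hypothetical; the stream values 0x01 and 0x05 are taken from the stream table. It models only the pre-admission rule and assumes an admitted peer is not restricted by this gate.

```rust
// Stream-type bytes from the protocol table; the constant names are assumed.
const STREAM_GOSSIP: u8 = 0x01;
const STREAM_ROUTE_REQUEST: u8 = 0x05;

#[derive(Debug, Clone, Copy, PartialEq)]
enum PeerState {
    /// Connected, but no valid gossip frame received yet.
    Quarantined,
    /// A valid gossip payload was decoded and passed validation.
    Admitted,
}

#[derive(Debug, PartialEq)]
enum StreamDecision {
    Accept,
    /// Close only this stream; the QUIC connection stays open.
    RejectAndCloseStream,
}

/// Decide whether an incoming stream may proceed for a peer in `state`.
fn gate_stream(state: PeerState, stream_type: u8) -> StreamDecision {
    match (state, stream_type) {
        (PeerState::Admitted, _) => StreamDecision::Accept,
        (PeerState::Quarantined, STREAM_GOSSIP)
        | (PeerState::Quarantined, STREAM_ROUTE_REQUEST) => StreamDecision::Accept,
        (PeerState::Quarantined, _) => StreamDecision::RejectAndCloseStream,
    }
}
```

With this shape, the new config streams (0x0b, 0x0c) fall into the default reject arm for quarantined peers without needing their own entries.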
90 changes: 90 additions & 0 deletions mesh-llm/proto/config.proto
@@ -0,0 +1,90 @@
+syntax = "proto3";
+package meshllm.config.v1;
+
+message NodeConfigSnapshot {
+uint32 version = 1; // config schema version (currently 1)
+NodeGpuConfig gpu = 2;
+repeated NodeModelEntry models = 3;
+repeated NodePluginEntry plugins = 4;
+}
+
+enum GpuAssignment {
+GPU_ASSIGNMENT_UNSPECIFIED = 0;
+GPU_ASSIGNMENT_AUTO = 1;
+GPU_ASSIGNMENT_PINNED = 2;
+}
+
+message NodeGpuConfig {
+GpuAssignment assignment = 1;
+}
+
+message ConfiguredModelRef {
+string declared_ref = 1;
+optional string source_kind = 2;
+optional string revision = 3;
+}
+
+message NodeModelEntry {
+string model = 1;
+optional string mmproj = 2;
+optional uint32 ctx_size = 3;
+optional string gpu_id = 4;
+ConfiguredModelRef model_ref = 5;
+ConfiguredModelRef mmproj_ref = 6;
+}
+
+message NodePluginEntry {
+string name = 1;
+optional bool enabled = 2;
+optional string command = 3;
+repeated string args = 4;
+}
+
+message ConfigSubscribe {
+uint32 gen = 1; // must equal NODE_PROTOCOL_GENERATION
+bytes subscriber_id = 2; // 32 bytes — subscribing node's endpoint id
+}
+
+message ConfigSnapshotResponse {
+uint32 gen = 1; // must equal NODE_PROTOCOL_GENERATION
+bytes node_id = 2; // 32 bytes — the node this config belongs to
+uint64 revision = 4;
+bytes config_hash = 5; // SHA-256 of canonical proto bytes (32 bytes)
+NodeConfigSnapshot config = 6;
+optional string hostname = 7; // convenience: node hostname for display
+optional string error = 8; // set when an error occurred; config/hash/node_id may be empty
+}
+
+message ConfigUpdateNotification {
+uint32 gen = 1;
+bytes node_id = 2;
+uint64 revision = 4;
+bytes config_hash = 5;
+NodeConfigSnapshot config = 6;
+}
+
+message ConfigPush {
+uint32 gen = 1;
+bytes requester_id = 2;
+bytes target_node_id = 3;
+uint64 expected_revision = 5;
+NodeConfigSnapshot config = 6;
+bytes owner_signing_public_key = 7;
+bytes signature = 8;
+}
+
+enum ConfigApplyMode {
+CONFIG_APPLY_MODE_UNSPECIFIED = 0;
+CONFIG_APPLY_MODE_STAGED = 1;
+CONFIG_APPLY_MODE_LIVE = 2;
+CONFIG_APPLY_MODE_NOOP = 3;
+}
+
+message ConfigPushResponse {
+uint32 gen = 1;
+bool success = 2;
+uint64 current_revision = 3;
+bytes config_hash = 4;
+optional string error = 5;
+ConfigApplyMode apply_mode = 8;
+}
88 changes: 0 additions & 88 deletions mesh-llm/proto/node.proto
@@ -225,91 +225,3 @@ enum NodeRole {
HOST = 2;
CLIENT = 3;
}

-message NodeConfigSnapshot {
-uint32 version = 1; // config schema version (currently 1)
-NodeGpuConfig gpu = 2;
-repeated NodeModelEntry models = 3;
-repeated NodePluginEntry plugins = 4;
-}
-
-enum GpuAssignment {
-GPU_ASSIGNMENT_UNSPECIFIED = 0;
-GPU_ASSIGNMENT_AUTO = 1;
-GPU_ASSIGNMENT_PINNED = 2;
-}
-
-message NodeGpuConfig {
-GpuAssignment assignment = 1;
-}
-
-message ConfiguredModelRef {
-string declared_ref = 1;
-optional string source_kind = 2;
-optional string revision = 3;
-}
-
-message NodeModelEntry {
-string model = 1;
-optional string mmproj = 2;
-optional uint32 ctx_size = 3;
-optional string gpu_id = 4;
-ConfiguredModelRef model_ref = 5;
-ConfiguredModelRef mmproj_ref = 6;
-}
-
-message NodePluginEntry {
-string name = 1;
-optional bool enabled = 2;
-optional string command = 3;
-repeated string args = 4;
-}
-
-message ConfigSubscribe {
-uint32 gen = 1; // must equal NODE_PROTOCOL_GENERATION
-bytes subscriber_id = 2; // 32 bytes — subscribing node's endpoint id
-}
-
-message ConfigSnapshotResponse {
-uint32 gen = 1; // must equal NODE_PROTOCOL_GENERATION
-bytes node_id = 2; // 32 bytes — the node this config belongs to
-uint64 revision = 4;
-bytes config_hash = 5; // SHA-256 of canonical proto bytes (32 bytes)
-NodeConfigSnapshot config = 6;
-optional string hostname = 7; // convenience: node hostname for display
-optional string error = 8; // set when an error occurred; config/hash/node_id may be empty
-}
-
-message ConfigUpdateNotification {
-uint32 gen = 1;
-bytes node_id = 2;
-uint64 revision = 4;
-bytes config_hash = 5;
-NodeConfigSnapshot config = 6;
-}
-
-message ConfigPush {
-uint32 gen = 1;
-bytes requester_id = 2;
-bytes target_node_id = 3;
-uint64 expected_revision = 5;
-NodeConfigSnapshot config = 6;
-bytes owner_signing_public_key = 7;
-bytes signature = 8;
-}
-
-enum ConfigApplyMode {
-CONFIG_APPLY_MODE_UNSPECIFIED = 0;
-CONFIG_APPLY_MODE_STAGED = 1;
-CONFIG_APPLY_MODE_LIVE = 2;
-CONFIG_APPLY_MODE_NOOP = 3;
-}
-
-message ConfigPushResponse {
-uint32 gen = 1;
-bool success = 2;
-uint64 current_revision = 3;
-bytes config_hash = 4;
-optional string error = 5;
-ConfigApplyMode apply_mode = 8;
-}
6 changes: 5 additions & 1 deletion mesh-llm/src/lib.rs
@@ -12,7 +12,11 @@ mod runtime;
mod system;

pub mod proto {
-pub mod node {
+pub mod config {
+include!(concat!(env!("OUT_DIR"), "/meshllm.config.v1.rs"));
+}
+
+pub mod mesh {
include!(concat!(env!("OUT_DIR"), "/meshllm.node.v1.rs"));
}
}
4 changes: 2 additions & 2 deletions mesh-llm/src/mesh/heartbeat.rs
@@ -671,7 +671,7 @@ impl Node {
send.write_all(&[STREAM_PEER_DOWN]).await?;
match protocol {
ControlProtocol::ProtoV1 => {
-let proto_msg = crate::proto::node::PeerDown {
+let proto_msg = crate::proto::mesh::PeerDown {
peer_id: bytes,
gen: NODE_PROTOCOL_GENERATION,
};
@@ -715,7 +715,7 @@ impl Node {
send.write_all(&[STREAM_PEER_LEAVING]).await?;
match protocol {
ControlProtocol::ProtoV1 => {
-let proto_msg = crate::proto::node::PeerLeaving {
+let proto_msg = crate::proto::mesh::PeerLeaving {
peer_id: bytes,
gen: NODE_PROTOCOL_GENERATION,
};