@@ -0,0 +1,181 @@
# Milestone 1: Unified Gate with Composable Conditions

## Goal and Scope

Implement the core Gate + Channel architecture: a single unified Gate entity with a persistent data store and composable conditions. No class hierarchy of gate types — one Gate concept, three condition types, one set of MCP tools. This is the foundation that all other milestones build on.

## Architecture

### The Unified Gate

Every gate in the workflow is the same entity. The only variation is the **condition config** — a small predicate that checks the gate's data store.

```typescript
// In packages/shared/src/types/space.ts
interface Gate {
  id: string;                     // e.g., 'plan-pr-gate', 'review-votes-gate'
  condition: GateCondition;       // composable predicate, NOT a class hierarchy
  data: Record<string, unknown>;  // persistent data store (SQLite)
  allowedWriterRoles: string[];   // who can write: ['planner'], ['reviewer'], etc.
  description: string;            // human-readable; injected into agent task messages
  resetOnCycle: boolean;          // whether data resets when a cyclic channel fires
}

// Three condition types cover ALL workflow behaviors
type GateCondition =
  | { type: 'always' }
  | { type: 'check'; field: string; op?: '==' | '!=' | 'exists'; value?: unknown }
  | { type: 'count'; field: string; matchValue: unknown; min: number };
```

### Condition Evaluation

One `evaluateGate(gate)` function with a switch on `condition.type`:

- **`always`**: Returns `true`.
- **`check`**: Checks a single field in `gate.data`.
  - `op: 'exists'` (default if no `op`): `data[field] != null && data[field] !== ''`
  - `op: '=='`: `data[field] === value`
  - `op: '!='`: `data[field] !== value`
- **`count`**: Counts entries in a map field that match a value.
  - `Object.values(data[field] || {}).filter(v => v === matchValue).length >= min`
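
The rules above can be sketched as a single function. This is a minimal illustration, not the final implementation: `GateLike` is a hypothetical trimmed-down type holding only the two fields the evaluator reads, and the guard against non-object map fields is an added assumption.

```typescript
type GateCondition =
  | { type: 'always' }
  | { type: 'check'; field: string; op?: '==' | '!=' | 'exists'; value?: unknown }
  | { type: 'count'; field: string; matchValue: unknown; min: number };

// Hypothetical subset of Gate: only what evaluation needs.
interface GateLike {
  condition: GateCondition;
  data: Record<string, unknown>;
}

function evaluateGate(gate: GateLike): boolean {
  const { condition, data } = gate;
  switch (condition.type) {
    case 'always':
      return true;
    case 'check': {
      const current = data[condition.field];
      const op = condition.op ?? 'exists'; // 'exists' is the default op
      if (op === 'exists') return current != null && current !== '';
      if (op === '==') return current === condition.value;
      return current !== condition.value; // op === '!='
    }
    case 'count': {
      const { field, matchValue, min } = condition;
      const raw = data[field];
      // Assumption: a missing or non-object field is treated as an empty map.
      const map = typeof raw === 'object' && raw !== null
        ? (raw as Record<string, unknown>)
        : {};
      const matches = Object.values(map).filter((v) => v === matchValue).length;
      return matches >= min;
    }
  }
}

// Usage: a vote gate with two approvals stays blocked until a third arrives.
const votesGate: GateLike = {
  condition: { type: 'count', field: 'votes', matchValue: 'approve', min: 3 },
  data: { votes: { r1: 'approve', r2: 'approve' } },
};
console.log(evaluateGate(votesGate)); // false
```

Because every gate shares this one function, adding a new workflow gate means adding a condition config, not a new code path.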

### How Each Workflow Gate Maps to Conditions

| Gate ID | Condition Config | Passes when... |
|---------|-----------------|----------------|
| `plan-pr-gate` | `{ type: 'check', field: 'prUrl' }` | Planner writes PR URL |
| `plan-approval-gate` | `{ type: 'check', field: 'approved', op: '==', value: true }` | Human approves |
| `code-pr-gate` | `{ type: 'check', field: 'prUrl' }` | Coder writes PR URL |
| `review-votes-gate` | `{ type: 'count', field: 'votes', matchValue: 'approve', min: 3 }` | ≥3 reviewers approve |
| `review-reject-gate` | `{ type: 'check', field: 'result', op: '==', value: 'rejected' }` | Any reviewer rejects |
| `qa-result-gate` | `{ type: 'check', field: 'result', op: '==', value: 'passed' }` | QA passes |
| `qa-fail-gate` | `{ type: 'check', field: 'result', op: '==', value: 'failed' }` | QA fails |

**Note**: These are all the same Gate entity with different condition configs. No `PRGate`, `AggregateGate`, `HumanGate` classes.

## Tasks

### Task 1.1: Implement Unified Gate Type and Data Store Schema

**Description**: Replace the existing separate gate type system in `packages/shared/src/types/space.ts` with the unified `Gate` interface. Add the `gate_data` SQLite table for persistent data stores.

**Subtasks**:
1. Audit the existing gate types in `packages/shared/src/types/space.ts` — currently supports `always`, `human`, `condition`, `task_result` as separate types
2. Replace with the unified `Gate` interface: `{ id, condition: GateCondition, data, allowedWriterRoles, description, resetOnCycle }`
3. Define the `GateCondition` discriminated union with three types: `always`, `check`, `count`
4. Create a dedicated `gate_data` table in SQLite keyed by `(run_id, gate_id)` with a JSON `data` column. Rationale: (a) gate data changes frequently during a run while gate definitions are static, (b) gate data is per-run while gate definitions are per-workflow template, (c) a separate table allows atomic reads and writes of one gate's data without deserializing a larger JSON blob, (d) concurrent writes (e.g., 3 reviewers voting) each touch a single small row.
5. Add `allowedWriterRoles: string[]` to the gate definition schema (static, per-gate)
6. Add `resetOnCycle: boolean` to the gate definition schema — controls whether data is cleared on cyclic channel traversal
7. Add `failureReason` optional field to `SpaceWorkflowRun` interface: `failureReason?: 'humanRejected' | 'maxIterationsReached' | 'nodeTimeout' | 'agentCrash'`. All failure scenarios use existing `'needs_attention'` status with this field.
8. Migrate existing gate definitions to the new unified format (backward-compatible: map old `human` type to `{ type: 'check', field: 'approved', op: '==', value: true }`, etc.)
9. Unit tests: type validation, schema creation, data persistence round-trip, gate_data table CRUD, backward-compatible migration
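
As a rough sketch of subtasks 4 and 8, the snippet below shows one possible `gate_data` DDL and a legacy-to-unified condition mapping. The column types, the `LegacyGate` shape, and the function name `migrateLegacyCondition` are assumptions for illustration. Only the `human` mapping is stated in this plan; the `always` mapping is the obvious identity, and the mappings for `condition` and `task_result` are deliberately left unresolved because the plan does not specify them.

```typescript
type GateCondition =
  | { type: 'always' }
  | { type: 'check'; field: string; op?: '==' | '!=' | 'exists'; value?: unknown }
  | { type: 'count'; field: string; matchValue: unknown; min: number };

// Assumed DDL: one row per (run, gate), with the data store serialized as JSON text.
const GATE_DATA_DDL = `
CREATE TABLE IF NOT EXISTS gate_data (
  run_id  TEXT NOT NULL,
  gate_id TEXT NOT NULL,
  data    TEXT NOT NULL DEFAULT '{}',
  PRIMARY KEY (run_id, gate_id)
)`;

// Hypothetical shape of a pre-migration gate; the real legacy type may differ.
interface LegacyGate {
  id: string;
  type: 'always' | 'human' | 'condition' | 'task_result';
}

// Returns undefined for legacy types whose mapping this plan does not spell out.
function migrateLegacyCondition(legacy: LegacyGate): GateCondition | undefined {
  switch (legacy.type) {
    case 'always':
      return { type: 'always' };
    case 'human':
      return { type: 'check', field: 'approved', op: '==', value: true };
    default:
      return undefined; // 'condition' and 'task_result': mapping not specified here
  }
}
```

Returning `undefined` rather than guessing keeps unmapped legacy gates visible to the migration's caller, which can then fail loudly or log them.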

**Acceptance Criteria**:
- Single unified `Gate` interface replaces all separate gate types
- Three condition types (`always`, `check`, `count`) cover all workflow behaviors
- Gate data persisted to SQLite `gate_data` table and survives daemon restart
- Existing gate definitions are migrated to unified format
- Unit tests verify persistence round-trip and migration

**Depends on**: nothing

**Agent type**: coder

---

### Task 1.2: Implement Unified Gate Evaluator

**Description**: Implement a single `evaluateGate(gate)` function that handles all three condition types. Replace the existing per-type evaluator logic in `ChannelGateEvaluator`.

**Subtasks**:
1. Create `evaluateGate(gate: Gate): boolean` function:
   - Switch on `gate.condition.type`
   - `always` → return `true`
   - `check` → read `gate.data[field]`, apply op (`exists`, `==`, `!=`)
   - `count` → read `gate.data[field]` as a map, count values matching `matchValue`, check `>= min`
2. Refactor `ChannelGateEvaluator` to call `evaluateGate()` instead of per-type logic
3. Ensure the evaluator reads from the gate's `data` store (from `gate_data` table), not from workflow run config
4. Handle edge cases: missing field → `check` with `exists` returns false; missing map field → `count` sees 0 matches, so the gate stays blocked for any `min` ≥ 1
5. Remove the old per-type evaluator code paths (`human`, `pr`, `aggregate`, `task_result` as separate branches)
6. Unit tests for each condition type with various data states, including edge cases (null data, empty map, missing field)

**Acceptance Criteria**:
- Single `evaluateGate()` function handles all conditions
- No separate evaluator per gate type — one code path with a 3-way switch
- All existing gate behaviors continue to work (verified by backward-compat tests)
- Unit tests cover all condition types and edge cases

**Depends on**: Task 1.1

**Agent type**: coder

---

### Task 1.3: Implement `read_gate`, `write_gate`, and `list_gates` MCP Tools

**Description**: Create MCP tools that allow node agents to discover, read from, and write to gate data stores. These tools are added to the `node-agent-tools` MCP server. All gates use the same tools — no type-specific APIs.

**Subtasks**:
1. Add `list_gates` MCP tool to `node-agent-tools`:
   - Parameters: none (uses the current workflow run context from the MCP server config)
   - Returns: array of `{ gateId, condition, description, allowedWriterRoles, currentData }` for all gates in the run
   - Agents call this at session start to discover available gates
2. Add `read_gate` MCP tool to `node-agent-tools`:
   - Parameters: `{ gateId: string }`
   - Returns: the gate's current `data` object from the `gate_data` table
3. Add `write_gate` MCP tool to `node-agent-tools`:
   - Parameters: `{ gateId: string, data: Record<string, unknown> }` (merge semantics: new keys added, existing keys updated)
   - **Authorization check**: reads calling agent's `nodeRole` from MCP server config, compares against gate's `allowedWriterRoles`. Unauthorized → error: `"Permission denied: role '{role}' cannot write to gate '{gateId}'"`
   - Persists updated data to `gate_data` table
   - Triggers gate re-evaluation (may unblock channel)
4. Wire tools into `TaskAgentManager` with workflow run context (runId, gate definitions)
5. **Workflow context injection**: When `TaskAgentManager.spawnSubSession()` creates a node agent, inject `workflowContext` into the task message containing: upstream/downstream gate IDs, condition descriptions, and human-readable instructions (e.g., "code-pr-gate: write your PR URL here after creating the PR")
6. **Vote keys**: For gates using `count` condition (vote counting), use `nodeId` (not `agentId`) as the map key. Prevents collision if an agent is re-spawned after a crash.
7. Unit tests: list_gates, read/write round-trip, permission enforcement, gate re-evaluation on write, vote key collision handling
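
The authorization check and merge semantics of `write_gate` might look roughly like the sketch below. It uses an in-memory `Map` in place of the `gate_data` table, and the helper names (`mergeGateData`, `writeGate`, `GateDef`) are hypothetical. One assumption is worth calling out: map-valued fields such as `votes` are merged one level deep, because a purely shallow merge would let one reviewer's `{ votes: { ... } }` write replace another reviewer's vote.

```typescript
type GateData = Record<string, unknown>;

// Hypothetical subset of the gate definition used by the write path.
interface GateDef {
  id: string;
  allowedWriterRoles: string[];
}

function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === 'object' && v !== null && !Array.isArray(v);
}

// Assumption: nested maps merge one level deep so concurrent votes accumulate.
function mergeGateData(current: GateData, patch: GateData): GateData {
  const merged: GateData = { ...current };
  for (const [key, value] of Object.entries(patch)) {
    const existing = merged[key];
    merged[key] =
      isPlainObject(existing) && isPlainObject(value)
        ? { ...existing, ...value }
        : value;
  }
  return merged;
}

function writeGate(
  gate: GateDef,
  store: Map<string, GateData>, // stand-in for the gate_data table
  runId: string,
  nodeRole: string,
  patch: GateData,
): GateData {
  if (!gate.allowedWriterRoles.includes(nodeRole)) {
    throw new Error(
      `Permission denied: role '${nodeRole}' cannot write to gate '${gate.id}'`,
    );
  }
  const key = `${runId}:${gate.id}`; // mirrors the (run_id, gate_id) key
  const next = mergeGateData(store.get(key) ?? {}, patch);
  store.set(key, next);
  return next; // a real tool would trigger gate re-evaluation here
}
```

With this merge rule, two reviewers writing `{ votes: { 'reviewer-1': 'approve' } }` and `{ votes: { 'reviewer-2': 'approve' } }` end up with both entries in `votes`.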

**Acceptance Criteria**:
- All gates use the same `read_gate`/`write_gate`/`list_gates` tools — no type-specific APIs
- Writing to a gate triggers re-evaluation (may unblock downstream channel)
- Permission model prevents unauthorized writes (clear error message)
- Workflow context injection provides gate IDs in task message
- Unit tests verify all tool behaviors

**Depends on**: Task 1.2

**Agent type**: coder

---

### Task 1.4: Integrate Unified Gate with Channel Router

**Description**: Update the `ChannelRouter` to use the unified gate's data store and `evaluateGate()` for routing decisions. Implement gate data reset on cyclic traversal using the `resetOnCycle` flag.

**Subtasks**:
1. Update `ChannelRouter.deliverMessage()` to call `evaluateGate(gate)` using the gate's data store
2. Add `onGateDataChanged(gateId)` method that triggers re-evaluation of the associated channel
3. When a gate transitions from blocked → passed, activate the target node
4. Handle vote-counting gates: multiple agents write to the same gate. Each write triggers re-evaluation, but only the write that brings the matching vote count up to `min` unblocks the channel.
5. **Implement `resetOnCycle` behavior**: When the `ChannelRouter` traverses a cyclic channel, reset the `data` to `{}` for all downstream gates where `resetOnCycle === true`. Specifically in the V2 workflow:
   - `review-votes-gate` (`resetOnCycle: true`) → resets to `{}`; all 3 reviewers must re-vote
   - `review-reject-gate` (`resetOnCycle: true`) → resets to `{}`
   - `qa-result-gate` (`resetOnCycle: true`) → resets to `{}`
   - `qa-fail-gate` (`resetOnCycle: true`) → resets to `{}`
   - `code-pr-gate` (`resetOnCycle: false`) → **preserved** (PR URL doesn't change)
   - The reset is atomic with the cyclic traversal (same SQLite transaction)
6. Ensure gate data changes are persisted before evaluation (SQLite transactions for atomic read-evaluate-write)
7. Handle concurrent writes (e.g., 3 reviewers voting simultaneously): serialize via SQLite write lock, re-evaluate after each write
8. Unit tests: gate transition triggers node activation, vote-counting gate with incremental writes, concurrent write handling, **resetOnCycle behavior** (verify data cleared for resetOnCycle:true, preserved for resetOnCycle:false)
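
The `resetOnCycle` rule can be illustrated with a small in-memory sketch. The function name `resetGatesOnCycle` and the `Map`-based store are assumptions; the real router would perform the same reset against the `gate_data` table inside the transaction that records the cyclic traversal, and would pass in only the gates downstream of the cyclic channel.

```typescript
// Hypothetical subset of the gate definition used by the reset path.
interface GateDef {
  id: string;
  resetOnCycle: boolean;
}

// Clears data for every gate flagged resetOnCycle; returns the IDs that were reset.
function resetGatesOnCycle(
  downstreamGates: GateDef[],
  dataByGate: Map<string, Record<string, unknown>>,
): string[] {
  const resetIds: string[] = [];
  for (const gate of downstreamGates) {
    if (gate.resetOnCycle) {
      dataByGate.set(gate.id, {});
      resetIds.push(gate.id);
    }
  }
  return resetIds;
}

// Usage with two of the V2 workflow gates listed above.
const gates: GateDef[] = [
  { id: 'review-votes-gate', resetOnCycle: true },
  { id: 'code-pr-gate', resetOnCycle: false },
];
const data = new Map<string, Record<string, unknown>>([
  ['review-votes-gate', { votes: { 'reviewer-1': 'approve' } }],
  ['code-pr-gate', { prUrl: 'https://example.invalid/pr/1' }], // placeholder URL
]);
console.log(resetGatesOnCycle(gates, data)); // [ 'review-votes-gate' ]
```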

**Acceptance Criteria**:
- Channel router uses unified `evaluateGate()` for all routing
- Gate data changes trigger re-evaluation and potential node activation
- Vote-counting gates handle incremental writes correctly
- `resetOnCycle` flag controls which gates are cleared on cyclic traversal
- No race conditions (SQLite transactions)
- Concurrent writes serialized correctly
- Unit tests cover all scenarios including reset behavior

**Depends on**: Task 1.3

**Agent type**: coder
@@ -0,0 +1,107 @@
# Milestone 2: Enhanced Node Agent Prompts

## Goal and Scope

Upgrade the system prompts for planner, coder, reviewer, and QA node agents in the Space system. Prompts must include git workflow, PR management, review posting, and — critically — instructions for interacting with gate data stores via `list_gates`/`read_gate`/`write_gate` MCP tools.

**Dependency note**: M2 depends on both M1 (unified gate with MCP tools) and M3 (V2 workflow template). Prompts reference specific gate IDs from the V2 workflow (e.g., `code-pr-gate`, `review-votes-gate`). **Implement M3 before M2** so the concrete gate IDs exist. The prompts use the gate IDs injected via workflow context (M1 Task 1.3 subtask 5) and reference them by the `description` field from the gate definitions. Since all gates use the same `read_gate`/`write_gate` tools, prompts don't need type-specific instructions — just "write data to gate X".

**Gate discovery pattern**: All prompts include a standard preamble: "At session start, call `list_gates` to discover available gates and their IDs. Your task message also includes a `workflowContext` block with your upstream/downstream gate IDs."
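
One way to picture the `workflowContext` block is the sketch below. The field names (`upstreamGates`, `downstreamGates`, `gateId`, `description`) and the renderer are hypothetical; the plan only states that the block carries upstream/downstream gate IDs, condition descriptions, and human-readable instructions.

```typescript
// Hypothetical shape; field names are illustrative, not taken from the codebase.
interface GateRef {
  gateId: string;
  description: string; // the gate's human-readable description
}

interface WorkflowContext {
  upstreamGates: GateRef[];
  downstreamGates: GateRef[];
}

// Renders the standard preamble plus one line per gate for the task message.
function renderWorkflowContext(ctx: WorkflowContext): string {
  const lines: string[] = [
    'At session start, call list_gates to discover available gates and their IDs.',
  ];
  for (const g of ctx.upstreamGates) {
    lines.push(`upstream ${g.gateId}: ${g.description}`);
  }
  for (const g of ctx.downstreamGates) {
    lines.push(`downstream ${g.gateId}: ${g.description}`);
  }
  return lines.join('\n');
}

// Usage: context for a coder node in the V2 workflow.
const coderContext: WorkflowContext = {
  upstreamGates: [{ gateId: 'plan-pr-gate', description: 'read the plan PR URL here' }],
  downstreamGates: [
    { gateId: 'code-pr-gate', description: 'write your PR URL here after creating the PR' },
  ],
};
console.log(renderWorkflowContext(coderContext));
```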

## Tasks

### Task 2.1: Enhance Coder Node Agent System Prompt

**Description**: Update `buildCustomAgentSystemPrompt()` in `packages/daemon/src/lib/space/agents/custom-agent.ts` to include full git workflow instructions, PR creation, and gate data writing — mirroring the Room system's `buildCoderSystemPrompt()`.

**Subtasks**:
1. Read `packages/daemon/src/lib/room/agents/coder-agent.ts` (`buildCoderSystemPrompt()`) and identify all prompt sections
2. Add bypass markers section (RESEARCH_ONLY, VERIFICATION_COMPLETE, etc.) for role 'coder'
3. Add review feedback handling: how to fetch GitHub reviews, verify feedback, push fixes
4. Add PR creation flow with duplicate prevention (`gh pr list --head`)
5. **Add gate interaction instructions**: After creating a PR, the coder must call `write_gate` on `code-pr-gate` to write PR data (`{ prUrl, prNumber, branch }`). The gate's `check: prUrl exists` condition then passes, unblocking the reviewer channel. Same `write_gate` tool as every other gate — no type-specific API.
6. Add instructions for reading upstream gate data: the coder should call `read_gate` on `plan-pr-gate` to understand the plan before coding

**Acceptance Criteria**:
- Coder agents produce the same quality of git/PR workflow as Room coder agents
- Coder writes PR data to gate after creating PR (triggers reviewer activation)
- Coder reads plan gate data to understand the plan
- Unit tests pass for updated prompt builder

**Depends on**: Milestone 1 (gate MCP tools) and Milestone 3 (V2 workflow template with concrete gate IDs)

**Agent type**: coder

---

### Task 2.2: Enhance Planner Node Agent System Prompt

**Description**: Create a specialized planner prompt that includes plan document creation, PR management, and gate data writing.

**Subtasks**:
1. Add `buildPlannerNodeAgentPrompt()` in `custom-agent.ts` for role 'planner'
2. Include plan document creation instructions (explore codebase, write plan, create PR)
3. **Add gate interaction instructions**: After creating a plan PR, the planner must call `write_gate` on `plan-pr-gate` to write PR data (`{ prUrl, prNumber, branch }`). The gate's `check: prUrl exists` condition then passes, unblocking the plan review channel.
4. Add instructions for `send_message` to communicate with plan reviewers
5. Ensure the prompt works with `injectWorkflowContext` flag

**Acceptance Criteria**:
- Planner creates plan documents on feature branches with PRs
- Planner writes PR data to `plan-pr-gate` (triggers plan review activation)
- Unit tests cover the new prompt builder

**Depends on**: Milestone 1 (gate MCP tools) and Milestone 3 (V2 workflow template with concrete gate IDs)

**Agent type**: coder

---

### Task 2.3: Enhance Reviewer Node Agent System Prompt

**Description**: Create a specialized reviewer prompt for posting PR reviews with severity classification and writing votes to the `review-votes-gate`.

**Subtasks**:
1. Add `buildReviewerNodeAgentPrompt()` in `custom-agent.ts` for role 'reviewer'
2. Include PR review process: read changed files, evaluate correctness/completeness/security
3. Add review posting via REST API (`GH_PAGER=cat gh api repos/{owner}/{repo}/pulls/{pr}/reviews`)
4. Add structured output format: `---REVIEW_POSTED---` block with URL, recommendation, severity counts
5. **Add gate interaction instructions**: Reviewer reads `code-pr-gate` (via `read_gate`) to find the PR URL, then after reviewing, writes its vote to `review-votes-gate` via `write_gate` using its **nodeId** as the vote key: `{ votes: { [nodeId]: 'approve' | 'reject' } }`. Using nodeId (not agentId) prevents collision on re-spawn. The gate's `count` condition (at least 3 entries in `votes` equal to `'approve'`) evaluates after each write.
6. When 3 reviewers all write 'approve', the `review-votes-gate` condition passes and QA is activated
7. **Add edge case guidance**: Instruct the reviewer to check current vote state via `read_gate` on `review-votes-gate` before voting. If re-spawned, check if already voted and update/confirm.
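
The effect of keying votes by `nodeId` can be shown with a short sketch. The helper names `buildVotePatch`, `applyVote`, and `countApprovals` are hypothetical; the point is that a re-spawned reviewer on the same node overwrites its earlier entry instead of adding a second one.

```typescript
type Vote = 'approve' | 'reject';
type Votes = Record<string, Vote>;

// Builds the write_gate data payload for one reviewer node.
function buildVotePatch(nodeId: string, vote: Vote): { votes: Votes } {
  return { votes: { [nodeId]: vote } };
}

// Applies a vote patch to the current votes map (later writes win per nodeId).
function applyVote(current: Votes, patch: { votes: Votes }): Votes {
  return { ...current, ...patch.votes };
}

function countApprovals(votes: Votes): number {
  return Object.values(votes).filter((v) => v === 'approve').length;
}

// Usage: reviewer-1 crashes, is re-spawned, and votes again.
let votes: Votes = {};
votes = applyVote(votes, buildVotePatch('reviewer-1', 'approve'));
votes = applyVote(votes, buildVotePatch('reviewer-1', 'approve')); // re-spawn
votes = applyVote(votes, buildVotePatch('reviewer-2', 'approve'));
console.log(countApprovals(votes)); // 2
```

Had the key been a per-session agent ID, the re-spawned reviewer would have contributed two approvals and the gate could pass with only two distinct reviewers.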

**Acceptance Criteria**:
- Reviewer reads PR URL from gate data
- Reviewer posts proper PR reviews with severity classification
- Reviewer writes vote to `review-votes-gate`
- Unit tests cover the prompt builder

**Depends on**: Milestone 1 (gate MCP tools) and Milestone 3 (V2 workflow template with concrete gate IDs)

**Agent type**: coder

---

### Task 2.4: Create QA Agent System Prompt

**Description**: Build a specialized system prompt for the QA agent that checks test coverage, runs tests, verifies CI status, and writes results to `qa-result-gate`.

**Subtasks**:
1. Add `buildQaNodeAgentPrompt()` in `custom-agent.ts` for role 'qa'
2. Include instructions for:
   - Test command detection (package.json scripts, Makefile targets, fallback commands)
   - Checking CI status via `gh pr checks` or `gh pr view --json statusCheckRollup`
   - Verifying PR mergeability via `gh pr view --json mergeable,mergeStateStatus`
   - Checking for merge conflicts
3. **Add gate interaction instructions**: QA reads `code-pr-gate` to find the PR, then writes its result to `qa-result-gate` via `write_gate` with data `{ result: 'passed' | 'failed', summary: '...' }`. The gate's `check: result == passed` condition evaluates after the write.
4. Include structured output format for QA results
5. Add `gh` CLI auth verification instructions
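
As an illustration of how the QA checks could feed the gate, the sketch below turns the parsed JSON from `gh pr view --json mergeable,mergeStateStatus` plus a test outcome into the `{ result, summary }` payload. The pass/fail policy and the names `PrMergeInfo`, `QaGateData`, and `buildQaGateData` are assumptions; only the payload shape comes from this plan. `'CONFLICTING'` is the value GitHub reports for `mergeable` when the PR has merge conflicts.

```typescript
// Parsed output of `gh pr view --json mergeable,mergeStateStatus`.
interface PrMergeInfo {
  mergeable: string;        // e.g. 'MERGEABLE', 'CONFLICTING', 'UNKNOWN'
  mergeStateStatus: string; // e.g. 'CLEAN', 'DIRTY', 'BLOCKED'
}

// Payload written to qa-result-gate via write_gate.
interface QaGateData {
  result: 'passed' | 'failed';
  summary: string;
}

// Illustrative policy: fail on test failures or merge conflicts, otherwise pass.
function buildQaGateData(pr: PrMergeInfo, testsPassed: boolean): QaGateData {
  const problems: string[] = [];
  if (!testsPassed) problems.push('tests failed');
  if (pr.mergeable === 'CONFLICTING') problems.push('merge conflicts');
  if (problems.length > 0) {
    return { result: 'failed', summary: problems.join('; ') };
  }
  return {
    result: 'passed',
    summary: `tests passed; mergeStateStatus=${pr.mergeStateStatus}`,
  };
}
```

A real prompt would likely also fold in the CI status from `gh pr checks`; that is omitted here to keep the sketch short.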

**Acceptance Criteria**:
- QA agent has comprehensive verification prompt
- QA reads PR URL from gate data
- QA writes result to `qa-result-gate`
- Unit tests cover the prompt builder

**Depends on**: Milestone 1 (gate MCP tools) and Milestone 3 (V2 workflow template with concrete gate IDs)

**Agent type**: coder