-
Notifications
You must be signed in to change notification settings - Fork 1
plan: space feature end-to-end happy path — single space, single task #991
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 8 commits
a4452c1
012f73c
be52a52
45666b3
9b89e68
4385150
f386ec2
95a2afd
74dffae
de84898
1ed0c47
966f3dd
02e948a
4a7ee93
3e0911b
7d07ad4
96564a0
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Large diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,168 @@ | ||
| # Milestone 1: Gate Data Store and New Gate Types | ||
|
|
||
| ## Goal and Scope | ||
|
|
||
| Implement the core Gate + Channel architecture: gates with persistent data stores that agents can read/write, plus three new gate types (PR Gate, Aggregate Gate, enhanced Human Gate). This is the foundation that all other milestones build on. | ||
|
|
||
| ## Architecture | ||
|
|
||
| ### Gate Data Store | ||
|
|
||
| Every gate in the workflow has a persistent data store (JSON blob in SQLite). Agents interact with gate data via MCP tools. The gate's `evaluate()` method checks its own data store to determine if the gate passes. | ||
|
|
||
| ```typescript | ||
| // In packages/shared/src/types/space.ts | ||
| interface GateData { | ||
| [key: string]: unknown; | ||
| } | ||
|
|
||
| interface Gate { | ||
| id: string; | ||
| type: 'pr' | 'human' | 'aggregate' | 'task_result' | 'always'; | ||
| data: GateData; // persisted to SQLite | ||
| config: GateConfig; // static config (quorum count, etc.) | ||
| } | ||
|
|
||
| // Discriminated union for gate configs | ||
| type GateConfig = | ||
| | { type: 'always' } | ||
| | { type: 'task_result'; expression: string } | ||
| | { type: 'human' } | ||
| | { type: 'pr' } | ||
| | { type: 'aggregate'; quorum: number }; | ||
| ``` | ||
|
|
||
| ### Gate Evaluation | ||
|
|
||
| Each gate type has a simple evaluate function that checks its own data: | ||
|
|
||
| - **always**: Always passes → `true` | ||
| - **task_result**: `data.result === config.expression` | ||
| - **human**: `data.approved === true` | ||
| - **pr**: `data.prUrl != null && data.prUrl !== ''` | ||
| - **aggregate**: `Object.values(data.votes || {}).filter(v => v === 'approve').length >= config.quorum` | ||
|
|
||
| ## Tasks | ||
|
|
||
| ### Task 1.1: Extend Gate Types and Add Data Store Schema | ||
|
|
||
| **Description**: Extend the existing gate type system in `packages/shared/src/types/space.ts` to support the new gate types (`pr`, `aggregate`) and add the `data` field to the gate interface. Update the SQLite schema to persist gate data. | ||
|
|
||
| **Subtasks**: | ||
| 1. Audit the existing gate types in `packages/shared/src/types/space.ts` — currently supports `always`, `human`, `condition`, `task_result` | ||
| 2. Add new gate types: `pr` and `aggregate` | ||
| 3. Add `GateData` type and `data: GateData` field to the gate/channel interface | ||
| 4. Add `GateConfig` discriminated union with per-type config (e.g., `quorum` for aggregate) | ||
| 5. Create a dedicated `gate_data` table in SQLite keyed by `(run_id, gate_id)` with a JSON `data` column. This is preferred over a JSON column on the channel/gate record because: (a) gate data changes frequently during a run while channel definitions are static, (b) gate data is per-run while channels are per-workflow template, (c) a separate table allows atomic reads/writes without deserializing from the channel definition, and (d) concurrent writes (e.g., 3 reviewers voting) benefit from row-level granularity. | ||
| 6. Add `allowedWriterRoles: string[]` to the gate definition schema (static, per-gate, persisted alongside the gate type and config). Example: Plan PR Gate → `['planner']`, Aggregate Gate → `['reviewer']`. | ||
| 7. Add `failureReason` optional field to `SpaceWorkflowRun` interface: `failureReason?: 'humanRejected' | 'maxIterationsReached' | 'nodeTimeout' | 'agentCrash'`. All failure scenarios use the existing `'needs_attention'` status with this field, avoiding a cross-cutting `WorkflowRunStatus` type change. | ||
| 8. Add migration or schema update for existing databases | ||
| 9. Unit tests: type validation, schema creation, data persistence round-trip, gate_data table CRUD | ||
|
|
||
| **Acceptance Criteria**: | ||
| - Gate types include `pr`, `aggregate`, `human`, `task_result`, `always` | ||
| - Gate data is persisted to SQLite and survives daemon restart | ||
| - Gate config supports per-type configuration (quorum for aggregate) | ||
| - Unit tests verify persistence round-trip | ||
|
|
||
| **Depends on**: nothing | ||
|
|
||
| **Agent type**: coder | ||
|
|
||
| --- | ||
|
|
||
| ### Task 1.2: Implement Gate Evaluators for New Types | ||
|
|
||
| **Description**: Implement `evaluate()` logic for the new gate types (PR Gate, Aggregate Gate) and enhance the existing Human Gate evaluator to use the data store instead of workflow run config flags. | ||
|
|
||
| **Subtasks**: | ||
| 1. Refactor `ChannelGateEvaluator` to read from the gate's `data` store instead of workflow run config | ||
| 2. Implement PR Gate evaluator: passes when `data.prUrl` is non-empty | ||
| 3. Implement Aggregate Gate evaluator: passes when `Object.values(data.votes).filter(v => v === 'approve').length >= config.quorum` | ||
| 4. Update Human Gate evaluator: passes when `data.approved === true` (reads from gate data, not workflow run config) | ||
| 5. Ensure existing `task_result` and `always` evaluators still work | ||
| 6. Unit tests for each gate evaluator with various data states | ||
|
|
||
| **Acceptance Criteria**: | ||
| - PR Gate blocks until PR URL is written to gate data | ||
| - Aggregate Gate blocks until quorum is reached | ||
| - Human Gate reads from gate data store (not workflow run config flags) | ||
| - All existing gate types continue to work | ||
| - Unit tests cover pass/fail conditions for each gate type | ||
|
|
||
| **Depends on**: Task 1.1 | ||
|
|
||
| **Agent type**: coder | ||
|
|
||
| --- | ||
|
|
||
| ### Task 1.3: Implement `read_gate`, `write_gate`, and `list_gates` MCP Tools | ||
|
|
||
| **Description**: Create MCP tools that allow node agents to discover, read from, and write to gate data stores. These tools are added to the `node-agent-tools` MCP server so all workflow agents can interact with gates. | ||
|
|
||
| **Subtasks**: | ||
| 1. Add `list_gates` MCP tool to `node-agent-tools`: | ||
| - Parameters: none (uses the current workflow run context from the MCP server config) | ||
| - Returns: array of `{ gateId, type, description, allowedWriterRoles, currentData }` for all gates in the run | ||
| - Agents call this at session start to discover available gates and their IDs | ||
| 2. Add `read_gate` MCP tool to `node-agent-tools`: | ||
| - Parameters: `{ gateId: string }` | ||
| - Returns: the gate's current `data` object from the `gate_data` table | ||
| - Agents use this to read PR URLs, review votes, approval status, etc. | ||
| 3. Add `write_gate` MCP tool to `node-agent-tools`: | ||
| - Parameters: `{ gateId: string, data: Record<string, unknown> }` (merge semantics — new keys are added, existing keys are updated) | ||
| - **Authorization check**: reads the calling agent's `nodeRole` from the MCP server config, compares against the gate's `allowedWriterRoles` list (from gate definition). If unauthorized, returns error: `"Permission denied: role '{role}' cannot write to gate '{gateId}'"` | ||
| - Persists the updated data to the `gate_data` table in SQLite | ||
| - Triggers a re-evaluation of the gate (which may unblock the channel) | ||
| 4. Wire the tools into `TaskAgentManager` so they have access to the workflow run context (runId, gate definitions) | ||
| 5. **Workflow context injection**: When `TaskAgentManager.spawnSubSession()` creates a node agent, inject a `workflowContext` block into the task message containing: the node's upstream/downstream gate IDs, gate types, and human-readable descriptions (e.g., "code-pr-gate: write your PR URL here after creating the PR"). This provides gate IDs without requiring agents to call `list_gates` first. | ||
| 6. **Aggregate Gate vote keys**: Use `nodeId` (not `agentId`) as the vote key in `data.votes`. This prevents collision if an agent is re-spawned after a crash — the nodeId stays the same, and the re-spawned agent overwrites the previous vote cleanly. | ||
| 7. Unit tests: list_gates returns correct gates, read/write round-trip, permission enforcement (authorized + unauthorized), gate re-evaluation on write, vote key collision handling | ||
|
|
||
| **Acceptance Criteria**: | ||
| - Agents can discover gates via `list_gates` MCP tool (returns IDs, types, descriptions) | ||
| - Agents can read gate data via `read_gate` MCP tool | ||
| - Agents can write gate data via `write_gate` MCP tool | ||
| - Writing to a gate triggers re-evaluation (may unblock downstream channel) | ||
| - Permission model prevents unauthorized gate writes (returns clear error message) | ||
| - Workflow context injection provides gate IDs in the task message | ||
| - Aggregate Gate votes use nodeId as key (not agentId) | ||
| - Unit tests verify all tool behaviors including permission enforcement | ||
|
|
||
| **Depends on**: Task 1.2 | ||
|
|
||
| **Agent type**: coder | ||
|
|
||
| --- | ||
|
|
||
| ### Task 1.4: Integrate Gate Data Store with Channel Router | ||
|
|
||
| **Description**: Update the `ChannelRouter` to use the gate data store for routing decisions. When a gate's data changes (via `write_gate`), the router should re-evaluate the gate and potentially activate the next node. | ||
|
|
||
| **Subtasks**: | ||
| 1. Update `ChannelRouter.deliverMessage()` to call `gate.evaluate()` using the gate's data store | ||
| 2. Add a `onGateDataChanged(gateId)` method that triggers re-evaluation of the associated channel | ||
| 3. When a gate transitions from blocked → passed, activate the target node (call `TaskAgentManager.activateNode()`) | ||
| 4. Handle the Aggregate Gate case: multiple agents write to the same gate (3 reviewers voting). Each write triggers re-evaluation, but only the final vote that meets quorum unblocks the channel. | ||
| 5. **Implement gate data reset on cyclic traversal**: When the `ChannelRouter` traverses a cyclic channel (e.g., reviewer rejection → Coding, or QA failure → Coding), it must **reset the gate data of all downstream gates** between the cycle target (Coding) and the cycle source. Specifically: | ||
| - Aggregate Gate votes reset to `{ votes: {} }` — all 3 reviewers must re-vote from scratch | ||
| - Code PR Gate data is preserved (the PR URL doesn't change) | ||
| - Task Result Gate (QA) resets to `{}` | ||
| - The reset is atomic with the cyclic channel traversal (same transaction) | ||
| - This prevents stale approve votes from a previous round short-circuiting the re-review | ||
| 6. Ensure gate data changes are persisted before evaluation (no race conditions). Use SQLite transactions for atomic read-evaluate-write cycles. | ||
| 7. Handle concurrent writes to the same gate (e.g., 3 reviewers voting near-simultaneously): serialize writes via SQLite's write lock, re-evaluate after each write. | ||
| 8. Unit tests: gate transition triggers node activation, aggregate gate with incremental votes, concurrent write handling, **gate data reset on cyclic traversal** (verify votes cleared, verify PR data preserved) | ||
|
|
||
| **Acceptance Criteria**: | ||
| - Channel router uses gate data store for all routing decisions | ||
| - Gate data changes trigger re-evaluation and potential node activation | ||
| - Aggregate gate handles incremental votes correctly (only unblocks on quorum) | ||
| - **Cyclic channel traversal resets downstream gate data** (Aggregate votes cleared, PR data preserved) | ||
| - No race conditions between data persistence and evaluation (SQLite transactions) | ||
| - Concurrent writes to the same gate are serialized correctly | ||
| - Unit tests cover all routing scenarios including reset behavior | ||
|
|
||
| **Depends on**: Task 1.3 | ||
|
|
||
| **Agent type**: coder | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,107 @@ | ||
| # Milestone 2: Enhanced Node Agent Prompts | ||
|
|
||
| ## Goal and Scope | ||
|
|
||
| Upgrade the system prompts for planner, coder, reviewer, and QA node agents in the Space system. Prompts must include git workflow, PR management, review posting, and — critically — instructions for interacting with gate data stores via `list_gates`/`read_gate`/`write_gate` MCP tools. | ||
|
|
||
| **Dependency note**: M2 depends on both M1 (gate MCP tools) and M3 (V2 workflow template). Prompts reference specific gate IDs from the V2 workflow (e.g., "Code PR Gate", "Aggregate Gate"). **Implement M3 before M2** so the concrete gate IDs exist. The prompts use the gate IDs injected via workflow context (M1 Task 1.3 subtask 5) and reference them by the human-readable descriptions from the gate definitions. | ||
|
|
||
| **Gate discovery pattern**: All prompts include a standard preamble: "At session start, call `list_gates` to discover available gates and their IDs. Your task message also includes a `workflowContext` block with your upstream/downstream gate IDs." | ||
|
|
||
| ## Tasks | ||
|
|
||
| ### Task 2.1: Enhance Coder Node Agent System Prompt | ||
|
|
||
| **Description**: Update `buildCustomAgentSystemPrompt()` in `packages/daemon/src/lib/space/agents/custom-agent.ts` to include full git workflow instructions, PR creation, and gate data writing — mirroring the Room system's `buildCoderSystemPrompt()`. | ||
|
|
||
| **Subtasks**: | ||
| 1. Read `packages/daemon/src/lib/room/agents/coder-agent.ts` (`buildCoderSystemPrompt()`) and identify all prompt sections | ||
| 2. Add bypass markers section (RESEARCH_ONLY, VERIFICATION_COMPLETE, etc.) for role 'coder' | ||
| 3. Add review feedback handling: how to fetch GitHub reviews, verify feedback, push fixes | ||
| 4. Add PR creation flow with duplicate prevention (`gh pr list --head`) | ||
| 5. **Add gate interaction instructions**: After creating a PR, the coder must call `write_gate` to write PR data (`{ prUrl, prNumber, branch }`) to the Code PR Gate. This unblocks the reviewer channel. | ||
|
Owner
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. P2: Implicit dependency on M3Task 2.1 references "the Code PR Gate" and Task 2.3 references "the Aggregate Gate" -- but these gates only exist in the V2 workflow template defined in M3 (Task 3.1). If M2 is implemented first, the prompt builders would reference gate IDs that don't exist yet. Action: Either reorder M2 after M3, or add a note that M2 prompts use generic references (e.g., "the downstream gate for this step") rather than specific V2 gate names. M3 would then provide the concrete wiring. |
||
| 6. Add instructions for reading upstream gate data: the coder should call `read_gate` on the Plan PR Gate to understand the plan before coding | ||
|
|
||
| **Acceptance Criteria**: | ||
| - Coder agents produce same quality git/PR workflow as Room coder agents | ||
| - Coder writes PR data to gate after creating PR (triggers reviewer activation) | ||
| - Coder reads plan gate data to understand the plan | ||
| - Unit tests pass for updated prompt builder | ||
|
|
||
| **Depends on**: Milestone 1 (gate MCP tools) and Milestone 3 (V2 workflow template with concrete gate IDs) | ||
|
|
||
| **Agent type**: coder | ||
|
|
||
| --- | ||
|
|
||
| ### Task 2.2: Enhance Planner Node Agent System Prompt | ||
|
|
||
| **Description**: Create a specialized planner prompt that includes plan document creation, PR management, and gate data writing. | ||
|
|
||
| **Subtasks**: | ||
| 1. Add `buildPlannerNodeAgentPrompt()` in `custom-agent.ts` for role 'planner' | ||
| 2. Include plan document creation instructions (explore codebase, write plan, create PR) | ||
| 3. **Add gate interaction instructions**: After creating a plan PR, the planner must call `write_gate` to write PR data to the Plan PR Gate. This unblocks the plan review channel. | ||
| 4. Add instructions for `send_message` to communicate with plan reviewers | ||
| 5. Ensure the prompt works with `injectWorkflowContext` flag | ||
|
|
||
| **Acceptance Criteria**: | ||
| - Planner creates plan documents on feature branches with PRs | ||
| - Planner writes PR data to Plan PR Gate (triggers plan review activation) | ||
| - Unit tests cover the new prompt builder | ||
|
|
||
| **Depends on**: Milestone 1 (gate MCP tools) | ||
|
|
||
| **Agent type**: coder | ||
|
|
||
| --- | ||
|
|
||
| ### Task 2.3: Enhance Reviewer Node Agent System Prompt | ||
|
|
||
| **Description**: Create a specialized reviewer prompt for posting PR reviews with severity classification and writing votes to the Aggregate Gate. | ||
|
|
||
| **Subtasks**: | ||
| 1. Add `buildReviewerNodeAgentPrompt()` in `custom-agent.ts` for role 'reviewer' | ||
| 2. Include PR review process: read changed files, evaluate correctness/completeness/security | ||
| 3. Add review posting via REST API (`GH_PAGER=cat gh api repos/{owner}/{repo}/pulls/{pr}/reviews`) | ||
| 4. Add structured output format: `---REVIEW_POSTED---` block with URL, recommendation, severity counts | ||
| 5. **Add gate interaction instructions**: Reviewer reads the Code PR Gate (via `read_gate`) to find the PR URL, then after reviewing, writes its vote to the Aggregate Gate via `write_gate` using its **nodeId** as the vote key: `{ votes: { [nodeId]: 'approve' | 'reject' } }`. Using nodeId (not agentId) prevents collision if the agent is re-spawned after a crash. | ||
| 6. When 3 reviewers all write 'approve', the Aggregate Gate passes and QA is activated | ||
| 7. **Add edge case guidance**: Instruct the reviewer to check current vote state via `read_gate` on the Aggregate Gate before voting. If a reviewer is killed mid-review and re-spawned, it should check if it already voted and either update or confirm its vote. | ||
|
|
||
| **Acceptance Criteria**: | ||
| - Reviewer reads PR URL from gate data | ||
| - Reviewer posts proper PR reviews with severity classification | ||
| - Reviewer writes vote to Aggregate Gate | ||
| - Unit tests cover the prompt builder | ||
|
|
||
| **Depends on**: Milestone 1 (gate MCP tools) | ||
|
|
||
| **Agent type**: coder | ||
|
|
||
| --- | ||
|
|
||
| ### Task 2.4: Create QA Agent System Prompt | ||
|
|
||
| **Description**: Build a specialized system prompt for the QA agent that checks test coverage, runs tests, verifies CI status, and writes results to the Task Result Gate. | ||
|
|
||
| **Subtasks**: | ||
| 1. Add `buildQaNodeAgentPrompt()` in `custom-agent.ts` for role 'qa' | ||
| 2. Include instructions for: | ||
| - Test command detection (package.json scripts, Makefile targets, fallback commands) | ||
| - Checking CI status via `gh pr checks` or `gh pr view --json statusCheckRollup` | ||
| - Verifying PR mergeability via `gh pr view --json mergeable,mergeStateStatus` | ||
| - Checking for merge conflicts | ||
| 3. **Add gate interaction instructions**: QA reads the Code PR Gate to find the PR, then writes result to Task Result Gate via `write_gate({ result: 'passed' | 'failed', summary: '...' })` | ||
| 4. Include structured output format for QA results | ||
| 5. Add `gh` CLI auth verification instructions | ||
|
|
||
| **Acceptance Criteria**: | ||
| - QA agent has comprehensive verification prompt | ||
| - QA reads PR URL from gate data | ||
| - QA writes result to Task Result Gate | ||
| - Unit tests cover the prompt builder | ||
|
|
||
| **Depends on**: Milestone 1 (gate MCP tools) | ||
|
|
||
| **Agent type**: coder | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
P3: Stale PR description file list
The PR description lists files
01-enhanced-node-agent-prompts.mdthrough08-bug-fixes-and-hardening.md(old 8-milestone structure), but the actual PR contains01-gate-data-store.mdthrough09-bug-fixes-and-hardening.md(revised 9-milestone structure). The file list needs updating.