Product Roadmap

Agent Brain Product Roadmap

Version: 1.2.0 Last Updated: 2026-02-01 Status: Active

Vision

Agent Brain is a local-first RAG (Retrieval-Augmented Generation) service that indexes documentation and source code, providing intelligent semantic search via API for CLI tools and Claude integration. The core principles are:

Privacy First: Runs entirely on your machine with disk persistence
High Retrieval Quality: Vector + keyword + hybrid search strategies
Operational Ergonomics: CLI-first management experience
Future-Proof Flexibility: Built on LlamaIndex abstractions
Phased Delivery: Incremental value with each release

Phase Summary

Phase	Name	Spec ID	Status	Priority	Transport
1	Core Document RAG	001-005	COMPLETED	-	HTTP
2	BM25 & Hybrid Retrieval	100	COMPLETED	P1	HTTP
3	Source Code Ingestion	101	COMPLETED	P2	HTTP
3.1	Multi-Instance Architecture	109	COMPLETED	P1	HTTP
3.2	C# Code Indexing	110	COMPLETED	P2	HTTP
3.3	Skill Instance Discovery	111	COMPLETED	P2	HTTP
3.4	Agent Brain Naming	112	COMPLETED	P1	HTTP
3.5	GraphRAG Integration	113	COMPLETED	P2	HTTP
3.6	Agent Brain Plugin	114	COMPLETED	P2	HTTP
4	UDS & Claude Plugin Evolution	102	Future	P3	HTTP + UDS
5	Pluggable Model Providers	103	Next	P3	HTTP + UDS
6	PostgreSQL/AlloyDB Backend	104	Future	P4	HTTP + UDS
7	AWS Bedrock Provider	105	Future	P4	HTTP + UDS
8	Google Vertex AI Provider	106	Future	P4	HTTP + UDS

Phase 1: Core Document RAG

Status: COMPLETED Spec Directory: specs/001-005

Delivered Capabilities

Document Ingestion: PDF + Markdown (.md, .mdx) support
Context-Enriched Chunking: Section/heading-aware chunking with Claude summarization
Vector Search: Chroma vector database with OpenAI text-embedding-3-large
Persistent Storage: Disk-based persistence across restarts
REST API: /query, /index, /health endpoints
CLI Tool: agent-brain with status, query, index, reset commands
Claude Skill: Basic integration for conversational document search

Technical Stack

FastAPI + Uvicorn server
LlamaIndex for document processing
ChromaDB for vector storage
OpenAI embeddings + Claude Haiku summarization

Phase 2: BM25 & Hybrid Retrieval

Status: NEXT Spec Directory: specs/100-bm25-hybrid-retrieval Transport: HTTP only

Scope

Add classic keyword search (BM25) alongside vector search, with hybrid retrieval combining both strategies using Reciprocal Rank Fusion (RRF).

Key Features

BM25 Retriever: Classic full-text keyword search
Hybrid Retrieval: Combined vector + BM25 with configurable fusion
Retrieval Mode Selection: API parameter for vector, bm25, or hybrid (default)
Alpha Tuning: Configurable weight between vector and keyword scores
Enhanced Scoring Metadata: Detailed score breakdowns in responses

API Changes

POST /query
{
  "query": "search text",
  "mode": "hybrid",    // "vector" | "bm25" | "hybrid"
  "alpha": 0.5,        // fusion weight (0=BM25 only, 1=vector only)
  "top_k": 10
}

Benefits

Better precision for exact term matching (function names, error codes)
Improved recall through combined retrieval strategies
User control over search behavior

Phase 3: Source Code Ingestion & Unified Corpus

Status: Planned Spec Directory: specs/101-code-ingestion Transport: HTTP

Scope

Enable indexing of source code alongside documentation for unified corpus searches. This is critical for the book generation and corpus use cases.

Key Features

Code Ingestion via CodeSplitter:
- Python (.py)
- TypeScript/JavaScript (.ts, .tsx, .js, .jsx)
Unified Indexing: Vector + BM25 across documents and code
Code Summaries: SummaryExtractor generates natural language descriptions per code chunk
Extended Filters: source_type (doc vs code), language filters in queries
AST-Aware Chunking: Preserves function/class boundaries

API Changes

POST /index
{
  "folder_path": "/path/to/project",
  "include_code": true,
  "languages": ["python", "typescript"]
}

POST /query
{
  "query": "authentication handler",
  "source_type": "code",    // "doc" | "code" | "all"
  "language": "python"
}

Benefits

Single search across documentation and implementation
Code context improves answer quality for technical queries
Enables corpus-based book/tutorial generation

Phase 3.1: Multi-Instance Architecture

Status: COMPLETED Spec Directory: .speckit/features/109-multi-instance-architecture/ Transport: HTTP

Scope

Enable running multiple concurrent Agent Brain instances with per-project isolation, automatic port allocation, and runtime discovery for agent integration.

Key Features

Per-Project Isolation: Each project gets its own server instance with isolated indexes
Auto-Port Allocation: OS-assigned ports prevent conflicts between projects
State Directory: Project state stored in .claude/doc-serve/
Runtime Discovery: runtime.json enables agents and skills to find running instances
Lock File Protocol: Prevents double-start of instances
Project Root Resolution: Consistent detection via git or marker files

New CLI Commands

agent-brain init              # Initialize project for Agent Brain
agent-brain start --daemon    # Start server with auto-port
agent-brain stop              # Stop the server
agent-brain list              # List all running instances

Benefits

Work on multiple projects simultaneously
No port conflicts between projects
State travels with the project (can be version-controlled)
Agents can automatically discover running servers

Phase 3.2: C# Code Indexing

Status: COMPLETED Spec Directory: .speckit/features/110-csharp-code-indexing/ Transport: HTTP

Scope

Add C# language support to the code ingestion pipeline with AST-aware parsing.

Key Features

File Extensions: .cs and .csx (C# scripts)
AST-Aware Chunking: Classes, methods, interfaces, properties, enums
XML Documentation: Extracts /// <summary> comments as metadata
Language Filter: Query with --languages csharp

Benefits

Full .NET ecosystem support
Semantic search across C# codebases
Rich metadata extraction for better search quality

Phase 3.3: Skill Instance Discovery

Status: IN-PROGRESS Spec Directory: .speckit/features/111-skill-instance-discovery/ Transport: HTTP

Scope

Update the Agent Brain Claude Code skill to leverage multi-instance architecture for automatic server discovery and lifecycle management.

Key Features

Auto-Initialization: Skill automatically runs agent-brain init when needed
Server Discovery: Reads runtime.json to find running instances
Auto-Start: Starts server automatically if no instance is running
Status Reporting: Reports port, mode, instance ID, document count
Cross-Agent Sharing: Multiple agents share the same server instance

Benefits

Zero-configuration skill usage
No manual server management required
Seamless multi-agent workflows

Phase 3.4: Agent Brain Naming

Status: COMPLETED Spec Directory: .speckit/features/112-agent-brain-naming/ Transport: HTTP

Scope

Unify branding across all packages from "doc-serve" to "agent-brain" for consistent identity and discoverability.

Key Features

Package Renaming: doc-serve-rag → agent-brain-rag, doc-serve-cli → agent-brain-cli
Command Renaming: doc-serve CLI → agent-brain CLI
Backward Compatibility: Legacy aliases maintained for transition period
Documentation Updates: All references updated to new naming

Benefits

Consistent branding across ecosystem
Improved discoverability
Better alignment with AI agent workflows

Phase 3.5: GraphRAG Integration

Status: COMPLETED Spec Directory: .speckit/features/113-graphrag-integration/ Transport: HTTP

Scope

Add knowledge graph extraction alongside vector search for enhanced entity-centric retrieval and relationship queries.

Key Features

Entity Extraction: Named entities, concepts, and relationships extracted from documents
Property Graph Store: LlamaIndex PropertyGraphStore with configurable backends
Graph-Enhanced Retrieval: Entity disambiguation and relationship traversal
Hybrid Mode: Combines graph context with vector similarity for richer results
Query Modes: vector, graph, hybrid (default)

API Changes

POST /query
{
  "query": "search text",
  "mode": "hybrid",    // "vector" | "graph" | "hybrid"
  "include_graph": true,
  "top_k": 10
}

Benefits

Better handling of entity-centric queries
Relationship discovery across documents
Improved context for complex questions

Phase 3.6: Agent Brain Plugin

Status: COMPLETED Spec Directory: .speckit/features/114-agent-brain-plugin/ Transport: HTTP

Scope

Claude Code plugin providing commands, agents, and skills for seamless Agent Brain integration within Claude Code workflows.

Key Features

Slash Commands:
- /agent-brain:search - Semantic search across indexed content
- /agent-brain:status - Check server status and document count
- /agent-brain:index - Index documents or code
Specialized Agents:
- agent-brain-researcher - Multi-step research with citations
- agent-brain-indexer - Automated indexing workflows
Skill Integration: Updated skill with auto-discovery and lifecycle management

Installation

# Install plugin
claude plugins add agent-brain-plugin

# Or from local path
claude plugins add ./agent-brain-plugin

Benefits

Native Claude Code integration
Autonomous research capabilities
Simplified server lifecycle management

Phase 4: UDS Transport & Claude Plugin Evolution

Status: Future Spec Directory: specs/102-uds-claude-plugin Transport: HTTP + optional UDS

Scope

Add Unix Domain Socket (UDS) transport for lower-latency local communication and evolve the Claude plugin with richer capabilities.

Key Features

UDS Transport: Optional Unix Domain Socket alongside HTTP
Rich Slash Commands:
- /search - General semantic search
- /doc - Documentation-only search
- /code - Code-only search
Server Lifecycle Management: Plugin auto-starts/stops server
Multi-Step Research Agent: Break down complex questions

Benefits

~10-50x latency improvement for local queries via UDS
More intuitive Claude interaction patterns
Autonomous research capabilities

Phase 5: Pluggable Model Providers

Status: Future Spec Directory: specs/103-pluggable-providers Transport: HTTP + UDS

Scope

Full configuration-driven model selection using LlamaIndex abstractions. No code changes required to switch providers.

Supported Embedding Providers

Provider	Models	Notes
OpenAI	text-embedding-3-small/large, ada-002	Default
Ollama	nomic-embed-text, bge, etc.	Local, offline
Cohere	embed-english-v3, embed-multilingual-v3	Via API

Note: Grok and Gemini currently lack public embedding APIs and are not supported for embeddings.

Supported Summarization/LLM Providers

Provider	Models	Notes
Anthropic	Claude Haiku, Sonnet, Opus	Default
OpenAI	GPT-4o, GPT-4o-mini	Via API
Gemini	Flash, Pro	Via API
Grok	Via OpenAI-compatible endpoint	Via API
Ollama	Llama 3, Mistral, Qwen	Local, offline

Configuration Example

# config.yaml
embedding:
  provider: ollama
  model: nomic-embed-text
  params:
    base_url: http://localhost:11434

summarization:
  provider: anthropic
  model: claude-3-5-sonnet-20241022
  params:
    api_key_env: ANTHROPIC_API_KEY

Benefits

Run completely offline with Ollama
Cost optimization with different providers
Enterprise flexibility

Phase 6: PostgreSQL/AlloyDB Backend

Status: Future Spec Directory: specs/104-postgresql-backend Transport: HTTP + UDS

Scope

Optional configuration-driven switch to PostgreSQL (local) or AlloyDB (managed/cloud) as persistent storage backend.

Key Features

pgvector Extension: High-performance vector similarity search
AlloyDB ScaNN Indexes: Google Cloud's optimized vector indexing
Native Full-Text Search: PostgreSQL tsvector/tsquery (BM25-like)
Hybrid Retrieval: hybrid_search=True on PGVectorStore
JSONB Metadata: Rich metadata with GIN indexes

Benefits

Superior scalability for very large corpora
Built-in replication/backup
Transactional consistency
Mature PostgreSQL full-text engine replaces custom BM25

Migration Path

Install llama-index-vector-stores-postgres
Update config.yaml with PostgreSQL connection
Re-index documents (one-time migration)

Phase 7: AWS Bedrock Provider Support

Status: Future Spec Directory: specs/105-aws-bedrock Transport: HTTP + UDS

Scope

Add full support for AWS Bedrock as a pluggable provider for both embeddings and summarization/completion LLMs.

Supported Models

Embeddings:

Amazon Titan Embed Text v1/v2
Cohere Embed English/Multilingual v3

Summarization/LLM:

Claude (via Bedrock)
Titan Text
Meta Llama
Mistral
Cohere Command

Configuration

embedding:
  provider: bedrock
  model: amazon.titan-embed-text-v2
  params:
    region: us-east-1

summarization:
  provider: bedrock
  model: anthropic.claude-3-sonnet

Authentication

Default AWS credentials chain
Profile-based authentication
Explicit keys/region configuration

Benefits

Enterprise-grade security/compliance
Access to high-performance AWS-hosted models
Cost optimization for AWS users

Phase 8: Google Vertex AI Provider Support

Status: Future Spec Directory: specs/106-vertex-ai Transport: HTTP + UDS

Scope

Add full support for Google Vertex AI as a pluggable provider for embeddings and summarization/completion LLMs.

Supported Models

Embeddings:

textembedding-gecko
multimodalembedding

Summarization/LLM:

Gemini 1.5 Flash/Pro
gemini-1.0-pro

Configuration

embedding:
  provider: vertex
  model: textembedding-gecko@003
  params:
    project: my-gcp-project
    location: us-central1

summarization:
  provider: vertex
  model: gemini-3-flash

Authentication

Service account JSON
Application Default Credentials (ADC)
Explicit project/location

Benefits

Integration with Google Cloud ecosystem
Strong multimodal capabilities with Gemini
Enterprise features for GCP users

Corpus/Book Generation Use Cases

Agent Brain enables creating searchable, AI-queryable corpora from large documentation sets, codebases, or book collections. This is a key differentiator for technical content development.

Use Case 1: AWS CDK Documentation Corpus

Problem: AWS CDK has extensive documentation across multiple languages and services. Developers need quick access to specific patterns and configurations.

Solution with Agent Brain:

# Index AWS CDK documentation
agent-brain index ~/aws-cdk-docs/

# Index AWS CDK Python library source (Phase 3)
agent-brain index ~/aws-cdk-python/src/ --include-code

# Query during development
agent-brain query "S3 bucket with lifecycle rules and versioning"

Sample Queries:

"How to create a Lambda function with VPC access?"
"DynamoDB table with global secondary index pattern"
"Cross-stack references best practices"
"EventBridge rule for S3 object creation"

Book Generation Application:

Index AWS CDK source + comprehensive PDF guide
Create corpus for writing AWS CDK tutorials
Claude can cite specific documentation and code examples

Use Case 2: Claude/Anthropic Documentation Corpus

Problem: Building AI applications requires referencing Claude API docs, SDK documentation, and best practices frequently.

Solution with Agent Brain:

# Index Claude documentation
agent-brain index ~/anthropic-docs/

# Index Claude SDK source (Phase 3)
agent-brain index ~/claude-sdk/src/ --include-code

# Query via Claude skill
"How do I implement streaming responses with the Python SDK?"

Sample Queries:

"Claude tool use best practices"
"Prompt caching implementation"
"Error handling for rate limits"
"Vision API usage patterns"

Skill Generation Application:

Index Claude Code documentation
Create corpus for writing Claude Code skills
Skills can reference authoritative source material

Use Case 3: Internal Technical Manual

Problem: Large organizations have extensive internal documentation that teams need to search effectively for onboarding and reference.

Solution with Agent Brain:

# Index internal documentation
agent-brain index ~/company-docs/architecture/
agent-brain index ~/company-docs/onboarding/
agent-brain index ~/company-docs/api-guides/

# Index internal code (Phase 3)
agent-brain index ~/company-monorepo/libs/ --include-code

# Team members query via Claude skill
"What is our authentication flow for mobile apps?"

Benefits:

New team members find answers faster
Reduced dependency on tribal knowledge
Consistent answers across the organization

Use Case 4: Open Source Project Contribution

Problem: Contributing to large open source projects requires understanding existing patterns, conventions, and implementation details.

Solution with Agent Brain:

# Index project documentation
agent-brain index ~/kubernetes/docs/

# Index project source code (Phase 3)
agent-brain index ~/kubernetes/pkg/ --include-code

# Query for contribution patterns
agent-brain query "How are admission controllers implemented?"

Sample Queries:

"Pod scheduling algorithm"
"Custom resource definition validation"
"Controller reconciliation loop pattern"

Benefits:

Faster ramp-up for contributors
Understand conventions before submitting PRs
Find similar implementations to reference

Recommended Next Actions

Implement Phase 2 (BM25 + Hybrid Retrieval) - Immediate retrieval quality improvements
Proceed to Phase 3 (Code Ingestion) - Enables unified corpus searches
Phase 5 for Local-Only - Run completely offline with Ollama
Phase 6 for Scale - PostgreSQL backend for large corpora

Model provider flexibility (Phases 5, 7, 8) and PostgreSQL backend (Phase 6) are cleanly deferred until the core retrieval enhancements are stable.

Appendix: Numbering Convention

001-099: Phase 1 specs (existing, completed)
100-199: Roadmap phases 2-8 (future development)
200+: Reserved for future expansion

See docs/roadmaps/spec-mapping.md for full phase-to-spec mapping.

Navigation

Getting Started

Architecture

Design Docs

Search Guides

Provider Config

Deployment

API

Plugin Commands

Search

Server

Setup

Plugin Agents

Features

Planning

Product-Roadmap

Product Roadmap

Agent Brain Product Roadmap

Vision

Phase Summary

Phase 1: Core Document RAG

Delivered Capabilities

Technical Stack

Phase 2: BM25 & Hybrid Retrieval

Scope

Key Features

API Changes

Benefits

Phase 3: Source Code Ingestion & Unified Corpus

Scope

Key Features

API Changes

Benefits

Phase 3.1: Multi-Instance Architecture

Scope

Key Features

New CLI Commands

Benefits

Phase 3.2: C# Code Indexing

Scope

Key Features

Benefits

Phase 3.3: Skill Instance Discovery

Scope

Key Features

Benefits

Phase 3.4: Agent Brain Naming

Scope

Key Features

Benefits

Phase 3.5: GraphRAG Integration

Scope

Key Features

API Changes

Benefits

Phase 3.6: Agent Brain Plugin

Scope

Key Features

Installation

Benefits

Phase 4: UDS Transport & Claude Plugin Evolution

Scope

Key Features

Benefits

Phase 5: Pluggable Model Providers

Scope

Supported Embedding Providers

Supported Summarization/LLM Providers

Configuration Example

Benefits

Phase 6: PostgreSQL/AlloyDB Backend

Scope

Key Features

Benefits

Migration Path

Phase 7: AWS Bedrock Provider Support

Scope

Supported Models

Configuration

Authentication

Benefits

Phase 8: Google Vertex AI Provider Support

Scope

Supported Models

Configuration

Authentication

Benefits

Corpus/Book Generation Use Cases

Use Case 1: AWS CDK Documentation Corpus

Use Case 2: Claude/Anthropic Documentation Corpus

Use Case 3: Internal Technical Manual

Use Case 4: Open Source Project Contribution

Recommended Next Actions

Appendix: Numbering Convention

Uh oh!

Uh oh!