| last_validated | 2026-03-16 |
|---|---|
This document provides a comprehensive reference for all Agent Brain configuration options, including environment variables, server settings, and per-project configuration.
- Configuration Precedence
- Server Configuration
- Embedding Configuration
- Chunking Configuration
- Query Configuration
- GraphRAG Configuration
- Multi-Instance Configuration
- Storage Configuration
- Strict Mode
- Job Queue Configuration
- Embedding Cache Configuration
- Reranking Configuration
- Per-Project Configuration
- Example Configurations
Settings are resolved in this order (first match wins):
1. Command-line flags: `agent-brain start --port 8080`
2. Environment variables: `export API_PORT=8080`
3. Project config: `.agent-brain/config.json`
4. Global config: `~/.agent-brain/config.json` (future)
5. Built-in defaults: defined in `settings.py`
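The chain above is a simple first-match lookup. The sketch below is illustrative only — the function name and config handling are assumptions, not Agent Brain's actual resolver, and the future global-config step is omitted:

```python
# Hypothetical "first match wins" resolution, mirroring the order above.
import json
import os
from pathlib import Path

def resolve_setting(name: str, cli_value=None, default=None):
    """Return the first value found: CLI flag, env var, project config, default."""
    if cli_value is not None:                       # 1. command-line flag
        return cli_value
    if name.upper() in os.environ:                  # 2. environment variable
        return os.environ[name.upper()]
    project_cfg = Path(".agent-brain/config.json")  # 3. project config
    if project_cfg.exists():
        cfg = json.loads(project_cfg.read_text())
        if name in cfg:
            return cfg[name]
    return default                                  # 5. built-in default

# A CLI flag beats an environment variable:
os.environ["API_PORT"] = "8000"
print(resolve_setting("api_port", cli_value=8080))  # -> 8080
print(resolve_setting("api_port"))                  # -> "8000"
```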
| Variable | Default | Description |
|---|---|---|
| `API_HOST` | `127.0.0.1` | IP address to bind to |
| `API_PORT` | `8000` | Port number (0 = auto-assign) |
| `DEBUG` | `false` | Enable debug mode with auto-reload |
Examples:

```bash
# Bind to all interfaces (accessible from network)
export API_HOST="0.0.0.0"

# Use a specific port
export API_PORT="8080"

# Enable debug mode
export DEBUG="true"
```

CLI Override:

```bash
agent-brain start --host 0.0.0.0 --port 8080 --reload
```

| Mode | Description |
|---|---|
| `project` | Per-project isolated server (default with `--daemon`) |
| `shared` | Single server for multiple projects (future) |
```bash
export AGENT_BRAIN_MODE="project"
```

| Variable | Default | Description |
|---|---|---|
| `OPENAI_API_KEY` | (required) | OpenAI API key |
| `EMBEDDING_MODEL` | `text-embedding-3-large` | Embedding model name |
| `EMBEDDING_DIMENSIONS` | `3072` | Vector dimensions |
| `EMBEDDING_BATCH_SIZE` | `100` | Chunks per API call |
Examples:

```bash
# Required: OpenAI API key
export OPENAI_API_KEY="sk-proj-..."

# Use smaller model for cost savings
export EMBEDDING_MODEL="text-embedding-3-small"
export EMBEDDING_DIMENSIONS="1536"
```

| Variable | Default | Description |
|---|---|---|
| `ANTHROPIC_API_KEY` | (optional) | Anthropic API key |
| `CLAUDE_MODEL` | `claude-haiku-4-5-20251001` | Claude model for summaries |
Examples:

```bash
# Optional: Enable LLM summaries and GraphRAG extraction
export ANTHROPIC_API_KEY="sk-ant-..."
export CLAUDE_MODEL="claude-haiku-4-5-20251001"
```

| Variable | Default | Range | Description |
|---|---|---|---|
| `DEFAULT_CHUNK_SIZE` | `512` | 128-2048 | Target chunk size in tokens |
| `DEFAULT_CHUNK_OVERLAP` | `50` | 0-200 | Overlap between chunks |
| `MAX_CHUNK_SIZE` | `2048` | - | Maximum allowed chunk size |
| `MIN_CHUNK_SIZE` | `128` | - | Minimum allowed chunk size |
Examples:

```bash
# Larger chunks for detailed documents
export DEFAULT_CHUNK_SIZE="800"
export DEFAULT_CHUNK_OVERLAP="100"
```

CLI Override:

```bash
agent-brain index /path --chunk-size 800 --overlap 100
```

Code chunking uses different defaults optimized for source code:

| Setting | Default | Description |
|---|---|---|
| `chunk_lines` | `40` | Target lines per chunk |
| `chunk_lines_overlap` | `15` | Line overlap |
| `max_chars` | `1500` | Maximum characters |
These are set in the CodeChunker class and can be customized programmatically.
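To show how these three settings interact, here is a minimal line-windowing sketch using the defaults above. It is not the actual `CodeChunker` implementation — real code chunkers typically also respect syntactic boundaries:

```python
# Sliding-window chunking: windows of chunk_lines lines, advancing by
# (chunk_lines - chunk_lines_overlap) so consecutive chunks share context,
# with max_chars as a hard character cap per chunk.
def chunk_code(source: str, chunk_lines: int = 40,
               chunk_lines_overlap: int = 15, max_chars: int = 1500):
    lines = source.splitlines()
    step = chunk_lines - chunk_lines_overlap
    chunks, start = [], 0
    while start < len(lines):
        window = lines[start:start + chunk_lines]
        chunks.append("\n".join(window)[:max_chars])  # enforce character cap
        start += step
    return chunks

source = "\n".join(f"line {i}" for i in range(100))
chunks = chunk_code(source)
print(len(chunks))  # 100 lines -> windows starting at 0, 25, 50, 75 -> 4 chunks
```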
| Variable | Default | Range | Description |
|---|---|---|---|
| `DEFAULT_TOP_K` | `5` | 1-50 | Results to return |
| `MAX_TOP_K` | `50` | - | Maximum allowed top_k |
| `DEFAULT_SIMILARITY_THRESHOLD` | `0.7` | 0.0-1.0 | Minimum similarity |
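The two query defaults compose in the obvious way — drop hits below the threshold, then keep the best `top_k`. A hypothetical sketch of that filtering (not Agent Brain's actual query path):

```python
# Keep only hits at or above the similarity threshold, then truncate to top_k.
def filter_results(hits, top_k=5, threshold=0.7):
    """hits: list of (doc_id, similarity) pairs, in any order."""
    kept = [h for h in hits if h[1] >= threshold]
    kept.sort(key=lambda h: h[1], reverse=True)
    return kept[:top_k]

hits = [("a", 0.92), ("b", 0.65), ("c", 0.71), ("d", 0.88)]
print(filter_results(hits, top_k=2, threshold=0.7))  # [('a', 0.92), ('d', 0.88)]
```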
Examples:

```bash
# Return more results by default
export DEFAULT_TOP_K="10"

# Lower threshold for broader matches
export DEFAULT_SIMILARITY_THRESHOLD="0.5"
```

CLI Override:

```bash
agent-brain query "search term" --top-k 10 --threshold 0.5
```

| Mode | Alpha | Description |
|---|---|---|
| `bm25` | N/A | Keyword-only search |
| `vector` | N/A | Semantic-only search |
| `hybrid` | 0.5 | BM25 + Vector fusion |
| `graph` | N/A | Graph traversal |
| `multi` | N/A | All three with RRF |
Alpha Parameter (hybrid mode only):

- `1.0`: Pure vector search
- `0.5`: Balanced (default)
- `0.0`: Pure BM25 search
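The alpha values above describe a standard linear interpolation of the two scores. A sketch, assuming both scores are already normalized to [0, 1] — this is the usual formulation, not necessarily Agent Brain's exact fusion code:

```python
# Alpha-weighted fusion: alpha=1.0 is pure vector, alpha=0.0 is pure BM25.
def hybrid_score(vector_score: float, bm25_score: float, alpha: float = 0.5) -> float:
    return alpha * vector_score + (1 - alpha) * bm25_score

print(hybrid_score(0.9, 0.3, alpha=1.0))  # 0.9 (pure vector)
print(hybrid_score(0.9, 0.3, alpha=0.0))  # 0.3 (pure BM25)
print(hybrid_score(0.9, 0.3, alpha=0.5))  # ~0.6 (balanced)
```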
| Variable | Default | Description |
|---|---|---|
| `ENABLE_GRAPH_INDEX` | `false` | Master switch for GraphRAG |

Example:

```bash
# Enable GraphRAG
export ENABLE_GRAPH_INDEX="true"
```

| Variable | Default | Options | Description |
|---|---|---|---|
| `GRAPH_STORE_TYPE` | `simple` | `simple`, `kuzu` | Storage backend |
| `GRAPH_INDEX_PATH` | `./graph_index` | Path | Storage location |
Examples:

```bash
# Use default in-memory store (development)
export GRAPH_STORE_TYPE="simple"

# Use Kuzu for production
export GRAPH_STORE_TYPE="kuzu"
```

| Variable | Default | Description |
|---|---|---|
| `GRAPH_USE_CODE_METADATA` | `true` | Extract from AST metadata |
| `GRAPH_USE_LLM_EXTRACTION` | `true` | Use LLM for extraction |
| `GRAPH_EXTRACTION_MODEL` | `claude-haiku-4-5` | LLM model for extraction |
| `GRAPH_MAX_TRIPLETS_PER_CHUNK` | `10` | Limit per chunk |
Examples:

```bash
# Code-only extraction (no LLM costs)
export GRAPH_USE_CODE_METADATA="true"
export GRAPH_USE_LLM_EXTRACTION="false"

# Full extraction with fast model
export GRAPH_USE_LLM_EXTRACTION="true"
export GRAPH_EXTRACTION_MODEL="claude-haiku-4-5"
```

| Variable | Default | Range | Description |
|---|---|---|---|
| `GRAPH_TRAVERSAL_DEPTH` | `2` | 1-4 | Hops to traverse |
| `GRAPH_RRF_K` | `60` | 20-100 | RRF constant |
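To see what the RRF constant does, here is the standard Reciprocal Rank Fusion formula that `multi` mode's "all three with RRF" description refers to — each result list contributes `1 / (k + rank)` per document, so a lower `k` amplifies the top ranks. This is the textbook algorithm, sketched independently of Agent Brain's code:

```python
# Reciprocal Rank Fusion over several ranked lists of document IDs.
def rrf_fuse(ranked_lists, k=60):
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25   = ["a", "b", "c"]
vector = ["b", "a", "d"]
graph  = ["b", "d", "a"]
print(rrf_fuse([bm25, vector, graph], k=60))  # ['b', 'a', 'd', 'c']
```

`b` wins because it ranks first in two of the three lists, even though it is never the sole top result by score.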
Examples:

```bash
# Deeper traversal for complex relationships
export GRAPH_TRAVERSAL_DEPTH="3"

# Adjust RRF fusion (lower = more weight on top ranks)
export GRAPH_RRF_K="40"
```

| Variable | Default | Description |
|---|---|---|
| `AGENT_BRAIN_STATE_DIR` | None | Override state directory location |
| `AGENT_BRAIN_MODE` | `project` | Instance mode: `project` or `shared` |
Legacy aliases:
`DOC_SERVE_STATE_DIR` is still read by `provider_config.py` as a fallback if `AGENT_BRAIN_STATE_DIR` is not set.
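The fallback chain amounts to a short-circuit lookup; a minimal sketch (the function name and default are illustrative, not the `provider_config.py` API):

```python
# Prefer AGENT_BRAIN_STATE_DIR; fall back to the legacy DOC_SERVE_STATE_DIR.
import os

def state_dir(default=".agent-brain"):
    return (os.environ.get("AGENT_BRAIN_STATE_DIR")
            or os.environ.get("DOC_SERVE_STATE_DIR")
            or default)

os.environ.pop("AGENT_BRAIN_STATE_DIR", None)
os.environ["DOC_SERVE_STATE_DIR"] = "/legacy/state"
print(state_dir())  # -> /legacy/state
```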
Examples:

```bash
# Explicit state directory
export AGENT_BRAIN_STATE_DIR="/path/to/.agent-brain"

# Project mode (default)
export AGENT_BRAIN_MODE="project"
```

```bash
# Start with explicit state directory
agent-brain start --state-dir /path/to/.agent-brain

# Start with project directory (auto-resolves state)
agent-brain start --project-dir /path/to/project
```

Agent Brain caches query results in memory to avoid redundant storage lookups for repeated identical queries. The cache is invalidated automatically whenever a reindex job completes, ensuring freshness after every index update.
| Variable | Default | Description |
|---|---|---|
| `QUERY_CACHE_TTL` | `300` | Time-to-live for cached query results in seconds. Use a higher value for mostly-static indexes, lower for frequently updated ones. |
| `QUERY_CACHE_MAX_SIZE` | `256` | Maximum number of query results to cache. When full, least-recently-used entries are evicted by `TTLCache`. |
Notes:

- Query cache is in-memory only (no disk persistence); the cache is empty after a server restart.
- `graph` and `multi` query modes are never cached (non-deterministic LLM extraction).
- Cache is automatically invalidated on every successful reindex job completion.
- Cache statistics are visible in the `/health/status` response under the `query_cache` key.
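The combination of TTL expiry and LRU eviction mentioned above can be reproduced with the standard library. This is a behavioral sketch of the policy (the real implementation uses `TTLCache`; the class here is hypothetical):

```python
# TTL + LRU cache: entries expire after `ttl` seconds, and when the cache is
# full the least-recently-used entry is evicted first.
import time
from collections import OrderedDict

class QueryCache:
    def __init__(self, max_size=256, ttl=300.0):
        self.max_size, self.ttl = max_size, ttl
        self._data = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        item = self._data.get(key)
        if item is None or item[0] < time.monotonic():
            self._data.pop(key, None)   # expired or missing
            return None
        self._data.move_to_end(key)     # mark as recently used
        return item[1]

    def put(self, key, value):
        self._data[key] = (time.monotonic() + self.ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict least-recently-used

    def invalidate(self):
        self._data.clear()  # called after every successful reindex

cache = QueryCache(max_size=2, ttl=300)
cache.put("q1", ["hit1"])
cache.put("q2", ["hit2"])
cache.get("q1")            # touch q1 so q2 becomes least-recently-used
cache.put("q3", ["hit3"])  # over capacity: evicts q2
print(cache.get("q2"))     # -> None
```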
```bash
# Example: longer TTL for a static documentation server
QUERY_CACHE_TTL=3600
QUERY_CACHE_MAX_SIZE=512
```

| Variable | Default | Description |
|---|---|---|
| `AGENT_BRAIN_STRICT_MODE` | `false` | Fail on critical validation errors instead of logging warnings |
When enabled, the server will raise errors for validation issues that would otherwise be logged as warnings (e.g., invalid chunk sizes, missing required metadata).
```bash
# Enable strict validation
export AGENT_BRAIN_STRICT_MODE="true"
```

Controls the background job queue used for indexing operations.

| Variable | Default | Description |
|---|---|---|
| `AGENT_BRAIN_MAX_QUEUE` | `100` | Maximum number of pending jobs in the queue |
| `AGENT_BRAIN_JOB_TIMEOUT` | `7200` | Job timeout in seconds (default: 2 hours) |
| `AGENT_BRAIN_MAX_RETRIES` | `3` | Maximum retry attempts for failed jobs |
| `AGENT_BRAIN_CHECKPOINT_INTERVAL` | `50` | Save progress checkpoint every N files |
| `AGENT_BRAIN_WATCH_DEBOUNCE_SECONDS` | `30` | File watcher debounce delay in seconds |
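Debouncing means a reindex is scheduled only after the watcher has been quiet for the full debounce window, so a burst of file saves triggers one job instead of many. A hypothetical sketch of that behavior (not the actual watcher code):

```python
# Restartable timer: every new event cancels the pending one, so the action
# fires only after `delay` seconds with no further events.
import threading

class Debouncer:
    def __init__(self, delay: float, action):
        self.delay, self.action = delay, action
        self._timer = None

    def trigger(self):
        if self._timer is not None:
            self._timer.cancel()  # restart the quiet-period timer
        self._timer = threading.Timer(self.delay, self.action)
        self._timer.start()

events = []
d = Debouncer(0.05, lambda: events.append("reindex"))
for _ in range(10):   # a burst of file changes...
    d.trigger()
d._timer.join()       # ...produces a single reindex
print(events)         # -> ['reindex']
```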
Examples:

```bash
# Increase queue size for large projects
export AGENT_BRAIN_MAX_QUEUE="500"

# Longer timeout for very large codebases
export AGENT_BRAIN_JOB_TIMEOUT="14400"

# More frequent checkpoints
export AGENT_BRAIN_CHECKPOINT_INTERVAL="25"

# Shorter debounce for faster file-watch response
export AGENT_BRAIN_WATCH_DEBOUNCE_SECONDS="10"
```

Controls the two-tier (memory + disk) embedding cache that avoids redundant OpenAI API calls for previously seen content.

| Variable | Default | Description |
|---|---|---|
| `EMBEDDING_CACHE_MAX_DISK_MB` | `500` | Maximum disk cache size in megabytes |
| `EMBEDDING_CACHE_MAX_MEM_ENTRIES` | `1000` | Maximum in-memory LRU cache entries |
| `EMBEDDING_CACHE_PERSIST_STATS` | `false` | Persist hit/miss statistics across server restarts |
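The two-tier design checks memory first, then disk, and only calls the API on a full miss — which is why the disk tier survives restarts while the memory tier does not. A sketch of that lookup order, with an assumed content-hash key and JSON-on-disk layout (Agent Brain's actual storage format may differ):

```python
# Two-tier embedding cache: in-memory dict backed by per-entry files on disk,
# keyed by a SHA-256 hash of the chunk text.
import hashlib
import json
import tempfile
from pathlib import Path

class EmbeddingCache:
    def __init__(self, cache_dir, max_mem_entries: int = 1000):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.mem = {}
        self.max_mem_entries = max_mem_entries

    @staticmethod
    def _key(text: str) -> str:
        return hashlib.sha256(text.encode()).hexdigest()

    def get(self, text: str):
        key = self._key(text)
        if key in self.mem:                     # tier 1: memory hit
            return self.mem[key]
        path = self.cache_dir / f"{key}.json"
        if path.exists():                       # tier 2: disk hit
            vec = json.loads(path.read_text())
            self.mem[key] = vec                 # promote back to memory
            return vec
        return None                             # miss -> caller embeds via API

    def put(self, text: str, vector):
        key = self._key(text)
        if len(self.mem) < self.max_mem_entries:
            self.mem[key] = vector
        (self.cache_dir / f"{key}.json").write_text(json.dumps(vector))

cache = EmbeddingCache(tempfile.mkdtemp())
cache.put("hello world", [0.1, 0.2, 0.3])
cache.mem.clear()                    # simulate a restart: memory tier lost
print(cache.get("hello world"))      # -> [0.1, 0.2, 0.3] (served from disk)
```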
Examples:

```bash
# Larger disk cache for big repos
export EMBEDDING_CACHE_MAX_DISK_MB="2000"

# Larger in-memory cache
export EMBEDDING_CACHE_MAX_MEM_ENTRIES="5000"

# Persist cache statistics for monitoring
export EMBEDDING_CACHE_PERSIST_STATS="true"
```

Controls the optional two-stage reranking pipeline that improves search relevance by using a cross-encoder model to rescore initial retrieval results.

| Variable | Default | Description |
|---|---|---|
| `ENABLE_RERANKING` | `false` | Master switch for reranking |
| `RERANKER_PROVIDER` | `sentence-transformers` | Reranker backend (`sentence-transformers` or `ollama`) |
| `RERANKER_MODEL` | `cross-encoder/ms-marco-MiniLM-L-6-v2` | Cross-encoder model name |
| `RERANKER_TOP_K_MULTIPLIER` | `10` | Stage 1 retrieves `top_k * multiplier` candidates |
| `RERANKER_MAX_CANDIDATES` | `100` | Maximum Stage 1 candidates (caps the multiplier) |
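The multiplier and cap combine as `min(top_k * multiplier, max_candidates)` for the Stage 1 pool. A sketch of the two-stage flow with a stand-in scoring function in place of the cross-encoder:

```python
# Stage 1: take a widened candidate pool from the initial retrieval order.
# Stage 2: rescore the pool with a (stand-in) cross-encoder and keep top_k.
def rerank(query, candidates, score_fn, top_k=5, multiplier=10, max_candidates=100):
    n = min(top_k * multiplier, max_candidates)   # Stage 1 pool size
    pool = candidates[:n]
    scored = sorted(pool, key=lambda doc: score_fn(query, doc), reverse=True)
    return scored[:top_k]                         # Stage 2 survivors

docs = [f"doc-{i}" for i in range(300)]
# Stand-in scorer: pretend higher-numbered docs are more relevant.
result = rerank("q", docs, lambda q, d: int(d.split("-")[1]),
                top_k=3, multiplier=50, max_candidates=100)
print(result)  # top_k*multiplier=150 is capped at 100, so the best of
               # docs[:100] win: ['doc-99', 'doc-98', 'doc-97']
```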
Examples:

```bash
# Enable reranking
export ENABLE_RERANKING="true"

# Use a different cross-encoder model
export RERANKER_MODEL="cross-encoder/ms-marco-TinyBERT-L-2-v2"

# Retrieve more candidates for reranking
export RERANKER_TOP_K_MULTIPLIER="20"
export RERANKER_MAX_CANDIDATES="200"
```

Agent Brain supports multiple storage backends:

- `chroma` (default)
- `postgres`

Selection order (first match wins):

1. `AGENT_BRAIN_STORAGE_BACKEND` environment variable
2. `storage.backend` in `config.yaml`
3. Built-in default (`chroma`)
Example (`config.yaml`):

```yaml
storage:
  backend: "postgres"  # or "chroma"
```

Environment override:

```bash
export AGENT_BRAIN_STORAGE_BACKEND="postgres"
```

When `storage.backend` is `postgres`, configure connection and pool settings under `storage.postgres`:
```yaml
storage:
  backend: "postgres"
  postgres:
    host: "localhost"
    port: 5432
    database: "agent_brain"
    user: "agent_brain"
    password: "agent_brain_dev"
    pool_size: 10
    pool_max_overflow: 10
    pool_timeout: 30
    language: "english"
    hnsw_m: 16
    hnsw_ef_construction: 64
    debug: false
```

PostgreSQL connection and pool keys:
| Key | Type | Default | Description |
|---|---|---|---|
| `host` | string | `"localhost"` | Database host |
| `port` | int | `5432` | Database port |
| `database` | string | `"agent_brain"` | Database name |
| `user` | string | `"agent_brain"` | Database user |
| `password` | string | `""` | Database password |
| `pool_size` | int | `10` | Connections to keep in the pool |
| `pool_max_overflow` | int | `10` | Extra connections above `pool_size` |
| `pool_timeout` | int | `30` | Seconds to wait for a pool connection before timeout |
| `language` | string | `"english"` | Full-text search language |
| `hnsw_m` | int | `16` | HNSW index M parameter |
| `hnsw_ef_construction` | int | `64` | HNSW construction parameter |
| `debug` | bool | `false` | Enable SQLAlchemy debug logging |
Connection string override: `DATABASE_URL` overrides the host/user/password/database/port connection settings, but pool settings and HNSW tuning remain in YAML.
```bash
export DATABASE_URL="postgresql+asyncpg://agent_brain:agent_brain_dev@localhost:5432/agent_brain"
```

| Variable | Default | Description |
|---|---|---|
| `CHROMA_PERSIST_DIR` | `./chroma_db` | ChromaDB storage location |
| `COLLECTION_NAME` | `agent_brain_collection` | Collection name |
Examples:

```bash
# Custom storage location
export CHROMA_PERSIST_DIR="/data/agent-brain/vectors"
```

| Variable | Default | Description |
|---|---|---|
| `BM25_INDEX_PATH` | `./bm25_index` | BM25 index storage |
Examples:

```bash
# Custom BM25 storage
export BM25_INDEX_PATH="/data/agent-brain/bm25"
```

Create `.agent-brain/config.json` for project-specific settings:

```json
{
  "bind_host": "127.0.0.1",
  "port_range_start": 8000,
  "port_range_end": 8100,
  "auto_port": true,
  "chunk_size": 512,
  "chunk_overlap": 50,
  "exclude_patterns": [
    "**/node_modules/**",
    "**/__pycache__/**",
    "**/.venv/**",
    "**/venv/**",
    "**/.git/**",
    "**/dist/**",
    "**/build/**",
    "**/target/**"
  ]
}
```

| Field | Type | Default | Description |
|---|---|---|---|
| `bind_host` | string | `"127.0.0.1"` | Server bind address |
| `port_range_start` | integer | `8000` | Start of port range for auto-port |
| `port_range_end` | integer | `8100` | End of port range for auto-port |
| `auto_port` | boolean | `true` | Automatically find available port |
| `chunk_size` | integer | `512` | Chunk size in tokens |
| `chunk_overlap` | integer | `50` | Chunk overlap in tokens |
| `exclude_patterns` | array | (see example) | Glob patterns to exclude from indexing |
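An exclude check reduces to matching each path against the pattern list. The sketch below uses `fnmatch`, where `*` also matches path separators, so it approximates the `**` globs above; real glob engines differ on edge cases (e.g. a match at the path root), and Agent Brain's matcher may behave differently there:

```python
# Return True if the path matches any exclude pattern.
from fnmatch import fnmatch

EXCLUDE = ["**/node_modules/**", "**/__pycache__/**", "**/.git/**"]

def is_excluded(path: str) -> bool:
    return any(fnmatch(path, pat) for pat in EXCLUDE)

print(is_excluded("web/node_modules/react/index.js"))  # True
print(is_excluded("src/app/main.py"))                  # False
```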
Minimal configuration for local development:

```bash
# .env
OPENAI_API_KEY=sk-proj-...
DEBUG=true
DEFAULT_TOP_K=10
DEFAULT_SIMILARITY_THRESHOLD=0.5
```

Full configuration for production deployment:
```bash
# .env
OPENAI_API_KEY=sk-proj-...
ANTHROPIC_API_KEY=sk-ant-...

# Server
API_HOST=127.0.0.1
API_PORT=8000
DEBUG=false

# Embedding
EMBEDDING_MODEL=text-embedding-3-large
EMBEDDING_DIMENSIONS=3072
EMBEDDING_BATCH_SIZE=100

# Query defaults
DEFAULT_TOP_K=5
DEFAULT_SIMILARITY_THRESHOLD=0.7

# Storage
CHROMA_PERSIST_DIR=/data/agent-brain/vectors
BM25_INDEX_PATH=/data/agent-brain/bm25

# GraphRAG (optional)
ENABLE_GRAPH_INDEX=true
GRAPH_STORE_TYPE=kuzu
GRAPH_INDEX_PATH=/data/agent-brain/graph
GRAPH_USE_CODE_METADATA=true
GRAPH_USE_LLM_EXTRACTION=true
GRAPH_EXTRACTION_MODEL=claude-haiku-4-5
GRAPH_TRAVERSAL_DEPTH=2

# Embedding cache
EMBEDDING_CACHE_MAX_DISK_MB=2000
EMBEDDING_CACHE_MAX_MEM_ENTRIES=5000

# Job queue
AGENT_BRAIN_MAX_QUEUE=500
AGENT_BRAIN_JOB_TIMEOUT=7200
AGENT_BRAIN_CHECKPOINT_INTERVAL=50

# Reranking (optional)
ENABLE_RERANKING=true
RERANKER_MODEL=cross-encoder/ms-marco-MiniLM-L-6-v2
```

Configuration optimized for source code:
```bash
# .env
OPENAI_API_KEY=sk-proj-...

# Larger chunks for code
DEFAULT_CHUNK_SIZE=800
DEFAULT_CHUNK_OVERLAP=100

# GraphRAG for code relationships
ENABLE_GRAPH_INDEX=true
GRAPH_USE_CODE_METADATA=true
GRAPH_USE_LLM_EXTRACTION=false  # Code metadata is sufficient
```

Project config (`.agent-brain/config.json`):

```json
{
  "bind_host": "127.0.0.1",
  "port_range_start": 8000,
  "port_range_end": 8100,
  "auto_port": true,
  "chunk_size": 800,
  "chunk_overlap": 100,
  "exclude_patterns": [
    "**/node_modules/**",
    "**/__pycache__/**",
    "**/dist/**",
    "**/build/**"
  ]
}
```

Configuration for pure documentation search:
```bash
# .env
OPENAI_API_KEY=sk-proj-...

# Smaller chunks for precise documentation
DEFAULT_CHUNK_SIZE=400
DEFAULT_CHUNK_OVERLAP=50

# No GraphRAG needed
ENABLE_GRAPH_INDEX=false
```

Project config:

```json
{
  "bind_host": "127.0.0.1",
  "port_range_start": 8000,
  "port_range_end": 8100,
  "auto_port": true,
  "chunk_size": 400,
  "chunk_overlap": 50,
  "exclude_patterns": [
    "**/node_modules/**",
    "**/__pycache__/**",
    "**/.git/**"
  ]
}
```

Minimize API costs:
```bash
# .env
OPENAI_API_KEY=sk-proj-...

# Use smaller embedding model
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_DIMENSIONS=1536

# Disable LLM extraction
GRAPH_USE_LLM_EXTRACTION=false
```

Agent Brain searches for `.env` files in this order:

1. Current working directory: `./.env`
2. Server package directory: `agent-brain-server/.env`
3. Project root: `../.env`

Best Practice: Place `.env` in your project root and add it to `.gitignore`.
```bash
# View server status (includes some config)
agent-brain status

# View all environment variables
env | grep -E "(OPENAI|ANTHROPIC|EMBEDDING|GRAPH|CHUNK|API)"
```

```bash
# Start server and check health
agent-brain start --daemon
curl http://127.0.0.1:8000/health

# Index test documents
agent-brain index ./docs --include-code

# Test query
agent-brain query "test" --mode hybrid
```

- API Reference - REST API documentation
- Deployment Guide - Production deployment
- Architecture Overview - System design