| last_validated | 2026-03-16 |
|---|---|
This document provides a comprehensive reference for all Agent Brain configuration options, including environment variables, server settings, and per-project configuration.
- Configuration Precedence
- Server Configuration
- Embedding Configuration
- Chunking Configuration
- Query Configuration
- GraphRAG Configuration
- Multi-Instance Configuration
- Storage Configuration
- Strict Mode
- Job Queue Configuration
- Embedding Cache Configuration
- Reranking Configuration
- Per-Project Configuration
- Example Configurations
Settings are resolved in this order (first match wins):
1. Command-line flags: `agent-brain start --port 8080`
2. Environment variables: `export API_PORT=8080`
3. Project config: `.agent-brain/config.json`
4. Global config: `~/.agent-brain/config.json` (future)
5. Built-in defaults: defined in `settings.py`
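The chain above is a simple first-match lookup. The sketch below is illustrative only — the function name and config handling are assumptions, not Agent Brain's actual resolver, and the future global-config step is omitted:

```python
# Hypothetical "first match wins" resolution, mirroring the order above.
import json
import os
from pathlib import Path

def resolve_setting(name: str, cli_value=None, default=None):
    """Return the first value found: CLI flag, env var, project config, default."""
    if cli_value is not None:                       # 1. command-line flag
        return cli_value
    if name.upper() in os.environ:                  # 2. environment variable
        return os.environ[name.upper()]
    project_cfg = Path(".agent-brain/config.json")  # 3. project config
    if project_cfg.exists():
        cfg = json.loads(project_cfg.read_text())
        if name in cfg:
            return cfg[name]
    return default                                  # 5. built-in default

# A CLI flag beats an environment variable:
os.environ["API_PORT"] = "8000"
print(resolve_setting("api_port", cli_value=8080))  # -> 8080
print(resolve_setting("api_port"))                  # -> "8000"
```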
| Variable | Default | Description |
|---|---|---|
| `API_HOST` | `127.0.0.1` | IP address to bind to |
| `API_PORT` | `8000` | Port number (0 = auto-assign) |
| `DEBUG` | `false` | Enable debug mode with auto-reload |
Examples:

```bash
# Bind to all interfaces (accessible from network)
export API_HOST="0.0.0.0"

# Use a specific port
export API_PORT="8080"

# Enable debug mode
export DEBUG="true"
```

CLI Override:

```bash
agent-brain start --host 0.0.0.0 --port 8080 --reload
```

| Mode | Description |
|---|---|
| `project` | Per-project isolated server (default with `--daemon`) |
| `shared` | Single server for multiple projects (future) |
```bash
export AGENT_BRAIN_MODE="project"
```

| Variable | Default | Description |
|---|---|---|
| `OPENAI_API_KEY` | (required) | OpenAI API key |
| `EMBEDDING_MODEL` | `text-embedding-3-large` | Embedding model name |
| `EMBEDDING_DIMENSIONS` | `3072` | Vector dimensions |
| `EMBEDDING_BATCH_SIZE` | `100` | Chunks per API call |
Examples:

```bash
# Required: OpenAI API key
export OPENAI_API_KEY="sk-proj-..."

# Use smaller model for cost savings
export EMBEDDING_MODEL="text-embedding-3-small"
export EMBEDDING_DIMENSIONS="1536"
```

| Variable | Default | Description |
|---|---|---|
| `ANTHROPIC_API_KEY` | (optional) | Anthropic API key |
| `CLAUDE_MODEL` | `claude-haiku-4-5-20251001` | Claude model for summaries |
Examples:

```bash
# Optional: Enable LLM summaries and GraphRAG extraction
export ANTHROPIC_API_KEY="sk-ant-..."
export CLAUDE_MODEL="claude-haiku-4-5-20251001"
```

| Variable | Default | Range | Description |
|---|---|---|---|
| `DEFAULT_CHUNK_SIZE` | `512` | 128-2048 | Target chunk size in tokens |
| `DEFAULT_CHUNK_OVERLAP` | `50` | 0-200 | Overlap between chunks |
| `MAX_CHUNK_SIZE` | `2048` | - | Maximum allowed chunk size |
| `MIN_CHUNK_SIZE` | `128` | - | Minimum allowed chunk size |
Examples:

```bash
# Larger chunks for detailed documents
export DEFAULT_CHUNK_SIZE="800"
export DEFAULT_CHUNK_OVERLAP="100"
```

CLI Override:

```bash
agent-brain index /path --chunk-size 800 --overlap 100
```

Code chunking uses different defaults optimized for source code:

| Setting | Default | Description |
|---|---|---|
| `chunk_lines` | `40` | Target lines per chunk |
| `chunk_lines_overlap` | `15` | Line overlap |
| `max_chars` | `1500` | Maximum characters |
These are set in the CodeChunker class and can be customized programmatically.
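To show how these three settings interact, here is a minimal line-windowing sketch using the defaults above. It is not the actual `CodeChunker` implementation — real code chunkers typically also respect syntactic boundaries:

```python
# Sliding-window chunking: windows of chunk_lines lines, advancing by
# (chunk_lines - chunk_lines_overlap) so consecutive chunks share context,
# with max_chars as a hard character cap per chunk.
def chunk_code(source: str, chunk_lines: int = 40,
               chunk_lines_overlap: int = 15, max_chars: int = 1500):
    lines = source.splitlines()
    step = chunk_lines - chunk_lines_overlap
    chunks, start = [], 0
    while start < len(lines):
        window = lines[start:start + chunk_lines]
        chunks.append("\n".join(window)[:max_chars])  # enforce character cap
        start += step
    return chunks

source = "\n".join(f"line {i}" for i in range(100))
chunks = chunk_code(source)
print(len(chunks))  # 100 lines -> windows starting at 0, 25, 50, 75 -> 4 chunks
```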
| Variable | Default | Range | Description |
|---|---|---|---|
| `DEFAULT_TOP_K` | `5` | 1-50 | Results to return |
| `MAX_TOP_K` | `50` | - | Maximum allowed top_k |
| `DEFAULT_SIMILARITY_THRESHOLD` | `0.7` | 0.0-1.0 | Minimum similarity |
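The two query defaults compose in the obvious way — drop hits below the threshold, then keep the best `top_k`. A hypothetical sketch of that filtering (not Agent Brain's actual query path):

```python
# Keep only hits at or above the similarity threshold, then truncate to top_k.
def filter_results(hits, top_k=5, threshold=0.7):
    """hits: list of (doc_id, similarity) pairs, in any order."""
    kept = [h for h in hits if h[1] >= threshold]
    kept.sort(key=lambda h: h[1], reverse=True)
    return kept[:top_k]

hits = [("a", 0.92), ("b", 0.65), ("c", 0.71), ("d", 0.88)]
print(filter_results(hits, top_k=2, threshold=0.7))  # [('a', 0.92), ('d', 0.88)]
```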
Examples:

```bash
# Return more results by default
export DEFAULT_TOP_K="10"

# Lower threshold for broader matches
export DEFAULT_SIMILARITY_THRESHOLD="0.5"
```

CLI Override:

```bash
agent-brain query "search term" --top-k 10 --threshold 0.5
```

| Mode | Alpha | Description |
|---|---|---|
| `bm25` | N/A | Keyword-only search |
| `vector` | N/A | Semantic-only search |
| `hybrid` | 0.5 | BM25 + Vector fusion |
| `graph` | N/A | Graph traversal |
| `multi` | N/A | All three with RRF |
Alpha Parameter (hybrid mode only):

- `1.0`: Pure vector search
- `0.5`: Balanced (default)
- `0.0`: Pure BM25 search
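The alpha values above describe a standard linear interpolation of the two scores. A sketch, assuming both scores are already normalized to [0, 1] — this is the usual formulation, not necessarily Agent Brain's exact fusion code:

```python
# Alpha-weighted fusion: alpha=1.0 is pure vector, alpha=0.0 is pure BM25.
def hybrid_score(vector_score: float, bm25_score: float, alpha: float = 0.5) -> float:
    return alpha * vector_score + (1 - alpha) * bm25_score

print(hybrid_score(0.9, 0.3, alpha=1.0))  # 0.9 (pure vector)
print(hybrid_score(0.9, 0.3, alpha=0.0))  # 0.3 (pure BM25)
print(hybrid_score(0.9, 0.3, alpha=0.5))  # ~0.6 (balanced)
```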
| Variable | Default | Description |
|---|---|---|
| `ENABLE_GRAPH_INDEX` | `false` | Master switch for GraphRAG |

Example:

```bash
# Enable GraphRAG
export ENABLE_GRAPH_INDEX="true"
```

| Variable | Default | Options | Description |
|---|---|---|---|
| `GRAPH_STORE_TYPE` | `simple` | `simple`, `kuzu` | Storage backend |
| `GRAPH_INDEX_PATH` | `./graph_index` | Path | Storage location |
Examples:

```bash
# Use default in-memory store (development)
export GRAPH_STORE_TYPE="simple"

# Use Kuzu for production
export GRAPH_STORE_TYPE="kuzu"
```

| Variable | Default | Description |
|---|---|---|
| `GRAPH_USE_CODE_METADATA` | `true` | Extract from AST metadata |
| `GRAPH_USE_LLM_EXTRACTION` | `true` | Use LLM for extraction |
| `GRAPH_EXTRACTION_MODEL` | `claude-haiku-4-5` | LLM model for extraction |
| `GRAPH_MAX_TRIPLETS_PER_CHUNK` | `10` | Limit per chunk |
Examples:

```bash
# Code-only extraction (no LLM costs)
export GRAPH_USE_CODE_METADATA="true"
export GRAPH_USE_LLM_EXTRACTION="false"

# Full extraction with fast model
export GRAPH_USE_LLM_EXTRACTION="true"
export GRAPH_EXTRACTION_MODEL="claude-haiku-4-5"
```

| Variable | Default | Range | Description |
|---|---|---|---|
| `GRAPH_TRAVERSAL_DEPTH` | `2` | 1-4 | Hops to traverse |
| `GRAPH_RRF_K` | `60` | 20-100 | RRF constant |
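To see what the RRF constant does, here is the standard Reciprocal Rank Fusion formula that `multi` mode's "all three with RRF" description refers to — each result list contributes `1 / (k + rank)` per document, so a lower `k` amplifies the top ranks. This is the textbook algorithm, sketched independently of Agent Brain's code:

```python
# Reciprocal Rank Fusion over several ranked lists of document IDs.
def rrf_fuse(ranked_lists, k=60):
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25   = ["a", "b", "c"]
vector = ["b", "a", "d"]
graph  = ["b", "d", "a"]
print(rrf_fuse([bm25, vector, graph], k=60))  # ['b', 'a', 'd', 'c']
```

`b` wins because it ranks first in two of the three lists, even though it is never the sole top result by score.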
Examples:

```bash
# Deeper traversal for complex relationships
export GRAPH_TRAVERSAL_DEPTH="3"

# Adjust RRF fusion (lower = more weight on top ranks)
export GRAPH_RRF_K="40"
```

| Variable | Default | Description |
|---|---|---|
| `AGENT_BRAIN_STATE_DIR` | None | Override state directory location |
| `AGENT_BRAIN_MODE` | `project` | Instance mode: `project` or `shared` |
Legacy aliases:
`DOC_SERVE_STATE_DIR` is still read by `provider_config.py` as a fallback if `AGENT_BRAIN_STATE_DIR` is not set.
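The fallback chain amounts to a short-circuit lookup; a minimal sketch (the function name and default are illustrative, not the `provider_config.py` API):

```python
# Prefer AGENT_BRAIN_STATE_DIR; fall back to the legacy DOC_SERVE_STATE_DIR.
import os

def state_dir(default=".agent-brain"):
    return (os.environ.get("AGENT_BRAIN_STATE_DIR")
            or os.environ.get("DOC_SERVE_STATE_DIR")
            or default)

os.environ.pop("AGENT_BRAIN_STATE_DIR", None)
os.environ["DOC_SERVE_STATE_DIR"] = "/legacy/state"
print(state_dir())  # -> /legacy/state
```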
Examples:

```bash
# Explicit state directory
export AGENT_BRAIN_STATE_DIR="/path/to/.agent-brain"

# Project mode (default)
export AGENT_BRAIN_MODE="project"
```

```bash
# Start with explicit state directory
agent-brain start --state-dir /path/to/.agent-brain

# Start with project directory (auto-resolves state)
agent-brain start --project-dir /path/to/project
```

Agent Brain caches query results in memory to avoid redundant storage lookups for repeated identical queries. The cache is invalidated automatically whenever a reindex job completes, ensuring freshness after every index update.
| Variable | Default | Description |
|---|---|---|
| `QUERY_CACHE_TTL` | `300` | Time-to-live for cached query results in seconds. Use a higher value for mostly-static indexes, lower for frequently updated ones. |
| `QUERY_CACHE_MAX_SIZE` | `256` | Maximum number of query results to cache. When full, least-recently-used entries are evicted by `TTLCache`. |
Notes:

- Query cache is in-memory only (no disk persistence); the cache is empty after a server restart.
- `graph` and `multi` query modes are never cached (non-deterministic LLM extraction).
- Cache is automatically invalidated on every successful reindex job completion.
- Cache statistics are visible in the `/health/status` response under the `query_cache` key.
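The combination of TTL expiry and LRU eviction mentioned above can be reproduced with the standard library. This is a behavioral sketch of the policy (the real implementation uses `TTLCache`; the class here is hypothetical):

```python
# TTL + LRU cache: entries expire after `ttl` seconds, and when the cache is
# full the least-recently-used entry is evicted first.
import time
from collections import OrderedDict

class QueryCache:
    def __init__(self, max_size=256, ttl=300.0):
        self.max_size, self.ttl = max_size, ttl
        self._data = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        item = self._data.get(key)
        if item is None or item[0] < time.monotonic():
            self._data.pop(key, None)   # expired or missing
            return None
        self._data.move_to_end(key)     # mark as recently used
        return item[1]

    def put(self, key, value):
        self._data[key] = (time.monotonic() + self.ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict least-recently-used

    def invalidate(self):
        self._data.clear()  # called after every successful reindex

cache = QueryCache(max_size=2, ttl=300)
cache.put("q1", ["hit1"])
cache.put("q2", ["hit2"])
cache.get("q1")            # touch q1 so q2 becomes least-recently-used
cache.put("q3", ["hit3"])  # over capacity: evicts q2
print(cache.get("q2"))     # -> None
```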
```bash
# Example: longer TTL for a static documentation server
QUERY_CACHE_TTL=3600
QUERY_CACHE_MAX_SIZE=512
```

| Variable | Default | Description |
|---|---|---|
| `AGENT_BRAIN_STRICT_MODE` | `false` | Fail on critical validation errors instead of logging warnings |
When enabled, the server will raise errors for validation issues that would otherwise be logged as warnings (e.g., invalid chunk sizes, missing required metadata).
```bash
# Enable strict validation
export AGENT_BRAIN_STRICT_MODE="true"
```

Controls the background job queue used for indexing operations.

| Variable | Default | Description |
|---|---|---|
| `AGENT_BRAIN_MAX_QUEUE` | `100` | Maximum number of pending jobs in the queue |
| `AGENT_BRAIN_JOB_TIMEOUT` | `7200` | Job timeout in seconds (default: 2 hours) |
| `AGENT_BRAIN_MAX_RETRIES` | `3` | Maximum retry attempts for failed jobs |
| `AGENT_BRAIN_CHECKPOINT_INTERVAL` | `50` | Save progress checkpoint every N files |
| `AGENT_BRAIN_WATCH_DEBOUNCE_SECONDS` | `30` | File watcher debounce delay in seconds |
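Debouncing means a reindex is scheduled only after the watcher has been quiet for the full debounce window, so a burst of file saves triggers one job instead of many. A hypothetical sketch of that behavior (not the actual watcher code):

```python
# Restartable timer: every new event cancels the pending one, so the action
# fires only after `delay` seconds with no further events.
import threading

class Debouncer:
    def __init__(self, delay: float, action):
        self.delay, self.action = delay, action
        self._timer = None

    def trigger(self):
        if self._timer is not None:
            self._timer.cancel()  # restart the quiet-period timer
        self._timer = threading.Timer(self.delay, self.action)
        self._timer.start()

events = []
d = Debouncer(0.05, lambda: events.append("reindex"))
for _ in range(10):   # a burst of file changes...
    d.trigger()
d._timer.join()       # ...produces a single reindex
print(events)         # -> ['reindex']
```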
Examples:

```bash
# Increase queue size for large projects
export AGENT_BRAIN_MAX_QUEUE="500"

# Longer timeout for very large codebases
export AGENT_BRAIN_JOB_TIMEOUT="14400"

# More frequent checkpoints
export AGENT_BRAIN_CHECKPOINT_INTERVAL="25"

# Shorter debounce for faster file-watch response
export AGENT_BRAIN_WATCH_DEBOUNCE_SECONDS="10"
```

Controls the two-tier (memory + disk) embedding cache that avoids redundant OpenAI API calls for previously seen content.

| Variable | Default | Description |
|---|---|---|
| `EMBEDDING_CACHE_MAX_DISK_MB` | `500` | Maximum disk cache size in megabytes |
| `EMBEDDING_CACHE_MAX_MEM_ENTRIES` | `1000` | Maximum in-memory LRU cache entries |
| `EMBEDDING_CACHE_PERSIST_STATS` | `false` | Persist hit/miss statistics across server restarts |
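The two-tier design checks memory first, then disk, and only calls the API on a full miss — which is why the disk tier survives restarts while the memory tier does not. A sketch of that lookup order, with an assumed content-hash key and JSON-on-disk layout (Agent Brain's actual storage format may differ):

```python
# Two-tier embedding cache: in-memory dict backed by per-entry files on disk,
# keyed by a SHA-256 hash of the chunk text.
import hashlib
import json
import tempfile
from pathlib import Path

class EmbeddingCache:
    def __init__(self, cache_dir, max_mem_entries: int = 1000):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.mem = {}
        self.max_mem_entries = max_mem_entries

    @staticmethod
    def _key(text: str) -> str:
        return hashlib.sha256(text.encode()).hexdigest()

    def get(self, text: str):
        key = self._key(text)
        if key in self.mem:                     # tier 1: memory hit
            return self.mem[key]
        path = self.cache_dir / f"{key}.json"
        if path.exists():                       # tier 2: disk hit
            vec = json.loads(path.read_text())
            self.mem[key] = vec                 # promote back to memory
            return vec
        return None                             # miss -> caller embeds via API

    def put(self, text: str, vector):
        key = self._key(text)
        if len(self.mem) < self.max_mem_entries:
            self.mem[key] = vector
        (self.cache_dir / f"{key}.json").write_text(json.dumps(vector))

cache = EmbeddingCache(tempfile.mkdtemp())
cache.put("hello world", [0.1, 0.2, 0.3])
cache.mem.clear()                    # simulate a restart: memory tier lost
print(cache.get("hello world"))      # -> [0.1, 0.2, 0.3] (served from disk)
```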
Examples:

```bash
# Larger disk cache for big repos
export EMBEDDING_CACHE_MAX_DISK_MB="2000"

# Larger in-memory cache
export EMBEDDING_CACHE_MAX_MEM_ENTRIES="5000"

# Persist cache statistics for monitoring
export EMBEDDING_CACHE_PERSIST_STATS="true"
```

Controls the optional two-stage reranking pipeline that improves search relevance by using a cross-encoder model to rescore initial retrieval results.

| Variable | Default | Description |
|---|---|---|
| `ENABLE_RERANKING` | `false` | Master switch for reranking |
| `RERANKER_PROVIDER` | `sentence-transformers` | Reranker backend (`sentence-transformers` or `ollama`) |
| `RERANKER_MODEL` | `cross-encoder/ms-marco-MiniLM-L-6-v2` | Cross-encoder model name |
| `RERANKER_TOP_K_MULTIPLIER` | `10` | Stage 1 retrieves `top_k * multiplier` candidates |
| `RERANKER_MAX_CANDIDATES` | `100` | Maximum Stage 1 candidates (caps the multiplier) |
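The multiplier and cap combine as `min(top_k * multiplier, max_candidates)` for the Stage 1 pool. A sketch of the two-stage flow with a stand-in scoring function in place of the cross-encoder:

```python
# Stage 1: take a widened candidate pool from the initial retrieval order.
# Stage 2: rescore the pool with a (stand-in) cross-encoder and keep top_k.
def rerank(query, candidates, score_fn, top_k=5, multiplier=10, max_candidates=100):
    n = min(top_k * multiplier, max_candidates)   # Stage 1 pool size
    pool = candidates[:n]
    scored = sorted(pool, key=lambda doc: score_fn(query, doc), reverse=True)
    return scored[:top_k]                         # Stage 2 survivors

docs = [f"doc-{i}" for i in range(300)]
# Stand-in scorer: pretend higher-numbered docs are more relevant.
result = rerank("q", docs, lambda q, d: int(d.split("-")[1]),
                top_k=3, multiplier=50, max_candidates=100)
print(result)  # top_k*multiplier=150 is capped at 100, so the best of
               # docs[:100] win: ['doc-99', 'doc-98', 'doc-97']
```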
Examples:

```bash
# Enable reranking
export ENABLE_RERANKING="true"

# Use a different cross-encoder model
export RERANKER_MODEL="cross-encoder/ms-marco-TinyBERT-L-2-v2"

# Retrieve more candidates for reranking
export RERANKER_TOP_K_MULTIPLIER="20"
export RERANKER_MAX_CANDIDATES="200"
```

Agent Brain supports multiple storage backends:

- `chroma` (default)
- `postgres`

Selection order (first match wins):

1. `AGENT_BRAIN_STORAGE_BACKEND` environment variable
2. `storage.backend` in `config.yaml`
3. Built-in default (`chroma`)
Example (`config.yaml`):

```yaml
storage:
  backend: "postgres"  # or "chroma"
```

Environment override:

```bash
export AGENT_BRAIN_STORAGE_BACKEND="postgres"
```

When `storage.backend` is `postgres`, configure connection and pool settings under `storage.postgres`:
```yaml
storage:
  backend: "postgres"
  postgres:
    host: "localhost"
    port: 5432
    database: "agent_brain"
    user: "agent_brain"
    password: "agent_brain_dev"
    pool_size: 10
    pool_max_overflow: 10
    pool_timeout: 30
    language: "english"
    hnsw_m: 16
    hnsw_ef_construction: 64
    debug: false
```

PostgreSQL connection and pool keys:
| Key | Type | Default | Description |
|---|---|---|---|
| `host` | string | `"localhost"` | Database host |
| `port` | int | `5432` | Database port |
| `database` | string | `"agent_brain"` | Database name |
| `user` | string | `"agent_brain"` | Database user |
| `password` | string | `""` | Database password |
| `pool_size` | int | `10` | Connections to keep in the pool |
| `pool_max_overflow` | int | `10` | Extra connections above `pool_size` |
| `pool_timeout` | int | `30` | Seconds to wait for a pool connection before timeout |
| `language` | string | `"english"` | Full-text search language |
| `hnsw_m` | int | `16` | HNSW index M parameter |
| `hnsw_ef_construction` | int | `64` | HNSW construction parameter |
| `debug` | bool | `false` | Enable SQLAlchemy debug logging |
Connection string override: `DATABASE_URL` overrides the host/user/password/database/port connection settings, but pool settings and HNSW tuning remain in YAML.
```bash
export DATABASE_URL="postgresql+asyncpg://agent_brain:agent_brain_dev@localhost:5432/agent_brain"
```

| Variable | Default | Description |
|---|---|---|
| `CHROMA_PERSIST_DIR` | `./chroma_db` | ChromaDB storage location |
| `COLLECTION_NAME` | `agent_brain_collection` | Collection name |
Examples:

```bash
# Custom storage location
export CHROMA_PERSIST_DIR="/data/agent-brain/vectors"
```

| Variable | Default | Description |
|---|---|---|
| `BM25_INDEX_PATH` | `./bm25_index` | BM25 index storage |
Examples:

```bash
# Custom BM25 storage
export BM25_INDEX_PATH="/data/agent-brain/bm25"
```

Create `.agent-brain/config.json` for project-specific settings:

```json
{
  "bind_host": "127.0.0.1",
  "port_range_start": 8000,
  "port_range_end": 8100,
  "auto_port": true,
  "chunk_size": 512,
  "chunk_overlap": 50,
  "exclude_patterns": [
    "**/node_modules/**",
    "**/__pycache__/**",
    "**/.venv/**",
    "**/venv/**",
    "**/.git/**",
    "**/dist/**",
    "**/build/**",
    "**/target/**"
  ]
}
```

| Field | Type | Default | Description |
|---|---|---|---|
| `bind_host` | string | `"127.0.0.1"` | Server bind address |
| `port_range_start` | integer | `8000` | Start of port range for auto-port |
| `port_range_end` | integer | `8100` | End of port range for auto-port |
| `auto_port` | boolean | `true` | Automatically find available port |
| `chunk_size` | integer | `512` | Chunk size in tokens |
| `chunk_overlap` | integer | `50` | Chunk overlap in tokens |
| `exclude_patterns` | array | (see example) | Glob patterns to exclude from indexing |
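An exclude check reduces to matching each path against the pattern list. The sketch below uses `fnmatch`, where `*` also matches path separators, so it approximates the `**` globs above; real glob engines differ on edge cases (e.g. a match at the path root), and Agent Brain's matcher may behave differently there:

```python
# Return True if the path matches any exclude pattern.
from fnmatch import fnmatch

EXCLUDE = ["**/node_modules/**", "**/__pycache__/**", "**/.git/**"]

def is_excluded(path: str) -> bool:
    return any(fnmatch(path, pat) for pat in EXCLUDE)

print(is_excluded("web/node_modules/react/index.js"))  # True
print(is_excluded("src/app/main.py"))                  # False
```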
Minimal configuration for local development:

```bash
# .env
OPENAI_API_KEY=sk-proj-...
DEBUG=true
DEFAULT_TOP_K=10
DEFAULT_SIMILARITY_THRESHOLD=0.5
```

Full configuration for production deployment:
```bash
# .env
OPENAI_API_KEY=sk-proj-...
ANTHROPIC_API_KEY=sk-ant-...

# Server
API_HOST=127.0.0.1
API_PORT=8000
DEBUG=false

# Embedding
EMBEDDING_MODEL=text-embedding-3-large
EMBEDDING_DIMENSIONS=3072
EMBEDDING_BATCH_SIZE=100

# Query defaults
DEFAULT_TOP_K=5
DEFAULT_SIMILARITY_THRESHOLD=0.7

# Storage
CHROMA_PERSIST_DIR=/data/agent-brain/vectors
BM25_INDEX_PATH=/data/agent-brain/bm25

# GraphRAG (optional)
ENABLE_GRAPH_INDEX=true
GRAPH_STORE_TYPE=kuzu
GRAPH_INDEX_PATH=/data/agent-brain/graph
GRAPH_USE_CODE_METADATA=true
GRAPH_USE_LLM_EXTRACTION=true
GRAPH_EXTRACTION_MODEL=claude-haiku-4-5
GRAPH_TRAVERSAL_DEPTH=2

# Embedding cache
EMBEDDING_CACHE_MAX_DISK_MB=2000
EMBEDDING_CACHE_MAX_MEM_ENTRIES=5000

# Job queue
AGENT_BRAIN_MAX_QUEUE=500
AGENT_BRAIN_JOB_TIMEOUT=7200
AGENT_BRAIN_CHECKPOINT_INTERVAL=50

# Reranking (optional)
ENABLE_RERANKING=true
RERANKER_MODEL=cross-encoder/ms-marco-MiniLM-L-6-v2
```

Configuration optimized for source code:
```bash
# .env
OPENAI_API_KEY=sk-proj-...

# Larger chunks for code
DEFAULT_CHUNK_SIZE=800
DEFAULT_CHUNK_OVERLAP=100

# GraphRAG for code relationships
ENABLE_GRAPH_INDEX=true
GRAPH_USE_CODE_METADATA=true
GRAPH_USE_LLM_EXTRACTION=false  # Code metadata is sufficient
```

Project config (`.agent-brain/config.json`):

```json
{
  "bind_host": "127.0.0.1",
  "port_range_start": 8000,
  "port_range_end": 8100,
  "auto_port": true,
  "chunk_size": 800,
  "chunk_overlap": 100,
  "exclude_patterns": [
    "**/node_modules/**",
    "**/__pycache__/**",
    "**/dist/**",
    "**/build/**"
  ]
}
```

Configuration for pure documentation search:
```bash
# .env
OPENAI_API_KEY=sk-proj-...

# Smaller chunks for precise documentation
DEFAULT_CHUNK_SIZE=400
DEFAULT_CHUNK_OVERLAP=50

# No GraphRAG needed
ENABLE_GRAPH_INDEX=false
```

Project config:

```json
{
  "bind_host": "127.0.0.1",
  "port_range_start": 8000,
  "port_range_end": 8100,
  "auto_port": true,
  "chunk_size": 400,
  "chunk_overlap": 50,
  "exclude_patterns": [
    "**/node_modules/**",
    "**/__pycache__/**",
    "**/.git/**"
  ]
}
```

Minimize API costs:
```bash
# .env
OPENAI_API_KEY=sk-proj-...

# Use smaller embedding model
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_DIMENSIONS=1536

# Disable LLM extraction
GRAPH_USE_LLM_EXTRACTION=false
```

Agent Brain searches for `.env` files in this order:

1. Current working directory: `./.env`
2. Server package directory: `agent-brain-server/.env`
3. Project root: `../.env`

Best Practice: Place `.env` in your project root and add it to `.gitignore`.
```bash
# View server status (includes some config)
agent-brain status

# View all environment variables
env | grep -E "(OPENAI|ANTHROPIC|EMBEDDING|GRAPH|CHUNK|API)"
```

```bash
# Start server and check health
agent-brain start --daemon
curl http://127.0.0.1:8000/health

# Index test documents
agent-brain index ./docs --include-code

# Test query
agent-brain query "test" --mode hybrid
```

- API Reference - REST API documentation
- Deployment Guide - Production deployment
- Architecture Overview - System design