33 changes: 33 additions & 0 deletions docs/dive-deep/asynchronous-indexing-workflow.md
@@ -35,6 +35,39 @@ The flow diagram above shows the complete indexing workflow, illustrating how th
- **`indexfailed`** - ❌ Error occurred, can retry
- **`not_found`** - ❌ Not indexed yet

## How Progress Is Calculated

`get_indexing_status` reports a **coarse, phase-based percentage**, not an exact fraction of files completed.

- **0%** - Preparing the target collection and validating indexing prerequisites
- **~5%** - Scanning the codebase and building the file list
- **10% → 100%** - Processing files, chunking code, generating embeddings, and writing batches to the vector database
- **100%** - Indexing finished successfully

It is therefore normal for progress to jump to around `10%` quickly, even on a large codebase. The jump marks the transition from the setup phases into file processing; it does not mean that one tenth of the files have already been indexed.

Progress is also persisted periodically to the local MCP snapshot file at `~/.context/mcp-codebase-snapshot.json`, so very fast phases may appear as jumps rather than smooth increments.
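
The phase mapping described in this section can be sketched as follows. This is a minimal illustration, not the MCP server's implementation: the function name `phase_progress`, the phase labels, and the linear scaling of the 10% to 100% range by files processed are all assumptions made for the example.

```python
def phase_progress(phase: str, files_done: int = 0, files_total: int = 0) -> float:
    """Return a coarse, phase-based percentage (hypothetical mapping)."""
    if phase == "preparing":
        # Collection preparation and prerequisite checks.
        return 0.0
    if phase == "scanning":
        # Building the file list.
        return 5.0
    if phase == "processing":
        # Assumption: the 10-100 range scales linearly with files processed.
        if files_total <= 0:
            return 10.0
        fraction = min(max(files_done / files_total, 0.0), 1.0)
        return 10.0 + 90.0 * fraction
    if phase == "completed":
        return 100.0
    raise ValueError(f"unknown phase: {phase!r}")
```

Under this sketch, `phase_progress("processing", 0, 100000)` already returns `10.0`, which is why a large repository can show 10% before any file has been embedded.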

## When File and Chunk Counts Appear

`get_indexing_status` shows file and chunk totals after a run has completed and the final statistics have been written to the local snapshot.

During active indexing, the MCP server tracks progress percentage, but it does **not** stream live file/chunk totals through `get_indexing_status`.

If you see an indexed entry with `0 files, 0 chunks`, the local snapshot metadata is usually stale, or it was written by an older or incomplete bookkeeping code path. The numbers are read from the snapshot; they are not a live count fetched from the vector database at status-check time.

To refresh those stored totals, clear and re-index the **same absolute path**.
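
To see what is currently stored, you can inspect the snapshot file directly. The sketch below assumes only that the file is valid JSON; it prints the raw structure and deliberately does not rely on any particular field names, since the snapshot schema is not described here. The helper name `load_snapshot` is hypothetical.

```python
import json
from pathlib import Path

# Location of the local MCP snapshot file mentioned in this guide.
SNAPSHOT_PATH = Path.home() / ".context" / "mcp-codebase-snapshot.json"


def load_snapshot(path: Path = SNAPSHOT_PATH):
    """Return the parsed snapshot, or None if the file does not exist."""
    if not path.exists():
        return None
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    snapshot = load_snapshot()
    if snapshot is None:
        print(f"No snapshot found at {SNAPSHOT_PATH}")
    else:
        # Print the raw structure rather than assuming specific fields.
        print(json.dumps(snapshot, indent=2))
```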

## How Codebases Are Identified

Claude Context tracks codebases by their resolved **absolute path**.

- The MCP tools resolve relative paths to absolute paths before indexing, searching, clearing, or checking status.
- Collection identity is derived from the normalized absolute path.
- If you index the same repository through different absolute paths (for example, a symlink, a different clone, or a mounted path), Claude Context treats them as separate codebases.

For the most predictable behavior, always use the same absolute path for `index_codebase`, `search_code`, `clear_index`, and `get_indexing_status`.
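
The effect of path-based identity can be illustrated with a short sketch. It assumes normalization that makes the path absolute without resolving symlinks, which matches the behavior described in this section; the `collection_id` scheme (a truncated SHA-256 digest with a `codebase_` prefix) is hypothetical and is not the identifier Claude Context actually generates.

```python
import hashlib
import os


def codebase_key(path: str) -> str:
    """Normalize to an absolute path without resolving symlinks."""
    return os.path.abspath(path)


def collection_id(path: str) -> str:
    """Derive a short, stable identifier from the normalized path.

    Hypothetical scheme, for illustration only.
    """
    digest = hashlib.sha256(codebase_key(path).encode("utf-8")).hexdigest()
    return f"codebase_{digest[:8]}"
```

Under this sketch, `/repo/./src/..` and `/repo` normalize to the same key and therefore share an identifier, while `/repo` and `/mnt/repo` produce different identifiers even if both point at the same files on disk.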


## Key Benefits

30 changes: 30 additions & 0 deletions docs/troubleshooting/faq.md
@@ -43,8 +43,38 @@ You can seamlessly use queries like `index this codebase` or `search the main fu
- **Background Code Synchronization**: Continuously monitors for changes and automatically re-indexes modified parts
- **Context-Aware Operations**: All indexing and search operations are scoped to the current project context

**Important path detail:** Claude Context keys each indexed codebase by its absolute path. If you index the same repository through different paths (for example, a symlinked path, a second clone, or a mounted path), those are treated as separate indexed codebases.

This makes it effortless to work across multiple projects while maintaining isolated, up-to-date indexes for each codebase.

## Q: Why does `get_indexing_status` jump quickly to 10% or feel coarse?

**A:** The percentage is a **phase-based progress indicator**, not a live fraction of indexed files.

In practice, Claude Context moves through broad stages:

- collection preparation
- file scanning
- file processing, chunking, embedding, and insertion

The status output can therefore jump quickly to around `10%` once setup is complete, even for very large repositories. That is expected behavior.

For the full background workflow, see [Asynchronous Indexing Workflow](../dive-deep/asynchronous-indexing-workflow.md).

## Q: Why does `get_indexing_status` show `0 files, 0 chunks` for a completed codebase?

**A:** `get_indexing_status` reads the MCP snapshot metadata, not a live aggregate directly from the vector database.

If a completed entry shows `0 files, 0 chunks`, the most common explanation is that the local snapshot metadata is stale or was created before final statistics were refreshed.

What to do:

1. Make sure you are checking the **same absolute path** that you originally indexed.
2. If the entry still shows zero counts, run `clear_index` for that path.
3. Re-run `index_codebase` for that exact absolute path.

This refreshes the stored file/chunk totals used by `get_indexing_status`.
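
For clients that call the MCP tools programmatically, the three steps above could be expressed as JSON-RPC `tools/call` requests. This is a sketch under assumptions: the `path` argument is documented for `get_indexing_status`, and the example assumes `clear_index` and `index_codebase` accept the same argument. The helper names `tool_call` and `refresh_requests` are hypothetical.

```python
import os


def tool_call(request_id: int, tool: str, codebase_path: str) -> dict:
    """Build one MCP `tools/call` request for the given tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": tool,
            # Always send the same absolute path to every tool.
            "arguments": {"path": os.path.abspath(codebase_path)},
        },
    }


def refresh_requests(codebase_path: str) -> list[dict]:
    """Return the check, clear, and re-index requests in order."""
    steps = ["get_indexing_status", "clear_index", "index_codebase"]
    return [tool_call(i, tool, codebase_path) for i, tool in enumerate(steps, start=1)]
```

The sketch builds all three requests unconditionally; in practice you would send the `clear_index` and `index_codebase` requests only if the status check still reports zero counts.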

## Q: How does Claude Context compare to other coding tools like Serena, Context7, or DeepWiki?

**A:** Claude Context is specifically focused on **codebase indexing and semantic search**. Here's how we compare:
10 changes: 10 additions & 0 deletions packages/mcp/README.md
@@ -668,6 +668,16 @@ Get the current indexing status of a codebase. Shows progress percentage for act

- `path` (required): Absolute path to the codebase directory to check status for

**What the status output means:**

- Progress is **phase-based**, not a direct file-count ratio. The MCP server reports coarse milestones for collection preparation, file scanning, and file processing / embedding work.
- Because indexing runs in the background and progress is persisted periodically, percentages can jump quickly on large repositories or appear unchanged for a while during long embedding batches.
- File and chunk statistics are written when an indexing run finishes successfully. During active indexing, `get_indexing_status` intentionally reports progress rather than live file/chunk totals.
- Codebases are keyed by their **absolute path**. Indexing `/repo`, a symlinked path to the same repo, and a second clone will create separate tracked entries.
- If a completed entry shows `0 files, 0 chunks`, the local snapshot metadata is usually stale; the counts are not queried live from the vector database. Re-indexing, or clearing and re-indexing that exact absolute path, refreshes the stored stats.

For a deeper explanation, see the [asynchronous indexing workflow guide](../../docs/dive-deep/asynchronous-indexing-workflow.md) and the [troubleshooting FAQ](../../docs/troubleshooting/faq.md).

## Contributing

This package is part of the Claude Context monorepo. Please see: