Skip to content

feat: Move Native Embedder and Reranker Into Isolated Workers w/ Job Queues | Add Document Embedding Status Events#5192

Open
angelplusultra wants to merge 45 commits intomasterfrom
feat-native-embedder-job-queue
Open

feat: Move Native Embedder and Reranker Into Isolated Workers w/ Job Queues | Add Document Embedding Status Events#5192
angelplusultra wants to merge 45 commits intomasterfrom
feat-native-embedder-job-queue

Conversation

@angelplusultra
Copy link
Contributor

@angelplusultra angelplusultra commented Mar 11, 2026

Pull Request Type

  • ✨ feat (New feature)
  • 🐛 fix (Bug fix)
  • ♻️ refactor (Code refactoring without changing behavior)
  • 💄 style (UI style changes)
  • 🔨 chore (Build, CI, maintenance)
  • 📝 docs (Documentation updates)

Relevant Issues

resolves #

Description

Moves the native embedder and reranker from the main process into isolated child processes using child_process.fork() with a serial job queue. This prevents OOM crashes from large document batches from taking down the entire server, while adding real-time document-level progress reporting to the UI via SSE.

Architecture:

  • WorkerQueue class manages a forked child process with serial FIFO job processing, auto-fork on first job, and configurable idle timeout that kills the worker when inactive
  • EmbeddingProgressBus (singleton EventEmitter) acts as the central hub for progress events between Document.addDocuments and SSE endpoint listeners, with event buffering for late-joining clients after page reloads
  • Separate queue instances for embedding and reranking workers — each gets its own forked process
  • Query embedding (lightweight, single text) still runs in-process; only bulk document embedding is routed through the worker queue

Progress Reporting:

  • SSE endpoint at /workspace/:slug/embed-progress streams document-level events (batch_starting, doc_starting, doc_complete, doc_failed, all_complete)
  • EmbeddingProgressContext (React Context) manages progress state globally and connects SSE on component mount — server-side event replay catches up on any in-progress jobs without needing client-side persistence
  • Progress is scoped per-user — each user only sees their own embedding jobs, even when multiple users embed into the same workspace concurrently
  • Progress UI shows in the document management modal with per-file status (Queued → Embedding → Complete/Failed) and auto-clears 5 seconds after completion
  • When no embedding is in progress, the SSE connection stays open silently and waits — no premature signals are sent, avoiding race conditions between the SSE connection and the embed API call

Worker Timeouts:

  • Configurable via environment variables (NATIVE_EMBEDDING_WORKER_TIMEOUT, NATIVE_RERANKING_WORKER_TIMEOUT) with defaults of 300s and 900s respectively
  • Embedding timeout is configurable in the native embedder settings UI
  • Reranking timeout is configurable in the vector database settings when LanceDB is selected
  • Timeouts are re-read from env before each job so UI changes take effect without server restart

Other Changes:

  • NativeEmbedder.embedChunks() detects whether it's running in the main process or worker via process.send and routes through the worker queue automatically — no changes needed in vector DB providers or non-native embedding engines
  • NativeEmbeddingReranker.rerankViaWorker() added as static method to route through the queue from LanceDB, using the same process.send detection pattern
  • batch_starting event emitted at the start of a batch with the full file list, so SSE history replay can seed all files as "pending" for late-joining clients
  • doc_failed event emitted when fileData() returns null or when the database write fails (previously files were silently skipped, stuck as "pending" in UI)

Visuals (if applicable)

Embedding Status Events w/ Persistence Across Renders and Reloads

output.mp4

Worker Timeouts Are Configurable
image

The document management modal now shows real-time embedding progress when documents are being embedded into a workspace, with per-file status indicators (Queued, Embedding, Complete, Failed).

Additional Information

  • Worker idle timeouts can be set to 0 for immediate shutdown after work completes
  • The reranking worker has a longer default timeout (900s) since it runs on every chat query when accuracy-optimized search is enabled with LanceDB, and frequent cold starts would add overhead
  • Event replay on SSE reconnect ensures no stale "Queued" states after page refresh — the server buffers all document-level events and replays them to new subscribers
  • Multi-user: progress is scoped per-user via SSE userId filtering. No cross-user visibility or document locking.

Developer Validations

  • I ran yarn lint from the root of the repo & committed changes
  • Relevant documentation has been updated (if applicable)
  • I have tested my code functionality
  • Docker build succeeds locally

@angelplusultra angelplusultra marked this pull request as draft March 11, 2026 17:27
@angelplusultra angelplusultra changed the title feat: Move native embedder & reranker into isolated worker processes feat: Native Embedder and Reranker Job Queue & Document Embedding Status Events Mar 11, 2026
@angelplusultra angelplusultra changed the title feat: Native Embedder and Reranker Job Queue & Document Embedding Status Events feat:Move Native Embedder and Reranker Into Isolated Workers w/ Job Queues & Add Document Embedding Status Events Mar 11, 2026
@angelplusultra angelplusultra changed the title feat:Move Native Embedder and Reranker Into Isolated Workers w/ Job Queues & Add Document Embedding Status Events feat: Move Native Embedder and Reranker Into Isolated Workers w/ Job Queues & Add Document Embedding Status Events Mar 11, 2026
@angelplusultra angelplusultra changed the title feat: Move Native Embedder and Reranker Into Isolated Workers w/ Job Queues & Add Document Embedding Status Events feat: Move Native Embedder and Reranker Into Isolated Workers w/ Job Queues | Add Document Embedding Status Events Mar 11, 2026
@angelplusultra angelplusultra marked this pull request as ready for review March 11, 2026 21:50
@angelplusultra angelplusultra marked this pull request as draft March 11, 2026 22:32
@angelplusultra angelplusultra marked this pull request as draft March 12, 2026 19:02
The SSE connection opens before the embedding API call fires, so the
server sees no buffered history and immediately sends all_complete.
Firefox dispatches this eagerly enough that it closes the EventSource
before real progress events arrive, causing the progress UI to clear
and fall back to the loading spinner. Chrome's EventSource timing
masks the race.

Track slugs where startEmbedding was called but no real progress event
has arrived yet via awaitingProgressRef. Ignore the first all_complete
for those slugs and keep the connection open for the real events.
Removed unnecessary tracking of slugs for premature all_complete events in the EmbeddingProgressProvider. Updated the server-side logic to avoid sending all_complete when no embedding is in progress, allowing the connection to remain open for real events. Adjusted the embedding initiation flow to ensure the server processes the job before the SSE connection opens, improving the reliability of progress updates.
…component

Extracted the Reranking Worker Idle Timeout input from GeneralEmbeddingPreference and integrated it into the LanceDBOptions component. This change enhances modularity and maintains a cleaner structure for the settings interface.
@angelplusultra angelplusultra marked this pull request as ready for review March 13, 2026 19:17
Copy link
Member

@timothycarambat timothycarambat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Timeout ≠ TTL. This looks like we are talking about how long we should keep workers alive after doing some work just to keep the worker hot. This is a TTL, so lets rename that.

  • Lets remove the UI components to specify TTL for now. Most people will not ever want to touch these nor want to even change it. Node Worker time to start is small enough to shoulder here.

  • Lets also then remove the associated systemSettings key entires since we wont be sending them to the UI. We should keep the protectedKeys in dumpENV just in case people DO want to set them manually.

Clarifying questions:

  • How are embedding jobs user segmented? If user A is embedding 10 docs and user B is embedding 1 - when A is queued and B's job finishes and gets all_complete doesnt A get that event back at the same time or is this simply based on the job ref in their renderer processes

Comment on lines +348 to +352
module.exports = {
queueEmbedding,
queueReranking,
embeddingProgressBus,
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this whole file, lets find a better way to break this up - this file is pretty messy and lots of different things going on in the same file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@angelplusultra
Copy link
Contributor Author

angelplusultra commented Mar 13, 2026

  • Timeout ≠ TTL. This looks like we are talking about how long we should keep workers alive after doing some work just to keep the worker hot. This is a TTL, so lets rename that.

    • Lets remove the UI components to specify TTL for now. Most people will not ever want to touch these nor want to even change it. Node Worker time to start is small enough to shoulder here.

    • Lets also then remove the associated systemSettings key entires since we wont be sending them to the UI. We should keep the protectedKeys in dumpENV just in case people DO want to set them manually.

Clarifying questions:

How are embedding jobs user segmented? If user A is embedding 10 docs and user B is embedding 1 - when A is queued and B's job finishes and gets all_complete doesnt A get that event back at the same time or is this simply based on the job ref in their renderer processes

Each user's "Save" triggers its own addDocuments call, which runs its own loop and emits its own batch_startingdoc_startingdoc_completeall_complete lifecycle. User B's all_complete only signals the end of User B's batch — it doesn't touch User A's. On the SSE side, every event is tagged with userId, and the subscriber filters on it. So User A's SSE connection only receives events where event.userId matches their own. User B's all_complete is simply never delivered to User A. The two batches are independent event streams that happen to flow through the same bus. They don't interfere with each other.

You can see this in action in the EmbeddingProgressBus:

  /**
  * Register an SSE listener filtered by workspace and user.
  * Replays any buffered events for the workspace before subscribing to live events.
  * @param {{ workspaceSlug: string, userId?: number }} filter
  * @param {function} callback - receives the progress event payload
  * @returns {{ unsubscribe: function }}
  */
 subscribe(filter, callback) {
   // Replay buffered events so reconnecting clients catch up.
   if (filter.workspaceSlug && this.#history.has(filter.workspaceSlug)) {
     for (const event of this.#history.get(filter.workspaceSlug)) {
       if (filter.userId && event.userId && event.userId !== filter.userId)
         continue;
       callback(event);
     }
   }

   const handler = (event) => {
     if (filter.workspaceSlug && event.workspaceSlug !== filter.workspaceSlug)
       return;
     if (filter.userId && event.userId && event.userId !== filter.userId)
       return;
     callback(event);
   };
   this.on("progress", handler);
   return {
     unsubscribe: () => this.off("progress", handler),
   };
 }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants