[draft] refactor: migrate ReadableStream to Chan<T> and AsyncIterable across codebase#1201

Open
toubatbrian wants to merge 3 commits into devin/1775205203-chan-primitive from devin/1775206253-readable-stream-migration

@toubatbrian (Contributor) commented Apr 3, 2026

Description

Migrates all internal ReadableStream / TransformStream / WritableStream usage across the agents-js codebase to use Chan<T> (async channel queue) and AsyncIterable<T>, achieving Python parity. Builds on top of #1200 which introduced the Chan<T> and tee() primitives.

Changes Made

New: adapter utilities (stream/adapters.ts)

  • fromReadableStream<T>() — wraps a ReadableStream as AsyncIterable<T> (used at the boundary with AudioStream from rtc-node)
  • toReadableStream<T>() — wraps an AsyncIterable<T> back to ReadableStream (for any external APIs that still require it)
  • mergeAsyncIterables<T>() — merges N async iterables into one (replacement for MultiInputStream)
  • withIdleTimeout<T>() — wraps an AsyncIterable with per-.next() idle timeout, throwing IdleTimeoutError on stall (replaces the old waitUntilTimeout(reader.read(), ms) pattern)
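
For reviewers who have not opened stream/adapters.ts, the idle-timeout idea can be sketched roughly as follows. This is an illustrative sketch, not the code in this PR: the function body, the `IdleTimeoutError` constructor shape, and the error message are assumptions. It races each `.next()` call against a timer and, on exit, requests cleanup of the source without awaiting it.

```typescript
// Hypothetical sketch; the real withIdleTimeout in stream/adapters.ts may differ.
class IdleTimeoutError extends Error {
  constructor(ms: number) {
    super(`no item received within ${ms} ms`);
    this.name = "IdleTimeoutError";
  }
}

async function* withIdleTimeout<T>(
  source: AsyncIterable<T>,
  ms: number,
): AsyncGenerator<T> {
  const iter = source[Symbol.asyncIterator]();
  try {
    while (true) {
      let timer: ReturnType<typeof setTimeout> | undefined;
      // A fresh timer per read: the timeout bounds idle time, not total time.
      const timeout = new Promise<never>((_, reject) => {
        timer = setTimeout(() => reject(new IdleTimeoutError(ms)), ms);
      });
      const result = await Promise.race([iter.next(), timeout]).finally(() =>
        clearTimeout(timer),
      );
      if (result.done) return;
      yield result.value;
    }
  } finally {
    // Fire-and-forget: a stalled source may never settle this promise,
    // so it is deliberately not awaited.
    void iter.return?.().catch(() => {});
  }
}
```

A consumer would wrap the stream at the loop, for example `for await (const frame of withIdleTimeout(ttsStream, TTS_READ_IDLE_TIMEOUT_MS))`, and handle the timeout error in the surrounding `catch`.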

Type signature changes

  • STTNode, LLMNode, TTSNode in voice/io.ts now use AsyncIterable<T> instead of ReadableStream<T>
  • SpeechStream, SynthesizeStream, VADStream iteration now uses AsyncIterable
  • RealtimeSession iteration changed similarly

StreamChannel → Chan replacements

  • audio.ts (audioFramesFromFile)
  • inference/stt.ts, inference/tts.ts
  • voice/audio_recognition.ts
  • voice/recorder_io/recorder_io.ts
  • voice/transcription/synchronizer.ts

TransformStream → async generator composition

  • inference/interruption/http_transport.ts — new TransportFn type, returns an async generator instead of a TransformStream
  • inference/interruption/ws_transport.ts — background pump pattern with Chan<OverlappingSpeechEvent> for WebSocket message routing
  • inference/interruption/interruption_stream.ts — three-stage pipeThrough() pipeline replaced with composed async generators: eventEmit(transportFn(audioTransform(inputChan)))
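
The shape of the composed pipeline can be illustrated with a small sketch. The stage names mirror the ones above, but the bodies, the `Frame` and `OverlapEvent` types, and the 0.5 threshold are hypothetical stand-ins and not the logic in inference/interruption/*.

```typescript
// Hypothetical stand-in types; the real pipeline works on AudioFrame and
// OverlappingSpeechEvent.
type Frame = { samples: number[] };
type OverlapEvent = { overlapping: boolean };
type TransportFn = (input: AsyncIterable<number>) => AsyncIterable<OverlapEvent>;

// Stage 1: reduce each frame to a single value (here, peak amplitude).
async function* audioTransform(frames: AsyncIterable<Frame>): AsyncGenerator<number> {
  for await (const frame of frames) {
    yield Math.max(...frame.samples.map((s) => Math.abs(s)));
  }
}

// Stage 2: a transport is just a function from one async iterable to another.
const thresholdTransport: TransportFn = async function* (peaks) {
  for await (const peak of peaks) {
    yield { overlapping: peak > 0.5 };
  }
};

// Stage 3: notify a listener, then pass the event downstream unchanged.
async function* eventEmit(
  events: AsyncIterable<OverlapEvent>,
  onEvent: (e: OverlapEvent) => void,
): AsyncGenerator<OverlapEvent> {
  for await (const e of events) {
    onEvent(e);
    yield e;
  }
}
```

Composition is plain function nesting, `eventEmit(thresholdTransport(audioTransform(input)), listener)`, and each stage only runs when the consumer pulls the next item.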

AudioInput base class refactor

  • MultiInputStream replaced with Chan<AudioFrame> + background pump + AbortController
  • addInputStream() aborts any previous pump before starting a new one, then pumps from the source iterable into the channel
  • close() aborts pump and closes channel
  • _pumpAbort changed from private to protected so subclasses (e.g. ParticipantAudioInputStream) can abort the pump
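
A reduced sketch of the pump pattern described above, for orientation only. `MiniChan` is a hypothetical, simplified stand-in for the `Chan<T>` primitive from #1200 (its real API is not reproduced here), and `AudioInputSketch` omits everything except the pump and abort handling.

```typescript
class ChanClosed extends Error {}

// Hypothetical minimal channel: unbounded queue plus async iteration.
class MiniChan<T> implements AsyncIterable<T> {
  private queue: T[] = [];
  private waiters: Array<(r: IteratorResult<T>) => void> = [];
  private closed = false;

  sendNowait(item: T): void {
    if (this.closed) throw new ChanClosed("channel closed");
    const waiter = this.waiters.shift();
    if (waiter) waiter({ value: item, done: false });
    else this.queue.push(item);
  }

  close(): void {
    this.closed = true;
    for (const w of this.waiters.splice(0)) w({ value: undefined, done: true });
  }

  [Symbol.asyncIterator](): AsyncIterator<T> {
    return {
      next: async (): Promise<IteratorResult<T>> => {
        if (this.queue.length > 0) return { value: this.queue.shift() as T, done: false };
        if (this.closed) return { value: undefined, done: true };
        return new Promise<IteratorResult<T>>((resolve) => this.waiters.push(resolve));
      },
    };
  }
}

class AudioInputSketch<T> {
  readonly inputChan = new MiniChan<T>();
  protected pumpAbort?: AbortController;

  addInputStream(source: AsyncIterable<T>): void {
    this.pumpAbort?.abort(); // stop any previous pump first
    const abort = new AbortController();
    this.pumpAbort = abort;
    void (async () => {
      try {
        for await (const item of source) {
          if (abort.signal.aborted) break;
          this.inputChan.sendNowait(item);
        }
      } catch {
        // Source errors and ChanClosed both end the pump silently in this sketch.
      }
    })();
  }

  close(): void {
    this.pumpAbort?.abort();
    this.inputChan.close();
  }
}
```

Note that an aborted pump only notices the abort when its source yields again, so a source that never yields keeps its pump suspended until the source itself ends.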

Plugin migration

  • Silero VAD: VADStream subclass migrated from this.inputReader.read() (old ReadableStream reader) to for await...of this.inputChan (new Chan pattern)

Test updates

  • utils.test.ts, generation_tools.test.ts, generation_tts_timeout.test.ts, audio_recognition_span.test.ts — all ReadableStream test helpers replaced with async generators

Updates since last revision

Addressed all review comments from Codex and Devin Review:

  1. interruptionStreamChannel → interruptionChan (Codex P1): Verified as fully migrated; zero references to interruptionStreamChannel remain. The rename covers the field declaration, createInterruptionTask(), and disableInterruptionDetection().

  2. Idle timeout restored in TTS loops (Codex P1 ×2, Devin Review): Both performTTSInference and forwardAudio now use withIdleTimeout(ttsStream, TTS_READ_IDLE_TIMEOUT_MS), restoring the per-read idle timeout that was accidentally dropped during migration. IdleTimeoutError catch blocks are no longer dead code.

  3. AudioInput.addInputStream pump abort (Devin Review): Now calls this._pumpAbort?.abort() before creating a new AbortController, matching the pattern in STT/TTS/VAD/RealtimeSession updateInputStream methods.

  4. TTS SynthesizeStream.updateInputStream re-entrancy (Devin Review): The finally block now conditionally closes inputChan only when !abort.signal.aborted — i.e., only when the source iterable exhausted naturally. If the pump was aborted because updateInputStream was called again, the channel stays open for the new pump.

⚠️ Items for careful reviewer attention

  1. Backpressure: Many call sites changed from async channel.write() to synchronous chan.sendNowait(). This removes backpressure — if a producer is faster than a consumer, items queue unbounded in the channel. Verify this is acceptable for each call site (audio frames, TTS events, etc.).

  2. ws_transport.ts pump pattern (lines ~366-429): The new background pump + transportError variable + channel close pattern is the most complex change. Error propagation through this indirection needs scrutiny.

  3. synchronizer.ts sendNowait() calls (lines ~337, 374, 385): These are called without try/catch for ChanClosed. If the channel closes while mainTask is still running, these could throw unhandled.

  4. withIdleTimeout cleanup on stall: When the idle timeout fires, iter.return() is fire-and-forget because the source iterator may be stuck on a never-resolving promise. This means the stalled source generator won't be explicitly cleaned up — it relies on GC. Verify this is acceptable for TTS/audio streams.

  5. recorder_io.ts createInterceptingStream: Error handling simplified to bare catch {} — all source errors are silently swallowed.

  6. Tee usage: All tee() results now use .get(index) instead of bracket indexing (e.g. teed.get(0) not teed[0]). The .get() method throws RangeError on out-of-bounds — verify call sites always pass valid indices.

  7. TTS updateInputStream conditional close: The inputChan.close() in finally now checks !abort.signal.aborted. Verify the timing is safe — specifically that abort.signal.aborted cannot become true between the for await loop exiting normally and the finally block executing.
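
To make the backpressure concern in item 1 concrete, here is a toy comparison. `BoundedQueue` is a hypothetical class written for this illustration; it is not `Chan<T>` or the old `StreamChannel`, and the capacity-based `send()` is an assumption about how a backpressured write could behave.

```typescript
// Hypothetical toy queue contrasting a synchronous send with an awaitable one.
class BoundedQueue<T> {
  private items: T[] = [];
  private spaceWaiters: Array<() => void> = [];
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get length(): number {
    return this.items.length;
  }

  // Never waits: the queue can grow past `capacity` without limit.
  sendNowait(item: T): void {
    this.items.push(item);
  }

  // Waits for free space, so a fast producer is paced by the consumer.
  async send(item: T): Promise<void> {
    while (this.items.length >= this.capacity) {
      await new Promise<void>((resolve) => this.spaceWaiters.push(resolve));
    }
    this.items.push(item);
  }

  receive(): T | undefined {
    const item = this.items.shift();
    this.spaceWaiters.shift()?.(); // wake one blocked producer, if any
    return item;
  }
}
```

With `sendNowait()`, a producer that outruns the consumer simply grows the queue; with the awaitable `send()`, the producer suspends once `capacity` items are pending. Which behavior is acceptable depends on the call site, which is what item 1 asks reviewers to check.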

Pre-Review Checklist

  • Build passes: pnpm build passes across all packages (agents + all plugins)
  • AI-generated code reviewed: Removed unnecessary comments and ensured code quality
  • Changes explained: All changes are properly documented and justified above
  • Scope appropriate: All changes relate to the ReadableStream → Chan migration
  • Video demo: Not yet tested with Agent Playground

Testing

  • Lint passes (pnpm lint) — lint errors are pre-existing on the base branch
  • Format passes (pnpm format:check)
  • TypeScript build passes (pnpm build) — all packages compile cleanly
  • Automated tests pass — 843 passed, 3 failures are pre-existing on the base branch (OpenAI STT timeout, Gemini TTS import, TaskGroup)
  • restaurant_agent.ts and realtime_agent.ts: not yet tested

Additional Notes

  • This is a breaking change for any code depending on ReadableStream return types from node functions
  • Old stream utilities (IdentityTransform, StreamChannel, DeferredReadableStream, MultiInputStream) are still exported but no longer used internally
  • Remaining plugin-level migrations (beyond Silero) are deferred to a follow-up PR

Link to Devin session: https://livekit.devinenterprise.com/sessions/6f09b4044c3e4950ad2673781e2f0ba9
Requested by: @toubatbrian

…codebase

- Add adapter utilities (fromReadableStream, toReadableStream, mergeAsyncIterables)
- Migrate type signatures in voice/io.ts (STTNode, LLMNode, TTSNode)
- Migrate voice/generation.ts (LLM/TTS inference, text/audio forwarding)
- Migrate stt/stt.ts (SpeechStream async iteration)
- Migrate tts/tts.ts (SynthesizeStream async iteration)
- Migrate vad.ts (VADStream async iteration)
- Migrate llm/realtime.ts (RealtimeSession)
- Migrate audio_recognition.ts (DeferredReadableStream/StreamChannel to Chan)
- Migrate audio.ts (audioFramesFromFile)
- Migrate voice/agent_activity.ts (stream utilities)
- Migrate voice/agent.ts (node methods)
- Migrate voice/agent_session.ts (say method signature)
- Migrate voice/room_io/_input.ts (createStream method)
- Migrate voice/recorder_io/recorder_io.ts (StreamChannel to Chan)
- Migrate voice/transcription/synchronizer.ts (IdentityTransform to Chan)
- Migrate inference/interruption/* (TransformStream to async generators)
- Migrate inference/stt.ts (StreamChannel to Chan)
- Migrate inference/tts.ts (StreamChannel to Chan)
- Migrate utils.ts (resampleStream)
- Update test files to use async generators

Co-Authored-By: brian.yin <brian.yin@livekit.io>

changeset-bot bot commented Apr 3, 2026

⚠️ No Changeset found

Latest commit: 6eb8b29

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8057a55e89

  private isInterruptionEnabled: boolean;
  private isAgentSpeaking: boolean;
- private interruptionStreamChannel?: StreamChannel<InterruptionSentinel | AudioFrame>;
+ private interruptionChan?: Chan<InterruptionSentinel | AudioFrame>;

P1 Badge Keep interruption task wired to the active channel

This refactor introduces interruptionChan, but createInterruptionTask() and disableInterruptionDetection() still read/write interruptionStreamChannel, so the interruption path is no longer wired to the field initialized in the constructor. As a result, interruption processing can be skipped entirely (the task bails before consuming overlap events), which breaks barge-in detection during agent speech.

Contributor

This was already fixed in commit be2627f — all references to interruptionStreamChannel were renamed to interruptionChan across the file (field declaration, createInterruptionTask(), and disableInterruptionDetection()). Verified with grep that zero references to interruptionStreamChannel remain in the codebase.

Comment on lines 630 to 633
for await (const frame of ttsStream) {
  if (signal.aborted) {
    break;
  }

P1 Badge Restore idle timeout when reading TTS output

The loop now does a plain for await over ttsStream and no longer races reads against TTS_READ_IDLE_TIMEOUT_MS. If a provider yields some audio and then stalls without closing, this task can hang indefinitely and never reach normal completion semantics; the IdleTimeoutError catch is effectively dead in this path.

Contributor

Fixed in commit be2627f — both performTTSInference and forwardAudio now wrap the TTS stream with withIdleTimeout(ttsStream, TTS_READ_IDLE_TIMEOUT_MS), restoring the per-read idle timeout. The IdleTimeoutError catch blocks are no longer dead code.

Comment on lines 792 to 795
for await (const frame of ttsStream) {
  if (signal?.aborted) {
    break;
  }

P1 Badge Reintroduce idle timeout in audio forwarding loop

Audio forwarding also switched from timeout-wrapped reads to a bare for await loop, so a stalled TTS iterable can block forever after playback has started. In that case performAudioForwarding() may never resolve and downstream turn cleanup waits indefinitely, despite the existing timeout-oriented error handling.

Contributor

Fixed in commit be2627f: forwardAudio now iterates over withIdleTimeout(ttsStream, TTS_READ_IDLE_TIMEOUT_MS) in its for-await loop, matching the timeout behavior of the old waitUntilTimeout(reader.read(), ms) pattern.

devin-ai-integration[bot]

This comment was marked as resolved.

- Fix Tee<T> bracket indexing to use .get() method across all files
- Change _pumpAbort from private to protected in AudioInput for subclass access
- Fix interruptionStreamChannel -> interruptionChan property name mismatch
- Migrate silero plugin VADStream from inputReader to for-await on inputChan
- Add withIdleTimeout() adapter for AsyncIterable idle timeout support
- Restore TTS_READ_IDLE_TIMEOUT_MS in performTTSInference and forwardAudio
- Export withIdleTimeout from stream/index.ts

Co-Authored-By: brian.yin <brian.yin@livekit.io>
@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 2 new potential issues.

View 9 additional findings in Devin Review.

Comment on lines +97 to 117
addInputStream(source: AsyncIterable<AudioFrame>): string {
  const id = `input-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;
  const abort = new AbortController();
  this._pumpAbort = abort;
  (async () => {
    try {
      for await (const frame of source) {
        if (abort.signal.aborted) break;
        try {
          this.inputChan.sendNowait(frame);
        } catch (e) {
          if (e instanceof ChanClosed) break;
          throw e;
        }
      }
    } catch {
      // Source errors are silently consumed
    }
  })();
  return id;
}

🟡 AudioInput.addInputStream overwrites _pumpAbort without aborting the previous pump

When addInputStream is called multiple times, this._pumpAbort is overwritten by the new AbortController without first aborting the previous one. This leaves the old background pump running indefinitely, resulting in two concurrent pumps writing interleaved data into the same inputChan. Compare with all updateInputStream methods in STT (agents/src/stt/stt.ts:370), TTS (agents/src/tts/tts.ts:379), VAD (agents/src/vad.ts:172), and RealtimeSession (agents/src/llm/realtime.ts:168), which all correctly call this._pumpAbort?.abort() before creating a new pump. Current callers (e.g. ParticipantAudioInputStream) happen to abort externally via closeStream() before calling addInputStream, so this doesn't trigger today, but the method's contract is broken.

Suggested change
  addInputStream(source: AsyncIterable<AudioFrame>): string {
    const id = `input-${Date.now()}-${Math.random().toString(36).slice(2, 8)}`;
+   this._pumpAbort?.abort();
    const abort = new AbortController();
    this._pumpAbort = abort;
    (async () => {
      try {
        for await (const frame of source) {
          if (abort.signal.aborted) break;
          try {
            this.inputChan.sendNowait(frame);
          } catch (e) {
            if (e instanceof ChanClosed) break;
            throw e;
          }
        }
      } catch {
        // Source errors are silently consumed
      }
    })();
    return id;
  }

Contributor

Fixed in commit 6eb8b29: addInputStream now calls this._pumpAbort?.abort() before creating the new AbortController, matching the pattern used in STT, TTS, VAD, and RealtimeSession's updateInputStream methods.

Comment on lines +378 to +398
updateInputStream(text: AsyncIterable<string>) {
  this._pumpAbort?.abort();
  const abort = new AbortController();
  this._pumpAbort = abort;
  (async () => {
    try {
      for await (const value of text) {
        if (abort.signal.aborted) break;
        try {
          this.inputChan.sendNowait(value);
        } catch (e) {
          if (e instanceof ChanClosed) break;
          throw e;
        }
      }
    } catch {
      // Source errors are silently consumed
    } finally {
      this.inputChan.close();
    }
  })();

🟡 TTS SynthesizeStream.updateInputStream closes shared inputChan, breaking re-entrant calls

In updateInputStream, the pump's finally block calls this.inputChan.close(). When the method is called a second time, the sequence is: (1) this._pumpAbort?.abort() aborts pump1, (2) pump2 starts writing to inputChan, (3) pump1's for-await eventually exits and its finally runs this.inputChan.close(), (4) pump2's subsequent sendNowait calls throw ChanClosed. This is inconsistent with the STT (agents/src/stt/stt.ts:369-387) and VAD (agents/src/vad.ts:171-189) implementations of updateInputStream, which do NOT close the channel in their pump's finally block. Currently, updateInputStream is only called once per stream instance (in agents/src/voice/agent.ts:461), so this doesn't trigger in practice.

Prompt for agents
The TTS SynthesizeStream.updateInputStream method closes this.inputChan in the pump's finally block (line 396). This is needed to signal pumpInput() that input is done, but it makes the method non-reentrant: a second call will have its pump broken by the first pump's finally closing the shared channel.

The fix requires restructuring how pumpInput knows input is complete. Options:
1. Remove the close() from finally and instead have pumpInput check for a separate 'done' signal.
2. Create a new inputChan in updateInputStream (like attachAudioInput does in agent_activity.ts:593-594) so each call gets a fresh channel. But this requires pumpInput to follow the new channel reference.
3. Accept single-call semantics and document it, matching the current usage pattern.

Relevant files: agents/src/tts/tts.ts (updateInputStream at line 378, pumpInput at line 277), agents/src/stt/stt.ts (updateInputStream at line 369 for comparison).

Contributor

Fixed in commit 6eb8b29 — the finally block now only closes inputChan when the pump was NOT aborted (i.e., the source iterable exhausted naturally). If abort.signal.aborted is true (meaning updateInputStream was called again), the finally block skips the close so the new pump can continue using the shared channel. This makes the method safe for re-entrant calls while still signaling pumpInput when input is truly done.
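
The conditional close described in this reply can be sketched as below. This is an illustration under stated assumptions, not the code in agents/src/tts/tts.ts: `StubChan` is a hypothetical stand-in that only records items and a closed flag, and `updateInputStream` returns the pump promise purely so the behavior can be observed, whereas the method shown in the review excerpt starts the pump without returning it.

```typescript
// Hypothetical stand-in for Chan<string>: records sends and the closed state.
class StubChan<T> {
  items: T[] = [];
  closed = false;

  sendNowait(item: T): void {
    if (this.closed) throw new Error("ChanClosed");
    this.items.push(item);
  }

  close(): void {
    this.closed = true;
  }
}

class SynthesizeStreamSketch {
  readonly inputChan = new StubChan<string>();
  private pumpAbort?: AbortController;

  updateInputStream(text: AsyncIterable<string>): Promise<void> {
    this.pumpAbort?.abort();
    const abort = new AbortController();
    this.pumpAbort = abort;
    return (async () => {
      try {
        for await (const value of text) {
          if (abort.signal.aborted) break;
          this.inputChan.sendNowait(value);
        }
      } catch {
        // Source errors are swallowed, as in the reviewed code.
      } finally {
        // Close only if this pump was not superseded by a later call.
        if (!abort.signal.aborted) this.inputChan.close();
      }
    })();
  }
}
```

On the timing question raised in item 7 of the reviewer notes: when the loop exits normally, control reaches `finally` without passing another `await`, so no other code can run in between and flip `abort.signal.aborted`. An abort that lands while the pump is still waiting on the source's final `next()` is seen as aborted and skips the close.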

… fix TTS updateInputStream re-entrancy

Co-Authored-By: brian.yin <brian.yin@livekit.io>
@toubatbrian toubatbrian changed the title refactor: migrate ReadableStream to Chan<T> and AsyncIterable across codebase [draft] refactor: migrate ReadableStream to Chan<T> and AsyncIterable across codebase Apr 3, 2026