Merged
20 changes: 13 additions & 7 deletions AGENTS.md
@@ -102,18 +102,24 @@ bun test # Run tests
## Design Principles

- **No direct LiveKit imports in examples/consumer code.** If an example needs to import from `@livekit/components-react` or `livekit-client`, that's a signal the SDK isn't exposing enough. Treat every direct LiveKit import as a missing SDK API surface.
+- **Default control chrome:** bundled `styles.css` treats mic/camera/screen-share “off” as neutral dimmed (not error red), uses a blue accent while screen sharing, and reserves red for end-call; tune off-state with `--avatar-control-bg-off` and `--avatar-control-color-off`. When sharing, default layout shows screen share as the main region with avatar picture-in-picture; `useLocalMedia.toggleScreenShare` uses `CaptureController` + `setFocusBehavior('no-focus-change')` when the browser supports it so picking another tab to share is less likely to steal focus (otherwise degrades gracefully).
+
+## Learned User Preferences
+
+- Do not merge or close pull requests on the user's behalf unless they explicitly ask; they merge and close PRs themselves.
+
## Learned Workspace Facts

-- Release flow follows `CONTRIBUTING.md` — bump version, update changelog, commit, push, then `gh release create` triggers the NPM publish workflow
-- `consumeSession` API converts `sessionId + sessionKey` → WebRTC credentials (`serverUrl`, `token`, `roomName`); this step is handled client-side by the SDK
-- Primary quickstart reference is `examples/nextjs/` (API routes, more universally understood than server actions); documentation lives on an external docs website — `docs/` and `skills/` folders were intentionally removed from the repo
-- Dev scripts auto-detect portless (`command -v portless`) and use it when available; there are no separate `dev:portless` scripts; VS Code launch configs (`.vscode/launch.json`) are the primary way to start example dev servers, with `preLaunchTask` linking the package first
+- Release flow follows `CONTRIBUTING.md` — bump version, update changelog, commit, push, then `gh release create` triggers the NPM publish workflow; changelog only tracks changes that affect the npm package (not examples, tests, or docs); the publish workflow auto-detects prerelease versions (e.g. `0.10.0-beta.0`) and publishes to npm with the prerelease identifier as the dist-tag (`--tag beta`)
+- `consumeSession` converts `sessionId + sessionKey` → WebRTC credentials (`serverUrl`, `token`, `roomName`) on the client in the SDK; `@runwayml/avatars-react/api` is the server-safe entry point (no React, no `'use client'`) for Next.js API routes or server components so imports do not pull in the client bundle
+- `AvatarSession` connects `LiveKitRoom` with local `audio` and `video` off, then enables the mic and camera after the room reaches `Connected` so exclusive camera/mic use by another app (e.g. Zoom) does not block the session from becoming active; media acquisition failures surface via `MediaDeviceError` context / `useLocalMedia` with retry instead of leaving the room stuck connecting
+- Primary quickstart reference is `examples/nextjs/` (API routes, more universally understood than server actions); documentation lives on an external docs website — `docs/` and `skills/` folders were intentionally removed from the repo; screen share is demonstrated there with explicit children (`<ScreenShareVideo />` + `<ControlBar showScreenShare />`) because `showScreenShare` defaults `false` for backwards compatibility
+- Dev scripts auto-detect portless (`command -v portless`) and use it when available; there are no separate `dev:portless` scripts; VS Code launch configs (`.vscode/launch.json`) are the primary way to start example dev servers — each example's `preLaunchTask` runs `bun run build && cd examples/<name> && bun link @runwayml/avatars-react` once at launch (not watch mode); run `bun run dev` at the repo root ("Build Package (Watch)" launch config) in a separate terminal for continuous SDK rebuilds while iterating
- Graphite `gt submit` only works after the GitHub repo is added under **Synced repos** in Graphite ([settings](https://app.graphite.dev/settings/synced-repos)); otherwise it errors with “You can only submit to repos synced with Graphite” (org admins may need to enable the Graphite GitHub app for `runwayml/avatars-sdk-react`). The CLI resolves the repo from `git remote origin` — it must match that GitHub slug exactly (do not confuse the NPM package name `@runwayml/avatars-react` with the repo path `runwayml/avatars-sdk-react`, or typo `avatar-sdk-react` vs `avatars-sdk-react`). Until then, use `git push -u origin <branch>` + `gh pr create`. Local commands (`gt ls`, `gt sync`, `gt checkout`, `gt modify`, `gt create`) still work.
- Client events are fire-and-forget messages from the avatar model delivered via LiveKit data channel (`RoomEvent.DataReceived`); exposed through `onClientEvent` prop, `useClientEvents<T>` (catch-all), and `useClientEvent<E, T>` (filtered by tool name; latest args as state + optional callback); server also sends ack messages with `args: { status: "event_sent" }` that `parseClientEvent` filters out; examples with rich UI should include a `/dev` page for testing states (question cards, score, confetti, error) without a live avatar session
- Client tool helpers use the `client` prefix (`clientTool`, `ClientEventsFrom`, etc.) to distinguish from planned "server tools" that can call back and send messages to the model; `clientTool()` only emits `{ type, name, description }` — the `args` field is phantom (TypeScript-only, never sent to the API); when passing tools to `realtimeSessions.create({ tools })`, client event tools need explicit `parameters` arrays for the model to populate args correctly; array-type parameters require an `items` field (e.g. `{ type: 'array', items: { type: 'string' } }`) or the API returns 400; follow-up: accept Standard Schema (Zod, Valibot, ArkType) for `args` to get runtime validation and inferred types without `as` casts
-- `@runwayml/avatars-react/api` is the server-safe entry point (no React, no `'use client'`); use it for imports needed in Next.js API routes or server components to avoid pulling in the client bundle
- Session creation avatar field uses `{ type: 'runway-preset', presetId }` for built-in presets and `{ type: 'custom', avatarId }` for custom avatars — passing a UUID as `presetId` will 400; Runway `avatars.retrieve` is only for custom avatar UUIDs, not preset slugs (calling it with a preset id returns 400 — examples should use static preset metadata or hardcoded client data)
-- The client-events trivia example (`examples/nextjs-client-events/`) keeps session `personality` and `startScript` as repo constants in `lib/trivia-personality.ts` (passed from the connect route); keep personality within the API character limit (~2000); realtime create fields may still be cast with `as any` until `@runwayml/sdk` types include them; the RPC trivia example (`examples/nextjs-rpc/`) adds `@runwayml/avatars-node-rpc` (GitHub dep) for backend tool calls — `next.config.ts` must include `serverExternalPackages: ['@runwayml/avatars-node-rpc', '@livekit/rtc-node']`; backend RPC events stream to the client via SSE (`EventEmitter` + `/api/avatar/events` route); both examples use preset avatars (`runway-preset`) with `personality`/`startScript`/`tools` overrides for the production API
+- The trivia examples (`examples/nextjs-client-events/`, `examples/nextjs-rpc/`) use a single `next_step` client event per turn with personality/startScript as repo constants in `lib/trivia-personality.ts`; keep personality within the API character limit (~2000) and each tool's `timeoutSeconds` ≤8; exceeding the char limit returns a length-specific 400; a *different* 400 ("This text cannot be used for an avatar") is content moderation — avoid pop culture character names and suggestive phrasing in `personality`/`startScript`; realtime create fields may still be cast with `as any` until `@runwayml/sdk` types include them; the RPC trivia example adds `@runwayml/avatars-node-rpc` (GitHub dep) — `next.config.ts` must include `serverExternalPackages: ['@runwayml/avatars-node-rpc', '@livekit/rtc-node']`; `examples/nextjs-rpc-weather/`
+is a standalone RPC-only example (no client events); all tool-calling examples use preset avatars (`runway-preset`) with `personality`/`startScript`/`tools` overrides targeting the production API
- Cross-session audio routing (two avatars hearing each other) is not supported by the SDK; achieving avatar-to-avatar conversation requires Web Audio API bridging in the browser or server-side LiveKit audio forwarding — the SDK intentionally does not expose the underlying LiveKit room object to consumer code
- Public-facing examples target production API only (`new Runway()` with no `baseURL` override); don't build multi-environment infrastructure — keep `.env.example` minimal (just `RUNWAYML_API_SECRET`); hardcode preset IDs and other constants directly in code rather than env var indirection; for internal staging/dev testing, pass `baseUrl` to `AvatarCall`/`AvatarSession` and set `NEXT_PUBLIC_RUNWAYML_BASE_URL` in `.env.local`
-- Publish workflow auto-detects prerelease versions (e.g., `0.10.0-beta.0`) and uses the prerelease identifier as the npm dist-tag (`--tag beta`)
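The avatar-field rule recorded in the workspace facts above (built-in presets use `{ type: 'runway-preset', presetId }`, custom avatars use `{ type: 'custom', avatarId }`, and a UUID passed as `presetId` returns 400) could be sketched as a small helper. This is an illustration only: `AvatarField`, `UUID_PATTERN`, and `toAvatarField` are hypothetical names that do not come from the SDK, and the sketch assumes custom avatar ids are always UUID-shaped while preset ids never are.

```typescript
// Hypothetical type mirroring the two avatar shapes named in AGENTS.md.
type AvatarField =
  | { type: 'runway-preset'; presetId: string }
  | { type: 'custom'; avatarId: string };

// Canonical 8-4-4-4-12 hexadecimal UUID layout, any version.
const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical helper (assumption, not an SDK function): UUID-shaped ids are
// treated as custom avatars, everything else as a preset slug.
function toAvatarField(id: string): AvatarField {
  return UUID_PATTERN.test(id)
    ? { type: 'custom', avatarId: id }
    : { type: 'runway-preset', presetId: id };
}
```

Routing the choice through one function keeps a UUID from ever reaching `presetId`, which is the failure mode the note describes.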
2 changes: 0 additions & 2 deletions examples/nextjs/app/page.tsx
@@ -5,7 +5,6 @@ import {
   AvatarCall,
   AvatarVideo,
   ControlBar,
-  ScreenShareVideo,
   UserVideo,
 } from '@runwayml/avatars-react';
 import '@runwayml/avatars-react/styles.css';
@@ -215,7 +214,6 @@ export default function Home() {
         }}
       >
         <AvatarVideo />
-        <ScreenShareVideo />
         <UserVideo />
         <ControlBar showScreenShare />
       </AvatarCall>
21 changes: 20 additions & 1 deletion src/hooks/useLocalMedia.ts
@@ -13,6 +13,16 @@ import { useLatest } from './useLatest';

 const NOOP_ASYNC = async () => {};

+function createCaptureController(): unknown {
+  if (typeof window === 'undefined' || !('CaptureController' in window)) {
+    return undefined;
+  }
+  // biome-ignore lint/suspicious/noExplicitAny: CaptureController not yet in TypeScript's lib
+  const controller = new (window as any).CaptureController();
+  controller.setFocusBehavior('no-focus-change');
+  return controller;
+}
+
 /**
  * Hook for local media controls (mic, camera, screen share).
  *
Expand Down Expand Up @@ -62,7 +72,16 @@ export function useLocalMedia(): UseLocalMediaReturn {

   // biome-ignore lint/correctness/useExhaustiveDependencies: refs from useLatest are stable
   const toggleScreenShare = useCallback(() => {
-    localParticipant?.setScreenShareEnabled(!isScreenShareEnabledRef.current);
+    const next = !isScreenShareEnabledRef.current;
+    if (next) {
+      const controller = createCaptureController();
+      localParticipant?.setScreenShareEnabled(true, {
+        controller,
+        surfaceSwitching: 'include',
+      });
+    } else {
+      localParticipant?.setScreenShareEnabled(false);
+    }
   }, [localParticipant]);

   const tracks = useTracks(
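The feature detection in `createCaptureController` could also be written without the `any` cast by declaring only the members the hook relies on. The following is a minimal sketch under stated assumptions: `FocusBehavior`, `CaptureControllerLike`, `CaptureControllerCtor`, and `createCaptureControllerFrom` are hypothetical names, not part of this SDK, and the global scope is passed in as a parameter (a departure from the diff above) so the function can be exercised outside a browser.

```typescript
// Focus behaviors defined for CaptureController.setFocusBehavior().
type FocusBehavior =
  | 'focus-capturing-application'
  | 'focus-captured-surface'
  | 'no-focus-change';

// Hypothetical structural type: only the member this sketch calls.
interface CaptureControllerLike {
  setFocusBehavior(behavior: FocusBehavior): void;
}

type CaptureControllerCtor = new () => CaptureControllerLike;

// Hypothetical variant of createCaptureController. Returns undefined when the
// scope is missing (e.g. server-side rendering) or lacks the constructor.
function createCaptureControllerFrom(
  scope: { CaptureController?: CaptureControllerCtor } | undefined,
): CaptureControllerLike | undefined {
  const Ctor = scope?.CaptureController;
  if (typeof Ctor !== 'function') {
    return undefined;
  }
  const controller = new Ctor();
  // Ask the browser not to move focus to the captured tab or window.
  controller.setFocusBehavior('no-focus-change');
  return controller;
}
```

In a browser the caller would pass the global `window` object, cast to the parameter type if the compiler's DOM lib does not declare `CaptureController`. Returning `undefined` preserves the graceful fallback: the screen share still starts, only without the focus hint.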
7 changes: 4 additions & 3 deletions src/styles.css
@@ -228,10 +228,11 @@

 /* ScreenShareVideo */
 [data-avatar-screen-share] {
-  position: absolute;
-  inset: 0;
+  flex: 1;
+  width: 100%;
+  min-height: 0;
   background: #000;
-  z-index: 1;
+  z-index: 0;
 }

 [data-avatar-screen-share] video {