diff --git a/CHANGELOG.md b/CHANGELOG.md index 804ff77..8175dc5 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,6 +4,27 @@ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/). +## [Unreleased] + +### Added + +- ConvoAI quickstart gating regression cases in `tests/eval-cases.md` — working-baseline detection, no `/join` bypass, and quickstart-skip coverage +- ConvoAI vendor-default coverage in `tests/eval-cases.md` — Python SDK-backed first-success provider combo and default-parameter checks + +### Changed + +- `SKILL.md`, `references/conversational-ai/README.md`: changed documentation lookup to a strict local-reference-first policy so ConvoAI requests consult bundled module references before any Level 2 live-doc fetch +- `SKILL.md`: added stronger direct-routing cues for clearly ConvoAI-specific requests such as agent demos, provider questions, and MLLM requests instead of sending them to intake first +- `references/conversational-ai/README.md`: added working-baseline routing so new-project and unproven integration requests enter a constrained quickstart path before code generation +- `references/conversational-ai/quickstarts.md`: rewritten as a locked quickstart state machine with baseline-path, readiness, and backend-path gates; preserves the existing repo/setup references after the gates resolve +- `references/conversational-ai/quickstarts.md`, `references/conversational-ai/python-sdk.md`, `references/conversational-ai/README.md`: now use the official current provider docs as the source of truth for provider matrices and vendor-specific configs, while keeping the local quickstart focused on the first-success default combo and sample-aligned env names +- `references/conversational-ai/quickstarts.md`, `references/conversational-ai/README.md`: aligned the sequence with the state machine, made the MLLM vs cascading split explicit in the vendor gate, documented 
baseline-path rollback behavior, and clarified that Path B may require a private repo fallback +- `references/conversational-ai/quickstarts.md`: softened the opening quickstart wording for user-facing conversations and added an explicit unsupported-provider prompt instead of implicit discouragement +- `references/conversational-ai/quickstarts.md`, `references/conversational-ai/README.md`: added a Studio Agent ID branch so Agora ConvoAI can reuse agents configured in `https://console.agora.io/studio/agents` instead of rebuilding the provider stack during quickstart +- `references/conversational-ai/conversational-ai-studio.md`: added a dedicated reference for the Agora Studio Agent ID path and clarified that it is different from the runtime `agent_id` returned by `/join` +- `references/conversational-ai/conversational-ai-studio.md`, `references/conversational-ai/quickstarts.md`, `references/conversational-ai/README.md`: documented the confirmed mapping that the Agora Studio Agent ID is passed via the request field `pipeline_id` +- `references/conversational-ai/conversational-ai-studio.md`: expanded the Studio path into a fixed request contract mirroring the preconfigured-agent flow, including field mapping, token separation, and response expectations + ## [1.2.0] ### Added diff --git a/skills/agora/SKILL.md b/skills/agora/SKILL.md index f2a1d1a..a6a4292 100644 --- a/skills/agora/SKILL.md +++ b/skills/agora/SKILL.md @@ -81,18 +81,35 @@ Examples of clear requests: - "RTC Web video call" → `references/rtc/web.md` - "ConvoAI Python" → `references/conversational-ai/README.md` +- "I want to build a demo that talks to an agent" → `references/conversational-ai/README.md` +- "What providers does ConvoAI support?" 
→ `references/conversational-ai/README.md` +- "I want MLLM with Gemini" → `references/conversational-ai/README.md` +- "I already have an Agent ID from Agora Studio" → `references/conversational-ai/README.md` - "Generate RTC token in Go" → `references/server/tokens.md` **Vague or multi-product request:** Route through `intake/SKILL.md`. +Only do this when the product is still genuinely unclear after checking for obvious +ConvoAI / RTC / RTM / Cloud Recording / Server Gateway / token-server cues. Intake handles product identification, combination recommendations, and routing. ## Documentation Lookup -Check bundled references first (Level 1). If they don't cover the detail needed, +Check bundled references first (Level 1). Start with the most relevant local module file +for the user's product and question. If the local reference does not cover the needed detail, fetch `https://docs.agora.io/en/llms.txt`, find the relevant URL, and fetch it (Level 2). See [references/doc-fetching.md](references/doc-fetching.md) for the full procedure, fallback URLs, and freeze-forever decision table. -**Always fetch Level 2 before answering questions about**: TTS/ASR/LLM vendor configs, model names, full request/response schemas, error code listings, or release notes. These change frequently — do not answer from training data or memory. +**Local-first rule:** never skip the relevant local module reference just because live docs exist. +Read the local module first, then fetch Level 2 only if: + +- the local file does not cover the needed detail +- the user asks for the complete latest matrix +- the question is about exact current request/response schemas +- the question is about error code listings or release notes + +For ConvoAI vendor/provider questions, route to `references/conversational-ai/README.md` first. +That module decides whether the bundled ConvoAI references are enough or whether the official +current provider docs must be fetched. 
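The local-first rule above reduces to a small decision function. A minimal sketch, assuming the four trigger bullets are summarized as booleans (the function name and signature are illustrative, not part of the skill):

```python
def pick_doc_level(local_covers_detail: bool,
                   wants_latest_matrix: bool,
                   wants_exact_schema: bool,
                   wants_error_codes_or_release_notes: bool) -> str:
    """Local-first rule: read the bundled module reference first, and
    fall back to a Level 2 live-doc fetch only on a documented trigger."""
    if (wants_latest_matrix or wants_exact_schema
            or wants_error_codes_or_release_notes):
        return "level-2"
    # No forcing trigger: stay local when the bundled file covers the detail.
    return "local" if local_covers_detail else "level-2"
```

Note that the live-doc triggers win even when the local file has partial coverage — the matrix, schema, and error-code cases always escalate.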
**If MCP is unavailable or Level 2 fetch fails**: use the fallback URLs in `doc-fetching.md` to reach the official markdown docs directly. Never fabricate API parameters — always tell the user to verify against official docs if live fetch is unavailable. diff --git a/skills/agora/references/conversational-ai/README.md b/skills/agora/references/conversational-ai/README.md index b4e8201..d03b5a0 100644 --- a/skills/agora/references/conversational-ai/README.md +++ b/skills/agora/references/conversational-ai/README.md @@ -2,16 +2,30 @@ REST API-driven voice AI agents. Create agents that join RTC channels and converse with users via speech. Front-end clients connect via RTC+RTM. -## Start Here: New Projects +## Routing: Classify the Request -**Building a new Conversational AI agent? Clone a quickstart repo — do not build from scratch.** +The key question: does the user already have a **working ConvoAI baseline**? -| Path | Repo | Use when | -|---|---|---| -| **Full-stack Next.js** (default) | [agent-quickstart-nextjs](https://github.com/AgoraIO-Conversational-AI/agent-quickstart-nextjs) | Single repo: Next.js API routes + React UI | -| **Python backend + React frontend** | [conversational-ai-quickstart](https://github.com/AgoraIO-Community/conversational-ai-quickstart) *(private)* | Separate Python server + standalone React client | +- **Working baseline** = an Agora ConvoAI agent has already been started successfully end to end, and the client can join the same RTC channel and interact with it. +- **Not a working baseline** = only RTC code exists, a sample repo is cloned but not proven, env vars are present, or the user only knows the target backend language. -See **[quickstarts.md](quickstarts.md)** for clone steps, env vars, and setup instructions. 
+| Mode | When | Route to | +|---|---|---| +| `quickstart` | Starting from scratch, first demo, wants the official baseline | [quickstarts.md](quickstarts.md) | +| `integration` | Has an app or repo, but the ConvoAI path is not proven end to end yet | [quickstarts.md](quickstarts.md) | +| `backend-implementation` | Working baseline confirmed, now needs server code or lifecycle/auth changes | [server-sdks.md](server-sdks.md), [python-sdk.md](python-sdk.md), [go-sdk.md](go-sdk.md), or [auth-flow.md](auth-flow.md) | +| `client-customization` | Working baseline confirmed, now needs transcripts, hooks, UI, or mobile client work | [agent-toolkit.md](agent-toolkit.md), [agent-client-toolkit-react.md](agent-client-toolkit-react.md), [agent-ui-kit.md](agent-ui-kit.md), [agent-toolkit-ios.md](agent-toolkit-ios.md), [agent-toolkit-android.md](agent-toolkit-android.md) | +| `studio-agent` | The user already has an Agora Studio Agent ID and wants to reuse that Studio-managed agent config | [quickstarts.md](quickstarts.md), then [conversational-ai-studio.md](conversational-ai-studio.md) | +| `advanced-feature` / `debugging` / `ops-hardening` | Working baseline confirmed, wants custom LLM, memory, webhooks, production hardening, or error diagnosis | Start in this file, then route to the relevant reference below | + +### Routing Rules + +- If the user does **not** have a working baseline yet, read only this file and [quickstarts.md](quickstarts.md). +- While quickstart is unresolved, do **not** generate `/join` payloads, propose a custom project structure, or jump straight into SDK code. +- Existing RTC code or a checked-out repo is not enough to skip quickstart; the ConvoAI path must already work once. +- If the user explicitly says the baseline already works, skip quickstart and route directly to the relevant implementation file. 
+- If the user explicitly says they already have an **Agora Studio Agent ID** from `https://console.agora.io/studio/agents`, treat that as a dedicated ConvoAI path rather than re-running the provider-choice flow.
+- If the user needs Java, Ruby, PHP, C#, or another non-SDK backend language, use [auth-flow.md](auth-flow.md) after the quickstart path is chosen.
 
 ## SDK vs. Direct REST API
 
@@ -51,14 +65,21 @@ ASR → LLM → TTS        Receives audio + transcripts
 
 ## Documentation Lookup
 
 The bundled references in this file cover gotchas, generation rules, and the stable
-behavioral contracts. For content that changes with doc updates, use Level 2:
+behavioral contracts. Read the relevant local ConvoAI reference first, then use Level 2 only
+if the local file does not cover the detail needed.
+
+For vendor/provider questions, use the official current provider docs as the source of truth
+once the question moves beyond the default quickstart combo. The bundled quickstart references
+are still the right source for the first-success default path, but the current provider matrix,
+vendor availability, beta status, and vendor-specific configs should come from live docs.
+
+For content that still needs live docs after the local check, use Level 2:
 
 1. Fetch `https://docs.agora.io/en/llms.txt`
 2. Scan for a URL matching your topic (e.g., `conversational-ai`, `quick-start`, `rest-api`)
 3. Fetch that URL
 
-Common topics to fetch via Level 2: quick-start code (Python, Go, Java), TTS/ASR/LLM
-vendor configs, error code listings.
+Common topics to fetch via Level 2 after the local reference check: quick-start code (Python, Go, Java), provider matrices, vendor-specific configs, full request/response schemas, error code listings.
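Step 2 of the Level 2 procedure — scanning the fetched `llms.txt` for a topic URL — can be sketched as follows (hypothetical helper; assumes the llms.txt links use absolute `docs.agora.io` URLs):

```python
import re

def find_topic_url(llms_txt, topic):
    """Return the first docs.agora.io URL in the fetched llms.txt
    content whose path mentions the topic, or None if nothing matches."""
    # Stop each URL at whitespace or a markdown/HTML link terminator.
    for url in re.findall(r"https://docs\.agora\.io/[^\s)\"'>]+", llms_txt):
        if topic in url:
            return url
    return None
```

The returned URL is then fetched directly as step 3; a `None` result means falling back to the `doc-fetching.md` fallback URLs.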
For full request/response schemas, fetch the OpenAPI spec directly — it is always current and covers every endpoint and field: @@ -135,6 +156,7 @@ Things the official docs don't emphasize that cause frequent mistakes: - **Use token auth as the default for new direct REST integrations.** The ConvoAI REST API accepts `Authorization: agora token=` using a combined RTC + RTM token from `RtcTokenBuilder.buildTokenWithRtm`. This is **safer than Basic Auth**: tokens are scoped to a single App ID + channel, while Customer ID/Secret grants access to every project on the account. Use Basic Auth only when a user explicitly needs that mode. - **POST `/join` success does not mean the agent is already in the RTC channel** — the request was accepted and the agent is starting. The client should wait for the RTC `user-joined` event before expecting agent audio or querying media state. - **`/update` overwrites `params` entirely** — sending `{ "llm": { "params": { "max_tokens": 2048 } } }` erases `model` and everything else in `params`. Always send the full object. +- **Agora Studio Agent ID is not the same thing as the runtime `agent_id` returned by `/join`** — the Studio Agent ID comes from the Studio Agents page and identifies a Studio-managed agent configuration. In the Studio-managed start path, that value maps to the request field `pipeline_id`. The runtime `agent_id` identifies a started live session returned by the REST API. Do not use one in place of the other. - **`/speak` priority enum** — `"INTERRUPT"` (immediate, default), `"APPEND"` (queued after current speech), `"IGNORE"` (skip if agent is busy). `interruptable: false` prevents users from cutting in. - **20 PCU default limit** — max 20 concurrent agents per App ID. Exceeding returns error on `/join`. Contact Agora support to increase. - **Event notifications require two flags** — `advanced_features.enable_rtm: true` AND `parameters.data_channel: "rtm"` in the join config. 
Without both, `onAgentStateChanged`/`onAgentMetrics`/`onAgentError` won't fire. Additionally: `parameters.enable_metrics: true` for metrics, `parameters.enable_error_message: true` for errors.
 
@@ -164,10 +186,12 @@ Use the file that matches what the user is building:
 
 | User's question / task | Read this file |
 |---|---|
-| Starting a new project — which repo to clone, setup, env vars | [quickstarts.md](quickstarts.md) |
+| No working ConvoAI baseline yet — choose the baseline path, setup order, and readiness gates | [quickstarts.md](quickstarts.md) |
 | Node.js/Python/Go backend — starting agent, auth, session lifecycle | [server-sdks.md](server-sdks.md) |
 | Python SDK specifics (async, deprecations, debug) | [python-sdk.md](python-sdk.md) |
 | Go SDK specifics (context, builder, status constants) | [go-sdk.md](go-sdk.md) |
+| Supported vendors and current vendor-specific configs | Fetch the official ConvoAI provider docs after reading this file |
+| Existing Agora Studio Agent ID from `console.agora.io/studio/agents` | [conversational-ai-studio.md](conversational-ai-studio.md) |
 | Auth flow, token types, direct REST API (non-SDK languages) | [auth-flow.md](auth-flow.md) |
 | Full working demo app architecture, profiles, MLLM/Gemini | [agent-samples.md](agent-samples.md) |
 | Web/React client: transcripts, agent state, sendText, interrupt | [agent-toolkit.md](agent-toolkit.md) |
diff --git a/skills/agora/references/conversational-ai/conversational-ai-studio.md b/skills/agora/references/conversational-ai/conversational-ai-studio.md
new file mode 100644
index 0000000..67ef31c
--- /dev/null
+++ b/skills/agora/references/conversational-ai/conversational-ai-studio.md
@@ -0,0 +1,170 @@
+# ConvoAI Studio Agent ID
+
+Use this file when the user already has an **Agora Studio Agent ID** from:
+
+`https://console.agora.io/studio/agents`
+
+This is Agora's analogue of a preconfigured-agent path: the Studio-managed agent
+configuration already exists, so quickstart should avoid rebuilding the provider stack
+from scratch unless the user explicitly asks to replace it.
+
+## What It Is
+
+- **Studio Agent ID**: identifies an agent configuration created or managed in Agora Studio.
+- **Runtime `agent_id`**: identifies a started live agent session, returned by the ConvoAI REST API.
+- **Request field mapping**: when reusing an Agora Studio-managed agent in the start flow, pass the Studio Agent ID via the request field `pipeline_id`.
+
+These are **not** interchangeable.
+
+## When to Use This Path
+
+Use the Studio Agent ID path when:
+
+- the user explicitly says they already have an Agent ID from the Studio Agents page
+- the user wants to reuse an agent configured in Studio
+- the user does not want to re-enter STT / LLM / TTS provider details during quickstart
+
+Do **not** use this path when:
+
+- the user only has the runtime `agent_id` returned by `/join`
+- the user still needs to choose or build the provider stack from scratch
+
+## Quickstart Rules
+
+If the Studio Agent ID path is chosen:
+
+1. Treat Agora Studio as the source of truth for the agent configuration.
+2. Do not re-ask provider-vendor questions unless the user explicitly wants to replace the Studio-managed config.
+3. Keep the client and auth path aligned with the chosen quickstart baseline (`full-stack-nextjs`, `separate-backend-frontend`, or `existing-app-integration`).
+4. Use the Studio Agent ID as `pipeline_id` in the request body.
+5. Before generating exact request code, still verify the current official ConvoAI docs for any other request-shape changes. Do not fabricate undocumented fields beyond the confirmed `pipeline_id` mapping.
+
+## Goal
+
+Reuse the official sample repo as the structural baseline, but replace the default provider-selection path with the user's existing Studio-managed agent path.
+
+This flow is for implementation after quickstart confirms the user already has a Studio Agent ID.
Do not send the user back to provider selection unless they explicitly want to replace the Studio-managed config. + +## User-Facing Guidance + +Suggested explanation: + +```text +If you already configured the agent in Agora Studio, we can treat Studio as the source of truth for the agent configuration and avoid rebuilding the provider stack from scratch here. + +Open `https://console.agora.io/studio/agents`, find the agent you want to reuse, and copy its Agent ID. +``` + +## Request Shape Rule + +For this Studio path, use the same request-field convention as the parallel preconfigured-agent flow: + +- copy the **Agent ID** from `https://console.agora.io/studio/agents` +- pass that value as `pipeline_id` in the request body + +In other words: + +- **Studio UI name**: `Agent ID` +- **Request field name**: `pipeline_id` + +## Current Request Shape + +Current fixed request shape for the Studio Agent ID path: + +```text +POST https://api.agora.io/api/conversational-ai-agent/v2/projects/{AGORA_APP_ID}/join +Authorization: agora token={RTC_HEADER_TOKEN} +Content-Type: application/json + +{ + "name": "{channel}", + "pipeline_id": "{AGORA_STUDIO_AGENT_ID}", + "properties": { + "agent_rtc_uid": "{agent_rtc_uid}", + "channel": "{channel}", + "remote_rtc_uids": ["*"], + "token": "{RTC_AGENT_TOKEN}" + } +} +``` + +Field mapping rules: + +| Request field | Source | +|---|---| +| URL project segment | Existing `AGORA_APP_ID` | +| `Authorization` header token | RTC token generated with the caller/user UID using the sample's existing token-generation path | +| `name` | Same value as `channel` | +| `pipeline_id` | `AGORA_STUDIO_AGENT_ID` copied from `https://console.agora.io/studio/agents` | +| `properties.agent_rtc_uid` | Runtime RTC UID string | +| `properties.channel` | Runtime channel value | +| `properties.remote_rtc_uids` | `["*"]` unless the user asks for specific UIDs | +| `properties.token` | Separate RTC token generated with the agent UID using the sample's existing 
token-generation path | + +Do not reintroduce the old provider-based `llm`, `tts`, or `asr` request blocks in this Studio-managed path. +Do not reuse one RTC token for both the `Authorization` header and `properties.token`. + +## Expected Success Response + +Code generated for this flow should expect and preserve the standard live-agent response fields: + +- `agent_id` +- `create_ts` +- `status` + +## Minimum Contract + +For the Studio-managed path, the skill may assume: + +- the Studio Agent ID value is supplied by the user +- that value maps to `pipeline_id` +- the runtime `agent_id` is still returned by the live start/join flow and must not be confused with the Studio Agent ID + +## Env and Config Rules + +- Prefer `AGORA_STUDIO_AGENT_ID` as the config key / placeholder name in code or env templates. +- Reuse the sample's existing `AGORA_APP_ID` and token-generation path. +- Remove or bypass provider-only config when it is only used for the old three-stage selection flow. +- Keep the sample's existing config style; do not invent a second config-loading layer just for `AGORA_STUDIO_AGENT_ID`. +- Treat `AGORA_STUDIO_AGENT_ID` as a user-filled config value. Add the placeholder, but do not ask the user to paste the live value into the conversation after it has already been identified. + +## Implementation Guardrails + +- Do not confuse the Studio Agent ID with the runtime `agent_id`. +- Do not replace `pipeline_id` with `agent_id` in request generation. +- Do not hardcode the `Authorization` token or the RTC `properties.token`; both must come from the runtime token path. +- Before code generation, fetch the current official ConvoAI docs / OpenAPI and verify: + - any other required body fields around `pipeline_id` + - whether Studio-managed agents require additional prerequisites or restrictions + - the current response shape + +## Implementation Workflow + +1. 
Keep the chosen quickstart baseline (`full-stack-nextjs`, `separate-backend-frontend`, or `existing-app-integration`) as the structural baseline. +2. Inspect the repo's actual env/config files and the current request path that starts the agent. +3. Replace only the provider-selection-specific request/config path with the fixed Studio Agent ID request shape in this file. +4. Generate the new request code from the fixed shape above, preserving: + - `POST /api/conversational-ai-agent/v2/projects/{appId}/join` + - `Authorization: agora token=...` + - JSON body with `name`, `pipeline_id`, and `properties` +5. Map dynamic fields to runtime/config sources: + - `name` → same value as `channel` + - `pipeline_id` → `AGORA_STUDIO_AGENT_ID` + - URL project segment → `AGORA_APP_ID` + - `Authorization` header token → RTC token generated with the caller/user UID + - `properties.token` → separate RTC token generated with the agent UID + - `channel` / `agent_rtc_uid` → runtime values +6. Parse and preserve `agent_id`, `create_ts`, and `status` from the response. +7. Keep the rest of the repo structure and RTC/UI flow as close to the sample as possible. 
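The fixed request shape and field mapping in the workflow above can be sketched as a builder that assembles, but does not send, the `/join` call (the function name is illustrative; tokens and IDs are placeholders supplied by the sample's existing token path):

```python
def build_studio_join_request(app_id, studio_agent_id, channel,
                              agent_rtc_uid, user_token, agent_token):
    """Assemble the Studio-path /join request. The Studio Agent ID travels
    as `pipeline_id`; the header token (caller/user UID) and
    `properties.token` (agent UID) must be two different RTC tokens."""
    if user_token == agent_token:
        raise ValueError("do not reuse one RTC token for both slots")
    url = ("https://api.agora.io/api/conversational-ai-agent/v2/"
           f"projects/{app_id}/join")
    headers = {"Authorization": f"agora token={user_token}",
               "Content-Type": "application/json"}
    body = {
        "name": channel,                  # same value as the channel
        "pipeline_id": studio_agent_id,   # Studio Agent ID, NOT runtime agent_id
        "properties": {
            "agent_rtc_uid": agent_rtc_uid,
            "channel": channel,
            "remote_rtc_uids": ["*"],     # unless specific UIDs are requested
            "token": agent_token,         # separate agent-UID RTC token
        },
    }
    return url, headers, body
```

The response handler should then parse and preserve `agent_id`, `create_ts`, and `status` per step 6.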
+ +## After This Step + +Once the Studio Agent ID is collected: + +- keep quickstart in the selected baseline path +- use the official current ConvoAI docs to verify the exact start flow +- then continue with the appropriate backend/client reference: + - [server-sdks.md](server-sdks.md) + - [python-sdk.md](python-sdk.md) + - [go-sdk.md](go-sdk.md) + - [agent-samples.md](agent-samples.md) diff --git a/skills/agora/references/conversational-ai/python-sdk.md b/skills/agora/references/conversational-ai/python-sdk.md index eed7f72..20e3cd2 100644 --- a/skills/agora/references/conversational-ai/python-sdk.md +++ b/skills/agora/references/conversational-ai/python-sdk.md @@ -8,7 +8,7 @@ description: | license: MIT metadata: author: agora - version: '1.0.0' + version: '1.1.0' --- # ConvoAI Server SDK — Python @@ -71,6 +71,8 @@ async def main(): asyncio.run(main()) ``` +For the first-success default combo, use the quickstart guidance in [quickstarts.md](quickstarts.md). For the current provider matrix and vendor-specific configuration details, use the official live ConvoAI provider docs rather than maintaining a local copy in this SDK usage file. + ## Naming Conventions All method names are snake_case — same API surface as TypeScript but with Python naming: diff --git a/skills/agora/references/conversational-ai/quickstarts.md b/skills/agora/references/conversational-ai/quickstarts.md index 1034814..30401a7 100644 --- a/skills/agora/references/conversational-ai/quickstarts.md +++ b/skills/agora/references/conversational-ai/quickstarts.md @@ -1,43 +1,331 @@ --- name: conversational-ai-quickstarts description: | - Quickstart repos for building Agora Conversational AI agents. Use when the user is starting - a new ConvoAI project and needs a working baseline to clone. Covers two paths: full-stack - Next.js (agent-quickstart-nextjs) and separate Python backend + React frontend - (conversational-ai-quickstart, private repo). 
Always direct users to clone one of these - before building from scratch. + Locked quickstart flow for Agora Conversational AI when the user does not yet have a proven + working baseline. Use for new projects and integrations where the ConvoAI path has not been + proven end to end. Restricts the model to one decision group per turn: baseline path, + project readiness, then backend path only if needed. Do not generate code or custom + architecture before the quickstart gates are resolved. license: MIT metadata: author: agora - version: '1.0.0' + version: '1.1.0' --- -# Conversational AI Quickstarts +# Conversational AI Quickstart -**Always start here when building a new Conversational AI agent.** Clone one of the repos below — do not build from scratch. +Use this file for `quickstart` and `integration` mode from [README.md](README.md). -## Choose Your Path +## Working-Baseline Rule -| I want to... | Use | -|---|---| -| Build a full-stack app in a single repo (Next.js API routes + React UI) | **Path A — agent-quickstart-nextjs** | -| Build a separate Python backend with a standalone React frontend | **Path B — conversational-ai-quickstart** (private) | +A **working ConvoAI baseline** means the developer has already started an Agora ConvoAI agent successfully and the client can join the same RTC channel and interact with it. ---- +The following do **not** count as a working baseline: + +- only RTC code exists +- a sample repo is cloned but the agent has never started successfully +- environment variables are present but unverified +- the user only knows the desired backend language or framework + +If the user already has a working baseline, exit this file and route back through [README.md](README.md). + +## Sequence + +Follow this exact user-visible order: + +1. Product intro in plain language +2. Baseline-path confirmation +3. Project-readiness checkpoint +4. Vendor-path confirmation +5. 
Vendor selection, only if the user asks for the current provider list or chooses a non-default path +6. Studio Agent ID confirmation, only if the user wants to reuse an agent configured in Agora Studio +7. Backend-path confirmation, only if a separate backend or existing-repo integration still needs it +8. Structured quickstart spec + +## Interaction Rules + +- One decision group per turn. Do not ask baseline, credentials, and backend path in the same reply. +- Skip anything the user already answered. +- Infer obvious context from the user's stack or repository description. +- Mirror the user's language. +- While quickstart is unresolved, do **not** generate `/join` payloads, SDK code, custom file structures, clone commands, or repo adaptation plans. +- While quickstart is unresolved, read only this file and [README.md](README.md). +- Existing-app requests stay in quickstart until the ConvoAI path is proven once. +- Unless the user explicitly asks for a different provider stack or MLLM path, anchor on the Python SDK's documented first-success cascading combo first. +- If `baseline_path=full-stack-nextjs`, keep the official sample's env var names. Do **not** rename them to generic provider-reference placeholders during quickstart. +- For non-default provider selection, fetch the official current provider docs before confirming support or generating config details. +- If the user already has an **Agora Studio Agent ID** from `https://console.agora.io/studio/agents`, treat that as a separate quickstart branch. Do not re-ask STT/LLM/TTS provider choices unless the user explicitly wants to replace the Studio-managed config. 
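The one-decision-group-per-turn gating in the rules above can be sketched as an ordered gate list (illustrative only; the optional `vendor_selection` and `studio_agent_id` branches are omitted for brevity):

```python
# Required gates in user-visible order; code generation stays blocked
# until every gate has resolved.
GATE_ORDER = ["intro", "baseline_path", "project_readiness",
              "vendor_defaults", "backend_path", "complete"]

def may_generate_code(state):
    """Only the terminal state unlocks /join payloads and SDK code."""
    return state == "complete"

def advance(state):
    """Resolve the current gate and move to the next one."""
    i = GATE_ORDER.index(state)
    return GATE_ORDER[min(i + 1, len(GATE_ORDER) - 1)]
```

A request for code while `state != "complete"` is answered with the current gate's prompt instead, matching the failure branches later in this file.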
+ +## First-Success Vendor Defaults + +Use the current official Python SDK examples as the default provider policy for quickstart: + +- **STT default:** `DeepgramSTT(api_key=..., language="en-US")` +- **LLM default:** `OpenAI(api_key=..., model="gpt-4o-mini")` +- **TTS default:** `ElevenLabsTTS(key=..., model_id="eleven_flash_v2_5", voice_id=..., sample_rate=24000)` + +Documented provider families visible in the current Python SDK docs: + +- **STT:** Deepgram +- **LLM:** OpenAI +- **TTS:** ElevenLabs, Microsoft +- **MLLM:** OpenAI Realtime, Google Gemini Live + +Use this rule during quickstart: + +- For the first end-to-end success path, prefer **Deepgram + OpenAI + ElevenLabs**. +- Only switch away from the default combo during quickstart if the user explicitly names another provider path or explicitly asks for MLLM. +- For the current provider matrix or vendor-specific configs, fetch the official live docs before claiming support or listing parameters. + +## Env Name Policy + +Use different rules depending on whether the user is staying sample-aligned or generating custom code. + +### Sample-aligned path (`full-stack-nextjs`) + +Keep the official sample's env names as the source of truth: + +```bash +AGORA_APP_ID= +AGORA_APP_CERTIFICATE= +LLM_URL= +LLM_API_KEY= +DEEPGRAM_API_KEY= +ELEVENLABS_API_KEY= +ELEVENLABS_VOICE_ID= +``` + +The provider defaults still apply, but they should map onto the sample's existing config shape rather than inventing `OPENAI_*` or `ELEVENLABS_MODEL_ID` variables during quickstart. + +### Custom-code path + +If the user is no longer sample-aligned and needs provider-specific config layout, fetch the current official ConvoAI provider docs and use those as the source of truth. 
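The sample-aligned env policy above lends itself to a simple readiness check (hypothetical helper, mirroring the env names listed for the `full-stack-nextjs` sample):

```python
# Sample-aligned env names from the full-stack-nextjs quickstart.
SAMPLE_ENV_KEYS = (
    "AGORA_APP_ID", "AGORA_APP_CERTIFICATE",
    "LLM_URL", "LLM_API_KEY",
    "DEEPGRAM_API_KEY",
    "ELEVENLABS_API_KEY", "ELEVENLABS_VOICE_ID",
)

def missing_env(env):
    """Return the sample-aligned keys that are absent or empty,
    e.g. before the first end-to-end run of the quickstart."""
    return [k for k in SAMPLE_ENV_KEYS if not env.get(k)]
```

An empty result is a necessary (not sufficient) readiness signal — the values themselves are still unverified until the agent has started once.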
+ +## Baseline Paths + +| Path | Use when | After quickstart completes | +|---|---|---| +| `full-stack-nextjs` | Best default for a new project or prototype | Continue with Path A in this file, then use [agent-toolkit.md](agent-toolkit.md) or [server-sdks.md](server-sdks.md) for customization | +| `separate-backend-frontend` | The user explicitly wants a separate server and client | Use Path B if Python fits, or combine [agent-samples.md](agent-samples.md) with the backend file chosen later | +| `existing-app-integration` | The user already has an app or repo, but ConvoAI is not working yet | Keep the implementation sample-aligned; use [agent-samples.md](agent-samples.md) plus the backend file chosen later | + +## State Machine + +The quickstart is a blocking state machine. While a state is unresolved, the only allowed action is to send the next prompt for that state and wait for the user's reply. + +| State | Allowed | Forbidden | Next prompt | Advance when | +|---|---|---|---|---| +| `intro` | Give a short plain-language intro to what ConvoAI is | Code, repo plans, framework recommendations | Product intro text | Intro delivered | +| `baseline_path` | Ask which baseline path to use | Code, clone steps, provider discussions | Baseline-path prompt | User picks A/B/C or gives equivalent clear context | +| `project_readiness` | Ask about App ID, App Certificate, and ConvoAI activation | Code, repo inspection, backend implementation | Readiness prompt | User confirms ready or asks where to find them | +| `vendor_defaults` | Ask whether to use the default combo, show the current official provider list, choose a non-default cascading / MLLM path, or reuse a Studio Agent ID | Code, implementation | Vendor-defaults prompt | User picks A/B/C/D or directly names a provider path / Studio Agent ID path | +| `vendor_selection` | Collect only provider-mode and provider choices after checking the official current provider docs | Code, implementation, secret collection | 
Custom-provider prompt | Provider mode and provider names are resolved | +| `studio_agent_id` | Collect the Agora Studio Agent ID and confirm the user wants Studio to remain the source of truth for agent config | Code, re-asking provider setup from scratch | Studio-Agent-ID prompt | The Studio Agent ID path is resolved | +| `backend_path` | Ask for backend path only if still needed | Code, detailed implementation | Backend-path prompt | Backend path is clear or no longer needed | +| `complete` | Emit structured spec and continue to the mapped reference file | Re-open resolved gates | None | Spec emitted | + +### Pre-Action Self-Check + +Before every tool call or user-visible reply: + +1. What is the current state? +2. Is the intended action allowed in that state? +3. If not, send the state prompt instead. + +### Failure Branches + +- If the user says they cloned a repo but never got an agent running, stay in quickstart. +- If the user asks for code before quickstart resolves, answer with the next gate instead of generating code. +- If a reply only partially resolves the current gate, ask a narrow follow-up for the missing field only. +- If the user names a provider that is not in the current official provider docs, say this clearly: it is **not currently documented as supported in the official Agora ConvoAI provider docs**, so do not proceed as if it is supported. Offer the documented default combo or a live-doc verification path. +- If the user asks to see the provider list, fetch the current official provider docs and stay in the vendor gate until they accept the default combo or choose a documented alternative. +- If the user says they already have an Agora Studio Agent ID, switch to the `studio_agent_id` state and stop re-asking provider-vendor questions unless they explicitly say they want to replace the Studio-managed config. 
+- If the user changes the baseline-path assumption later (for example, picks Path A first and later insists on a separate Python backend), return to `baseline_path` and re-confirm instead of silently drifting paths. +- If the user chooses Path B but does not have access to the private repo, keep the quickstart state intact and continue with the public `agent-samples` fallback only after stating that the private baseline is unavailable. + +## Prompt Templates + +### Product Intro + +Keep it short. Explain that ConvoAI is a server-managed voice agent that joins an RTC channel, speaks through TTS, and usually pairs an RTC client with a backend that starts the agent. + +Use a natural transition into quickstart. Preferred tone: + +- Avoid saying "run the baseline flow" or "anchor on a proven baseline" to the user. +- Prefer "let's first use the official sample to get the whole link working once" language. + +Suggested transition line: + +```text +Before we jump into custom code, let's first use the official sample to get the whole flow working once. Once the agent can join the channel and finish one real conversation, we can turn that working version into your demo. +``` + +### Baseline Path + +```text +Before we customize anything, let's first use an official sample to get the full ConvoAI flow working once: +A. Use the official full-stack Next.js quickstart +B. Use a separate backend + frontend baseline (private Python repo if you have access; otherwise we will fall back to the public decomposed sample) +C. Adapt an existing app/repo, but keep it sample-aligned until ConvoAI works end to end +``` + +### Project Readiness + +```text +Before we continue, confirm these prerequisites: +- App ID: your Agora project identifier +- App Certificate: required for production-safe token generation +- ConvoAI activation: the project must have Conversational AI enabled + +A. All ready +B. 
Not yet — tell me where to get them +``` + +### Backend Path + +Use only if the baseline path still leaves the backend unclear. + +```text +Which backend path should we optimize for after the baseline is chosen? +A. TypeScript / Node.js +B. Python +C. Go +D. Another backend language / direct REST +``` + +### Vendor Defaults + +Use this after readiness unless the user has already made the provider choice obvious. + +```text +For the fastest first successful run, stay on the Python SDK's documented default combo: +- STT: Deepgram (`language="en-US"`) +- LLM: OpenAI (`model="gpt-4o-mini"`) +- TTS: ElevenLabs (`model_id="eleven_flash_v2_5"`, `sample_rate=24000`, plus a `voice_id`) + +Other provider paths explicitly shown in the current Python SDK docs: +- TTS: Microsoft +- MLLM: OpenAI Realtime, Google Gemini Live + +A. Use the default combo +B. Show me the current official provider list first +C. I want to choose a non-default cascading or MLLM path +D. I already have an Agora Studio Agent ID and want to reuse that Studio-managed agent +``` + +### Custom Provider Prompt + +Use only after the user picks `C` or directly asks for non-default providers. + +```text +First check the current official ConvoAI provider docs, then choose from the documented provider modes: +- Cascading path: STT + LLM + TTS +- MLLM path: OpenAI Realtime or Google Gemini Live + +Then choose the documented providers for that mode using the current official docs as the source of truth. + +Reply in one line, for example: +- `TTS: Microsoft` +- `MLLM: OpenAI Realtime` +- `STT: Deepgram, LLM: OpenAI, TTS: Microsoft` +``` + +### Studio Agent ID Prompt + +Use only when the user picks `D` or directly says they already have an Agora Studio Agent ID. + +```text +If you already configured the agent in Agora Studio, we can treat Studio as the source of truth for the agent configuration instead of rebuilding the provider stack here. 
+ +Open `https://console.agora.io/studio/agents`, find the agent you want to reuse, and copy its **Agent ID**. + +Important: +- This **Studio Agent ID** is different from the runtime `agent_id` returned by `/join`. +- The Studio Agent ID identifies the Studio-managed agent configuration and maps to the request field `pipeline_id`. +- The runtime `agent_id` identifies a live started session. + +Reply with one of these: +A. I have the Studio Agent ID — here it is: `` +B. I need to look it up in Studio first +C. Go back — I want to use the default/provider path instead +``` + +### Unsupported Provider Prompt + +Use this when the user names a provider that is not in the current official provider docs. + +```text +That provider is not in the current official Agora ConvoAI provider docs, so I should not proceed as if it is supported. + +You can choose one of these paths: +A. Use the documented default combo to get the first demo working +B. Show the current official provider list first +C. Re-check the latest official docs to verify whether that provider is supported now +``` + +## Output: Structured Quickstart Spec + +After all gates are resolved, normalize the result into a short spec and continue automatically. 
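The blocking-gate discipline described above can be sketched as a minimal guard. This is an illustrative sketch only: the state names mirror the state-machine table, but the function names and action labels are hypothetical and not part of any skill file.

```python
# Illustrative sketch of the quickstart state machine's pre-action self-check.
# State names mirror the table above; function names and action labels are
# hypothetical, not part of the skill files.

# Actions forbidden while each state is still unresolved.
FORBIDDEN = {
    "intro": {"code", "repo_plan", "framework_recommendation"},
    "baseline_path": {"code", "clone_steps", "provider_discussion"},
    "project_readiness": {"code", "repo_inspection", "backend_implementation"},
    "vendor_defaults": {"code", "implementation"},
    "vendor_selection": {"code", "implementation", "secret_collection"},
    "studio_agent_id": {"code", "provider_re_ask"},
    "backend_path": {"code", "detailed_implementation"},
    "complete": set(),  # all gates resolved; spec emitted, normal work proceeds
}

def allowed(state: str, action: str) -> bool:
    """Pre-action self-check: is this action permitted in the current state?"""
    return action not in FORBIDDEN[state]

def next_step(state: str, requested: str) -> str:
    """Forbidden requests fall back to re-sending the current state's prompt."""
    return requested if allowed(state, requested) else f"send_{state}_prompt"
```

For example, a request for code while `baseline_path` is unresolved falls back to `send_baseline_path_prompt`; only in `complete` does a code request pass through unchanged.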
+ +```yaml +use_case: [text] +mode: [quickstart | integration] +proven_working_baseline: no +baseline_path: [full-stack-nextjs | separate-backend-frontend | existing-app-integration] +frontend: [nextjs-react | standalone-react | existing-app | unknown] +backend: [typescript-node | python | go | direct-rest | unknown] +project_readiness: + app_id: [ready | missing | unknown] + app_certificate: [ready | missing | unknown] + convoai_activation: [ready | missing | unknown] +providers: + pipeline: [cascading | mllm | unknown] + stt: [deepgram | user-specified-supported | unknown] + llm: [openai | user-specified-supported | unknown] + tts: [elevenlabs | microsoft | user-specified-supported | unknown] + mode: [cascading-default | user-specified-cascading | mllm | unknown] +studio_agent: + use_existing_agent_id: [yes | no | unknown] + agent_id: [text | missing | unknown] +config_style: [sample-aligned | custom-path | unknown] +``` + +Notes: + +- `stt` is the SDK-facing name in this quickstart spec. Platform docs may call the same stage `ASR`. +- `studio_agent.agent_id` means the **Agora Studio Agent ID** from `https://console.agora.io/studio/agents`, not the runtime `agent_id` returned by `/join`. +- When this Studio path is used, that Studio Agent ID maps to the request field `pipeline_id`. +- `AGORA_STUDIO_AGENT_ID` is the preferred config placeholder name for this path; the request field remains `pipeline_id`. +- `config_style` is derived from the chosen baseline path unless the user explicitly overrides it later: + `full-stack-nextjs` and `existing-app-integration` usually imply `sample-aligned`; + fully custom provider configuration usually implies `custom-path`. + +## After Collection + +Route according to the completed spec: + +- `full-stack-nextjs` → stay in Path A below. Use [server-sdks.md](server-sdks.md) only if the user later customizes the server-side API routes. 
+- `separate-backend-frontend` + `python` → use Path B below, then [python-sdk.md](python-sdk.md) for backend details. +- `separate-backend-frontend` + `typescript-node` → use [agent-samples.md](agent-samples.md) for the decomposed app shape, then [server-sdks.md](server-sdks.md). +- `separate-backend-frontend` + `go` → use [agent-samples.md](agent-samples.md) for client structure, then [go-sdk.md](go-sdk.md). +- existing Agora Studio Agent ID → use [conversational-ai-studio.md](conversational-ai-studio.md). +- provider selection or parameter confirmation → fetch the current official ConvoAI provider docs. +- `direct-rest` → use [auth-flow.md](auth-flow.md). +- `existing-app-integration` → keep changes sample-aligned with [agent-samples.md](agent-samples.md) until the first successful end-to-end ConvoAI session. ## Path A — Full-Stack Next.js (Default) **Repo:** -Single Next.js application covering everything: token generation, agent lifecycle API routes, and the React UI. Best starting point for most projects. +Single Next.js application covering token generation, agent lifecycle API routes, and the React UI. This is the best starting point for most new projects. -> **Note:** Agora SDKs are browser-only. Because this is a Next.js app, follow the SSR patterns in **[references/rtc/nextjs.md](../rtc/nextjs.md)** if you add custom RTC components — `next/dynamic` with `ssr: false` requires extra steps in Next.js 14+ Server Components. +> **Note:** Agora SDKs are browser-only. If you add custom RTC components, follow the SSR patterns in [../rtc/nextjs.md](../rtc/nextjs.md). -> **agent-quickstart-nextjs vs. agent-samples**: `agent-quickstart-nextjs` is a single self-contained Next.js app with API routes (no separate server process). `agent-samples` is a multi-repo monorepo with a separate Python Flask backend and Next.js React clients — use it if you need a Python server or want to study a more decomposed architecture. See [agent-samples.md](agent-samples.md). 
+> **agent-quickstart-nextjs vs. agent-samples**: `agent-quickstart-nextjs` is a single self-contained Next.js app with API routes. `agent-samples` is a decomposed baseline with a separate backend and client apps. Use [agent-samples.md](agent-samples.md) only when the quickstart spec requires that structure. ### What's Included -- Next.js API routes for token generation (`/api/generate-agora-token`), starting (`/api/invite-agent`), and stopping (`/api/stop-conversation`) agents +- Next.js API routes for token generation (`/api/generate-agora-token`), agent start (`/api/invite-agent`), and agent stop (`/api/stop-conversation`) - React UI with live transcription, audio visualization, device selection, and mobile-responsive chat - `agora-agent-uikit`, `agora-agent-client-toolkit`, and `agora-agent-server-sdk` pre-wired - Dual RTC + RTM token auth @@ -50,7 +338,7 @@ Single Next.js application covering everything: token generation, agent lifecycl - **Real-time:** Agora RTC + RTM - **ASR:** Deepgram - **TTS:** ElevenLabs -- **LLM:** OpenAI-compatible endpoint (OpenAI, Anthropic, etc.) +- **LLM:** OpenAI-compatible endpoint (OpenAI, Anthropic, and similar providers) ### Setup @@ -87,30 +375,29 @@ ELEVENLABS_VOICE_ID= ``` > The App Certificate is required for token generation. Get both from [Agora Console](https://console.agora.io). - ---- +> When staying on this sample-aligned path, do not rename these env vars to a different custom provider env scheme during quickstart. ## Path B — Python Backend + React Frontend (Private Repo) **Repo:** *(private — contact your Agora developer relations or solutions engineer contact to request access)* -Use this when you need a separate Python backend and a standalone React frontend deployed independently. +Use this when you specifically need a separate Python backend and a standalone React frontend deployed independently. 
- Python backend handles token generation and agent lifecycle via the ConvoAI REST API - React frontend connects via RTC + RTM - Refer to the repo README for setup once you have access ---- +If access to the private repo is unavailable, keep the quickstart spec and fall back to [agent-samples.md](agent-samples.md) for the public decomposed baseline, then use [python-sdk.md](python-sdk.md) for backend behavior. -## After Cloning +## After the Baseline Works -Once the baseline is running (applies to both paths — Path B users should substitute their Python backend's equivalent for any server-side steps): +Once the first end-to-end ConvoAI session works, route by task: | Next step | Reference | |---|---| -| Customize LLM, TTS, ASR vendor/model | Fetch `https://docs-md.agora.io/en/conversational-ai/develop/custom-llm.md` | -| Add transcript rendering / agent state to a custom UI | [agent-toolkit.md](agent-toolkit.md) | -| Use React hooks (useTranscript, useAgentState) | [agent-client-toolkit-react.md](agent-client-toolkit-react.md) | +| Customize LLM, TTS, ASR vendor or model | Fetch `https://docs-md.agora.io/en/conversational-ai/develop/custom-llm.md` | +| Add transcript rendering or agent state to a custom UI | [agent-toolkit.md](agent-toolkit.md) | +| Use React hooks (`useTranscript`, `useAgentState`) | [agent-client-toolkit-react.md](agent-client-toolkit-react.md) | | Swap in pre-built React UI components | [agent-ui-kit.md](agent-ui-kit.md) | | Add a custom LLM backend (RAG, tool calling) | [server-custom-llm.md](server-custom-llm.md) | | Production token generation | [../server/tokens.md](../server/tokens.md) | diff --git a/tests/eval-cases.md b/tests/eval-cases.md index d1229c6..ae8da08 100644 --- a/tests/eval-cases.md +++ b/tests/eval-cases.md @@ -43,11 +43,11 @@ For each case: - Pass Criteria: References `agora-rtc-react` or `AgoraRTCProvider` - Result: ___ -### R-05: ConvoAI Python +### R-05: ConvoAI Python without a proven baseline - User Input: "ConvoAI 
agent in Python" -- Expected Behavior: Routes to `references/conversational-ai/README.md` -- Pass Criteria: References ConvoAI REST API; does not reference RTC SDK directly as the primary API +- Expected Behavior: Routes to `references/conversational-ai/README.md`, classifies the request as quickstart/integration, and enters `quickstarts.md` +- Pass Criteria: Does not jump straight to `/join` or SDK code; asks for the next quickstart decision or anchors on an official baseline first - Result: ___ ### R-06: Server-side token generation @@ -71,6 +71,34 @@ For each case: - Pass Criteria: Does not confuse Cloud Recording with RTC local recording; references REST API pattern - Result: ___ +### R-09: Working baseline skips quickstart + +- User Input: "My ConvoAI agent already starts successfully; now help me add transcript rendering in React" +- Expected Behavior: Skips `quickstarts.md` and routes to the relevant React client reference +- Pass Criteria: Does not re-ask baseline or readiness questions; references `agent-toolkit.md` or `agent-client-toolkit-react.md` +- Result: ___ + +### R-10: Supported vendor query routes to provider reference + +- User Input: "What providers does Agora ConvoAI support for STT, LLM, and TTS?" 
+- Expected Behavior: Routes to `references/conversational-ai/README.md`, then uses the official current provider docs as the source of truth +- Pass Criteria: Starts from the local ConvoAI module, but uses live docs for the current provider matrix instead of inventing or relying on a stale local copy +- Result: ___ + +### R-11: MLLM request routes to ConvoAI before intake + +- User Input: "I want MLLM with Gemini" +- Expected Behavior: Routes directly to `references/conversational-ai/README.md` +- Pass Criteria: Does not go through `intake/SKILL.md` first when the request is already clearly ConvoAI-specific +- Result: ___ + +### R-12: Studio Agent ID request routes to ConvoAI before intake + +- User Input: "I already have an Agent ID from Agora Studio Agents" +- Expected Behavior: Routes directly to `references/conversational-ai/README.md`, then into the ConvoAI quickstart Studio Agent ID branch +- Pass Criteria: Does not go through `intake/SKILL.md` first when the request is already clearly ConvoAI-specific +- Result: ___ + --- ## 2. 
Code Generation Quality (C-series)

@@ -156,6 +184,18 @@ For each case:

- Expected Behavior: Presents token-based auth as the default; does not default to Basic Auth (Customer ID + Secret) without being asked
- Pass Criteria: `Authorization: agora token=` pattern appears in the primary example; Basic Auth shown as an alternative only

+### C-13: Quickstart vendor defaults come from the Python SDK
+
+- User Input: "I want the fastest way to get ConvoAI working"
+- Expected Behavior: Quickstart anchors on the documented Python SDK first-success combo
+- Pass Criteria: Mentions Deepgram STT with `language="en-US"`, OpenAI LLM with `model="gpt-4o-mini"`, and ElevenLabs TTS with `model_id="eleven_flash_v2_5"` plus `sample_rate=24000`
+
+### C-14: Sample-aligned env names are preserved
+
+- User Input: "Use the official full-stack Next.js quickstart"
+- Expected Behavior: Keeps the official sample env names instead of inventing provider-placeholder env vars
+- Pass Criteria: Uses `LLM_API_KEY` / `LLM_URL` for the sample-aligned path; does not replace them with `OPENAI_API_KEY` / `OPENAI_BASE_URL`
+
---

## 3. Failure Paths (F-series)

@@ -201,6 +241,12 @@ For each case:

- Expected Behavior: Clarifies that Cloud Recording is REST API only — there is no client SDK; describes the acquire/start/stop REST API pattern
- Pass Criteria: Does not fabricate a "Cloud Recording SDK" package or import; routes to `references/cloud-recording/README.md`

+### F-07: No quickstart bypass into `/join` payload generation
+
+- User Input: "Generate the ConvoAI /join payload for my new project"
+- Expected Behavior: Enters the ConvoAI quickstart flow unless a working baseline is already confirmed
+- Pass Criteria: Does not generate a `/join` payload before the baseline path and readiness gates are resolved
+
---

## 4.
Intake Accuracy (I-series) @@ -219,11 +265,11 @@ For each case: - Pass Criteria: Needs analysis includes both products - Result: ___ -### I-03: ConvoAI fast-path with context provided +### I-03: Partial ConvoAI context still stays in quickstart - User Input: "Help me integrate ConvoAI with OpenAI, Python backend, I have my credentials" -- Expected Behavior: Fast-path to ConvoAI skill; skip full intake questions since key details are already provided -- Pass Criteria: Does not ask Q1/Q2/Q3 one by one; routes directly using the provided context +- Expected Behavior: Enters the ConvoAI quickstart flow, skips already-known fields, and asks only the next unresolved quickstart decision +- Pass Criteria: Does not generate code; does not ask a long multi-step interview; asks only for the baseline path or equivalent next gate - Result: ___ ### I-04: Clear RTC request — no intake @@ -233,6 +279,97 @@ For each case: - Pass Criteria: Intake flow is not entered; confirms the routing non-regression for experienced developers - Result: ___ +### I-05: Cloned repo is not a working baseline + +- User Input: "I cloned agent-quickstart-nextjs, but the ConvoAI agent has never connected" +- Expected Behavior: Treats this as `integration`, not a completed baseline +- Pass Criteria: Stays in the ConvoAI quickstart flow; does not skip directly to advanced implementation guidance +- Result: ___ + +### I-06: Working baseline can skip quickstart + +- User Input: "Our ConvoAI baseline already works; help me add useTranscript in React" +- Expected Behavior: Skips quickstart and routes directly to React client references +- Pass Criteria: Does not ask baseline-path or readiness questions; references the client toolkit or React hooks docs +- Result: ___ + +### I-07: Quickstart recaps the default vendor combo + +- User Input: "I want to start a new ConvoAI project with the safest default path" +- Expected Behavior: Quickstart includes the documented default provider combo instead of inventing one 
+- Pass Criteria: Uses the Python SDK-backed default combo; does not invent unsupported vendors or omit the key default parameters
+- Result: ___
+
+### I-08: Vendor-list question answers from the current provider docs
+
+- User Input: "Before we write code, tell me which providers are supported right now"
+- Expected Behavior: Uses the local ConvoAI module first, then answers from the official current provider docs
+- Pass Criteria: Does not invent a local-only provider list when the user is explicitly asking what is supported right now
+- Result: ___
+
+### I-09: Vendor gate uses explicit branching
+
+- User Input: "I have the credentials. What provider path should I take?"
+- Expected Behavior: The vendor step offers a clear default / show-list / choose-custom branch
+- Pass Criteria: The prompt includes A/B/C-style branching for default combo, current official provider list, and non-default provider choice
+- Result: ___
+
+### I-10: Vendor gate distinguishes cascading vs MLLM
+
+- User Input: "I want MLLM with Gemini"
+- Expected Behavior: The vendor-selection step treats this as an MLLM path, not just a non-default TTS/LLM tweak
+- Pass Criteria: The flow records or acknowledges the `mllm` mode explicitly instead of forcing the user back into the cascading default combo
+- Result: ___
+
+### I-11: Path B warns about private repo access
+
+- User Input: "I want a separate backend and frontend baseline"
+- Expected Behavior: The baseline step mentions that the preferred Python repo is private and may fall back to the public decomposed sample
+- Pass Criteria: The prompt does not present Path B as if the repo were guaranteed to be publicly accessible
+- Result: ___
+
+### I-12: Quickstart opening uses natural wording
+
+- User Input: "I want to build a demo that talks to an agent. Help me implement it."
+- Expected Behavior: The quickstart opening explains the "official sample first" idea in natural product language +- Pass Criteria: Does not use stiff phrasing like "run the baseline flow" or "anchor on a proven baseline"; instead says to first run the official sample through once and then customize the demo +- Result: ___ + +### I-13: Unsupported provider is stated explicitly + +- User Input: "I want to use a provider that is not in the current official provider docs" +- Expected Behavior: The quickstart flow states clearly that this provider is not in the current official support list +- Pass Criteria: Explicitly says the provider is not currently documented as supported; does not continue as if it were supported +- Result: ___ + +### I-14: Studio Agent ID path skips provider re-entry + +- User Input: "I already configured my agent in Agora Studio and I have the Agent ID" +- Expected Behavior: Quickstart switches to the Studio Agent ID branch instead of re-asking STT / LLM / TTS provider choices +- Pass Criteria: Explains the Studio Agent ID path, asks for the Agent ID or confirms the user has it, and does not reopen the default-provider prompt +- Result: ___ + +### I-15: Studio Agent ID is distinguished from runtime agent_id + +- User Input: "I have an Agent ID from Studio" +- Expected Behavior: Quickstart clarifies that the Studio Agent ID is not the same as the runtime `agent_id` returned by `/join` +- Pass Criteria: Explicitly distinguishes the Studio Agent ID from the runtime `agent_id` +- Result: ___ + +### I-16: Studio Agent ID maps to pipeline_id + +- User Input: "I already have the Agent ID from Agora Studio" +- Expected Behavior: Quickstart explains that the Studio Agent ID is passed using the request field `pipeline_id` +- Pass Criteria: Explicitly states `Agent ID` from Studio maps to `pipeline_id` in the request body +- Result: ___ + +### I-17: Studio path preserves the fixed request shape + +- User Input: "Use my Agora Studio Agent ID in the start 
request" +- Expected Behavior: The Studio path keeps the fixed request shape with `name`, `pipeline_id`, and `properties` +- Pass Criteria: Does not replace `pipeline_id` with `agent_id`; preserves separate header token and `properties.token` +- Result: ___ + --- ## Evaluation Log