add tool approval UI and E2E test infrastructure by bendrucker · Pull Request #16 · pydantic/ai-chat-ui

bendrucker · 2026-03-15T22:30:17Z

- Upgrades to AI SDK v6
- Adds tool approval UI using the AI SDK Elements Confirmation component
- Adds E2E test coverage with Playwright

This is a large PR, but given the tight coupling I kept it together rather than adding E2E testing first and then upgrading to v6. I can split it for easier review if you prefer. The meaningful diff is ~30 files and ~1100 lines; the rest is lock files and vendored AI Elements/shadcn component upgrades.

Changes

AI SDK Upgrade

Upgrades ai from v5 to v6 and @ai-sdk/react from v2 to v3. Adds radix-ui and shiki as new dependencies. Migrates Chat.tsx to use DefaultChatTransport for request body configuration and sendAutomaticallyWhen: lastAssistantMessageIsCompleteWithApprovalResponses for automatic tool approval flow continuation.
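To make the automatic continuation concrete, here is a minimal sketch of the kind of check a `sendAutomaticallyWhen` predicate performs. It is a simplified stand-in with hypothetical types and a hypothetical function name (`shouldResend`), not the SDK's implementation of lastAssistantMessageIsCompleteWithApprovalResponses; the exact rules the SDK applies may differ.

```typescript
// Hypothetical, simplified message shapes; the real AI SDK types are richer.
type ToolState =
  | "input-available"
  | "approval-requested"
  | "approval-responded"
  | "output-available"
  | "output-error"
  | "output-denied";

interface ToolPart {
  type: "dynamic-tool";
  toolName: string;
  state: ToolState;
}

interface TextPart {
  type: "text";
  text: string;
}

interface Message {
  role: "user" | "assistant";
  parts: Array<ToolPart | TextPart>;
}

// States in which a tool call no longer needs anything from the user.
const SETTLED: ReadonlySet<ToolState> = new Set<ToolState>([
  "approval-responded",
  "output-available",
  "output-error",
  "output-denied",
]);

// Assumed rule: resend only when the last message is from the assistant,
// at least one approval has been answered, and no tool call is still pending.
function shouldResend(messages: Message[]): boolean {
  const last = messages[messages.length - 1];
  if (!last || last.role !== "assistant") return false;
  const tools = last.parts.filter(
    (part): part is ToolPart => part.type === "dynamic-tool",
  );
  const answered = tools.some((tool) => tool.state === "approval-responded");
  return answered && tools.every((tool) => SETTLED.has(tool.state));
}
```

With a predicate like this, the chat resubmits on its own once the user has accepted or denied every pending approval, so no extra "continue" action is needed.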

Tool Approval UI

Replaces custom inline approve/deny buttons in Part.tsx with the AI SDK Elements Confirmation component, scaffolded via npx shadcn@latest add @ai-elements/confirmation. The component uses React context to manage approval state and conditionally renders request, accepted, and rejected states. Adds alert.tsx as a shadcn/ui dependency.

Tool Call Rendering

Updates Part.tsx to properly handle dynamic-tool parts with toolName and custom icons via getToolIcon. Tool cards now auto-expand (defaultOpen) when in approval-requested state. Upgrades the vendored tool.tsx AI Element to include status labels and icons for all tool states including approval-requested, approval-responded, output-denied, and output-error. Adds dialog-based error display for tool errors. Adds data-tool-name attribute to tool cards for robust test selectors.
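The rendering rules above can be sketched as a small pure function. This is an illustration only: the helper names (`toolCardProps`, `toolCardSelector`) and the label strings are hypothetical and not taken from the vendored tool.tsx, and the state list beyond the four states named above is an assumption about the SDK's tool part states.

```typescript
// Tool part states; the last four are named in this PR, the rest are assumed.
type ToolState =
  | "input-streaming"
  | "input-available"
  | "output-available"
  | "approval-requested"
  | "approval-responded"
  | "output-denied"
  | "output-error";

// Hypothetical label text; the real component's wording may differ.
const STATUS_LABELS: Record<ToolState, string> = {
  "input-streaming": "Pending",
  "input-available": "Running",
  "output-available": "Completed",
  "approval-requested": "Awaiting approval",
  "approval-responded": "Responded",
  "output-denied": "Denied",
  "output-error": "Error",
};

interface ToolCardProps {
  "data-tool-name": string;
  defaultOpen: boolean;
  label: string;
}

// Derive the props a tool card would receive for a given tool part.
function toolCardProps(toolName: string, state: ToolState): ToolCardProps {
  return {
    "data-tool-name": toolName,
    // Cards start expanded only while the user still has to decide.
    defaultOpen: state === "approval-requested",
    label: STATUS_LABELS[state],
  };
}

// CSS selector a test could use to find the card for one tool.
function toolCardSelector(toolName: string): string {
  return `[data-tool-name="${toolName}"]`;
}
```

Keying selectors on the tool name rather than on visible text keeps tests stable if status labels or icons change.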

Test Infrastructure

- Adds Playwright for E2E tests and Vitest for headless unit tests
- Creates a deterministic Python test server (tests/server/) using pydantic-ai's FunctionModel
- Adds real LLM models to the test server (haiku, gpt-4.1-nano, gemini-2.0-flash)
- Organizes shared test modules by domain: conversation.ts, sidebar.ts, tools.ts
- Drives the Playwright config from env vars: E2E_TEST_DIR (test directory), E2E_VIDEO (recording)
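As a rough sketch of the env-var-driven config, the function below resolves the two variables into the values a Playwright config would consume. The helper name `e2eOptionsFromEnv`, the `E2EOptions` shape, the default directory, and the rule that any non-empty E2E_VIDEO enables recording are all assumptions for illustration; the actual playwright config in this PR may resolve them differently.

```typescript
// Hypothetical resolved options; a real config would pass these to
// Playwright's `testDir`, `use.video`, and `expect.timeout` settings.
interface E2EOptions {
  testDir: string;
  video: "on" | "off";
  expectTimeoutMs: number;
}

// The environment is passed in explicitly so the function stays pure
// and easy to unit test.
function e2eOptionsFromEnv(
  env: Record<string, string | undefined>,
): E2EOptions {
  return {
    // Assumed default: the deterministic suite that runs on every PR.
    testDir: env["E2E_TEST_DIR"] ?? "tests/e2e/deterministic",
    // Assumed rule: any non-empty value turns recording on.
    video: env["E2E_VIDEO"] ? "on" : "off",
    // One central expect timeout (5s) instead of per-assertion overrides.
    expectTimeoutMs: 5000,
  };
}
```

Selecting the suite through E2E_TEST_DIR lets the same config serve both the deterministic and the LLM tests without separate Playwright projects.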

E2E Test Coverage and CI

Deterministic tests (tests/e2e/deterministic/) run on every PR against a FunctionModel test server with predictable responses. Coverage spans core messaging, model selection, conversation lifecycle (persistence, switching, deletion), tool execution (single, parallel, error recovery), and tool approval flows.

LLM tests (tests/e2e/llm/) verify real streaming from Anthropic, OpenAI, and Google. These run only via workflow_dispatch to avoid API costs, with provider API keys from repository secrets.

References

Replace custom approval buttons with AI SDK Elements Confirmation component. Add Playwright E2E tests covering tool approval accept/deny flows, chat, sidebar, model selection, and tool calls. Upgrade ai SDK from v5 to v6.

…lifecycle

Add three new test suites covering gaps in E2E coverage:
- error-handling: error dialog with details, recovery text
- multi-tool: parallel tool completion, results, final text
- conversation-lifecycle: persistence across reload, switching, active/inactive deletion

Refactor test infrastructure:
- Centralize expect timeout (5s) in playwright config, remove 34 inline overrides
- Add data-tool-name attribute to Tool component for robust selectors
- Organize helpers by domain: conversation.ts, sidebar.ts, tools.ts
- Extract shared locators (toolCard, sidebar, chat) and actions (sendMessage, waitForPersisted)

- Move deterministic tests to tests/e2e/deterministic/
- Add tests/e2e/llm/ with real provider tests (anthropic, openai, google)
- Add LLM models to test server (haiku, gpt-4.1-nano, gemini-2.0-flash)
- Replace Playwright projects with env var config (E2E_TEST_DIR, E2E_VIDEO)
- Remove placeholder e2e-llm workflow, add LLM step to main CI
- Remove committed __pycache__, add to .gitignore
- Document test commands in CLAUDE.md

feat: add tool approval UI and E2E test infrastructure

ab1d9e1


bendrucker changed the title ~~feat: add tool approval UI and E2E test infrastructure~~ add tool approval UI and E2E test infrastructure Mar 15, 2026

This was referenced Mar 16, 2026

Bump chat UI to v2 with SDK v6 protocol pydantic/pydantic-ai#4670 (Draft)

Add AI SDK E2E integration tests pydantic/pydantic-ai#4390 (Closed)

bendrucker force-pushed the e2e-test branch from 7ba8b92 to 80d530f Compare March 16, 2026 01:52

bendrucker marked this pull request as ready for review March 16, 2026 02:22

bendrucker wants to merge 3 commits into pydantic:main from bendrucker:e2e-test
