feat(sdk/python): add Responses API client#670

Open
MaanavD wants to merge 9 commits into main from agents/implement-responses-api-python-sdk

Conversation


MaanavD (Collaborator) commented Apr 23, 2026

Summary

Implements the OpenAI Responses API client for the Foundry Local Python SDK: HTTP-only and synchronous, following the existing `chat_client.py` style.

New files

| File | Description |
| --- | --- |
| `src/openai/responses_types.py` | Full type system: content parts, response items, tools, config, `ResponseObject` (with `output_text` property), all streaming event dataclasses, `parse_streaming_event` factory, `_to_dict` serializer |
| `src/openai/responses_client.py` | HTTP client: `ResponsesClient`, `ResponsesClientSettings`, `ResponsesAPIError`. Methods: `create`, `create_streaming` (SSE generator), `get`, `delete`, `cancel`, `get_input_items`, `list` |
| `examples/responses.py` | 5 end-to-end scenarios: basic create, streaming, multi-turn, tool calling, vision |
| `test/openai/test_responses_client.py` | 56 unit tests with mocked HTTP |
| `test/openai/test_responses_integration.py` | 14 integration tests (gated on `FOUNDRY_INTEGRATION_TESTS=1`) |

Modified files

  • foundry_local_manager.py — create_responses_client(model_id) factory
  • imodel.py / detail/model.py / detail/model_variant.py — factory wired through the model hierarchy
  • src/__init__.py / src/openai/__init__.py — all new public types exported

Test results

  • Unit tests: 56/56 passing (no server needed)
  • Integration tests: 14/14 passing against live qwen2.5-0.5b server

Related

Closes #505 (the earlier C# Responses API PR predates this but covers a different SDK)
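The summary above notes that `ResponseObject` carries an `output_text` convenience property. As a minimal sketch (the class and field names here are simplified assumptions, not the actual `responses_types.py` definitions), such a property typically concatenates the text parts of all output message items:

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical, simplified stand-ins for the dataclasses described above;
# the real responses_types.py defines many more fields and item kinds.

@dataclass
class OutputTextContent:
    text: str
    type: str = "output_text"

@dataclass
class MessageItem:
    content: List[OutputTextContent] = field(default_factory=list)
    type: str = "message"

@dataclass
class ResponseObject:
    output: List[MessageItem] = field(default_factory=list)

    @property
    def output_text(self) -> str:
        """Concatenate every output_text part across all message items."""
        return "".join(
            part.text
            for item in self.output
            if item.type == "message"
            for part in item.content
            if part.type == "output_text"
        )

resp = ResponseObject(
    output=[MessageItem(content=[OutputTextContent("Hello, "), OutputTextContent("world")])]
)
print(resp.output_text)  # Hello, world
```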

Copilot AI review requested due to automatic review settings April 23, 2026 16:45
Copilot AI (Contributor) left a comment


Pull request overview

Adds an OpenAI-compatible Responses API surface to the Foundry Local Python SDK, implemented as an HTTP client against the embedded web service and wired through the manager/model abstractions.

Changes:

  • Introduces ResponsesClient (+ settings + error type) with non-streaming and SSE-streaming support.
  • Adds a full Responses DTO/type system (content parts, items, tools/config, response object helpers, streaming event parsing/serialization).
  • Adds examples and both mocked unit tests and gated live integration tests; wires factory methods through FoundryLocalManager and IModel implementations.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| `sdk/python/src/openai/responses_client.py` | New HTTP client for `/v1/responses`, including SSE parsing helpers. |
| `sdk/python/src/openai/responses_types.py` | New dataclass-based type system + parsers/serializers for the Responses API. |
| `sdk/python/src/openai/__init__.py` | Exports Responses client types from `foundry_local_sdk.openai`. |
| `sdk/python/src/foundry_local_manager.py` | Adds `create_responses_client()` factory bound to the running web service. |
| `sdk/python/src/imodel.py` | Extends `IModel` with the `create_responses_client(base_url)` API. |
| `sdk/python/src/detail/model.py` | Wires `Model.create_responses_client()` through the selected variant. |
| `sdk/python/src/detail/model_variant.py` | Implements `create_responses_client()` for a specific variant. |
| `sdk/python/src/__init__.py` | Exposes the Responses client + public DTOs at the package root. |
| `sdk/python/examples/responses.py` | End-to-end example covering create, streaming, tool calls, vision. |
| `sdk/python/test/openai/test_responses_client.py` | Mocked unit tests for request building, SSE parsing, error handling, serialization. |
| `sdk/python/test/openai/test_responses_integration.py` | Gated live integration tests against a real runtime/web service. |
| `sdk/python/requirements.txt` | Updates pinned native/runtime dependencies and platform markers. |


- Add configurable timeout to ResponsesClientSettings (default 60s);
  non-streaming calls use it directly, streaming uses it as connect
  timeout with unbounded read (suitable for long responses)
- Fix SSE buffer: replace O(n) list-join-per-chunk with a single string
  buffer and split on double-newline; use chunk_size=None for natural
  server chunk boundaries
- Add InputImageContent.__post_init__ to enforce exactly one of
  image_url or image_data (raises ValueError if both or neither)
- Add optional max_size=(w,h) to InputImageContent.from_file and
  from_bytes to resize images before base64-encoding (requires Pillow)
- Raise ValueError for unknown content-part types instead of silently
  returning a fallback InputTextContent
- Document _MAX_ID_LEN=256 with rationale; lower from 1024
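The SSE buffer fix above can be sketched as a single string buffer split on the blank-line event delimiter. This standalone helper (its name and the event payloads are assumptions, not the client's actual code) shows the pattern, including how a partial event is carried over between chunks:

```python
import json
from typing import Iterable, Iterator

def iter_sse_events(chunks: Iterable[str]) -> Iterator[dict]:
    """Accumulate raw text chunks in one string buffer and yield one
    parsed event per double-newline-delimited SSE block."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        # Events are separated by a blank line; keep the trailing
        # partial event in the buffer for the next chunk.
        *events, buffer = buffer.split("\n\n")
        for raw in events:
            data_lines = [line[len("data: "):] for line in raw.splitlines()
                          if line.startswith("data: ")]
            if not data_lines:
                continue
            payload = "\n".join(data_lines)
            if payload == "[DONE]":  # end-of-stream sentinel
                return
            yield json.loads(payload)

# Chunk boundaries need not align with event boundaries:
chunks = ['data: {"delta": "Hel', 'lo"}\n\ndata: {"del', 'ta": "!"}\n\ndata: [DONE]\n\n']
print([e["delta"] for e in iter_sse_events(chunks)])  # ['Hello', '!']
```

Keeping one string buffer avoids the per-chunk list-join cost the commit describes, while the carried-over tail handles chunks that end mid-event.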

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI (Contributor) left a comment


Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.



…urns None, example usage guard

- ResponsesClientSettings.store defaults to None (omitted from request body, server
  decides) — aligns with JS SDK which has store?: boolean
- _MAX_ID_LEN reverted to 1024 to align with JS SDK constant
- _parse_content_part returns None for unknown types (forward-compat, not ValueError);
  _parse_content filters out None entries
- examples/responses.py: guard event.response.usage chain with getattr to avoid
  AttributeError if response or usage is absent
- Tests updated: store default tests, too-long-id threshold (1025), request body assertions

- Rename model and variant Responses client factories to get_responses_client
  to match existing get_*_client naming.
- Use FoundryLocalException for Responses API transport and parsing errors
  instead of exporting a dedicated ResponsesAPIError.
- Keep only the foundry-local-core version bump in requirements.txt and
  restore existing ORT dependency markers/order.

Replace the SDK-native Responses client implementation with a focused Python
sample and integration tests that use FoundryLocalManager for setup/model/server
lifecycle and the official OpenAI Python client for /v1/responses calls.

MaanavD (Collaborator, Author) commented May 1, 2026

Pivoted this PR to the web-service sample/test approach:

  • Removed the SDK-native Python Responses client/types and manager/model Responses APIs from this branch.
  • Added sdk/python/examples/responses_web_service.py, which uses FoundryLocalManager for setup/model/server lifecycle and the official OpenAI Python client for /v1/responses.
  • Added sdk/python/test/openai/test_responses_web_service.py covering non-streaming, streaming, and function/tool-call round trip via the local web service.
  • Updated the Python README examples list.
  • Merged latest origin/main so the PR diff is focused on the sample/test changes.

Local validation: `python -m pytest test\openai\test_responses_web_service.py --no-header -q` => 3 passed (with the existing pytest timeout config warning).
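The pivoted sample drives `/v1/responses` through the official OpenAI Python client. As a stdlib-only illustration of what that endpoint receives, the sketch below posts the same minimal body with `urllib`; the function names, port, and model alias are assumptions rather than the sample's actual code, and the live call is gated the same way this PR gates its integration tests:

```python
import json
import os
import urllib.request

def build_payload(model: str, input_text: str) -> dict:
    """Minimal non-streaming request body for /v1/responses."""
    return {"model": model, "input": input_text}

def create_response(base_url: str, model: str, input_text: str) -> dict:
    """POST the payload to the local web service's /v1/responses endpoint."""
    req = urllib.request.Request(
        f"{base_url}/v1/responses",
        data=json.dumps(build_payload(model, input_text)).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Only hit a live server when explicitly enabled. The base URL is an
# assumption; the real sample takes the endpoint from FoundryLocalManager
# rather than hard-coding it.
if os.environ.get("FOUNDRY_INTEGRATION_TESTS") == "1":
    print(create_response("http://localhost:5273", "qwen2.5-0.5b", "Say hello."))
```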

