Conversation
Implements the OpenAI Responses API client for the Foundry Local Python SDK.

New files:
- src/openai/responses_types.py: full type system (content parts, items, tools, config, ResponseObject with output_text property), all streaming event dataclasses, parse_streaming_event factory, and _to_dict serializer
- src/openai/responses_client.py: HTTP-only sync client (ResponsesClient, ResponsesClientSettings, ResponsesAPIError) with create, create_streaming (SSE generator), get, delete, cancel, get_input_items, list
- examples/responses.py: 5 end-to-end scenarios (basic, streaming, multi-turn, tool calling, vision)
- test/openai/test_responses_client.py: 56 unit tests (mocked HTTP)
- test/openai/test_responses_integration.py: 14 integration tests gated on FOUNDRY_INTEGRATION_TESTS=1

Modified files:
- src/foundry_local_manager.py: create_responses_client factory method
- src/imodel.py: abstract create_responses_client
- src/detail/model.py: delegating create_responses_client
- src/detail/model_variant.py: concrete create_responses_client
- src/openai/__init__.py: export ResponsesClient, ResponsesClientSettings, ResponsesAPIError
- src/__init__.py: export all public Responses API types

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds an OpenAI-compatible Responses API surface to the Foundry Local Python SDK, implemented as an HTTP client against the embedded web service and wired through the manager/model abstractions.
Changes:
- Introduces ResponsesClient (plus settings and an error type) with non-streaming and SSE-streaming support.
- Adds a full Responses DTO/type system (content parts, items, tools/config, response object helpers, streaming event parsing/serialization).
- Adds examples plus both mocked unit tests and gated live integration tests; wires factory methods through the FoundryLocalManager and IModel implementations.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| sdk/python/src/openai/responses_client.py | New HTTP client for /v1/responses, including SSE parsing helpers. |
| sdk/python/src/openai/responses_types.py | New dataclass-based type system + parsers/serializers for Responses API. |
| sdk/python/src/openai/__init__.py | Exports Responses client types from foundry_local_sdk.openai. |
| sdk/python/src/foundry_local_manager.py | Adds create_responses_client() factory bound to running web service. |
| sdk/python/src/imodel.py | Extends IModel with create_responses_client(base_url) API. |
| sdk/python/src/detail/model.py | Wires Model.create_responses_client() through the selected variant. |
| sdk/python/src/detail/model_variant.py | Implements create_responses_client() for a specific variant. |
| sdk/python/src/__init__.py | Exposes Responses client + public DTOs at package root. |
| sdk/python/examples/responses.py | End-to-end example covering create, streaming, tool calls, vision. |
| sdk/python/test/openai/test_responses_client.py | Mocked unit tests for request building, SSE parsing, error handling, serialization. |
| sdk/python/test/openai/test_responses_integration.py | Gated live integration tests against a real runtime/web service. |
| sdk/python/requirements.txt | Updates pinned native/runtime dependencies and platform markers. |
- Add a configurable timeout to ResponsesClientSettings (default 60 s); non-streaming calls use it directly, while streaming uses it as the connect timeout with an unbounded read timeout (suitable for long responses)
- Fix the SSE buffer: replace the O(n) list-join-per-chunk with a single string buffer split on double newlines; use chunk_size=None for natural server chunk boundaries
- Add InputImageContent.__post_init__ to enforce exactly one of image_url or image_data (raises ValueError if both or neither are set)
- Add an optional max_size=(w, h) to InputImageContent.from_file and from_bytes to resize images before base64 encoding (requires Pillow)
- Raise ValueError for unknown content-part types instead of silently returning a fallback InputTextContent
- Document _MAX_ID_LEN=256 with a rationale; lower it from 1024

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
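The SSE buffer fix above can be sketched as follows. SSE events are delimited by a blank line (a double newline), and a network chunk may end mid-event, so incomplete trailing data must be carried over to the next chunk. The function name and return shape here are illustrative, not the SDK's actual code:

```python
# Sketch of single-string-buffer SSE framing: append each chunk, split on the
# double-newline event delimiter, and carry the incomplete tail forward.
def split_sse_events(buffer: str, chunk: str) -> tuple[list[str], str]:
    """Return (complete events, remainder) after appending a network chunk."""
    buffer += chunk
    # Everything before the last delimiter is complete; the tail may be partial.
    *events, remainder = buffer.split("\n\n")
    return [e for e in events if e.strip()], remainder


# A chunk boundary can fall anywhere; only complete events are emitted.
events, rest = split_sse_events("", 'data: {"a": 1}\n\ndata: {"b"')
# events == ['data: {"a": 1}'], rest == 'data: {"b"'
events2, rest = split_sse_events(rest, ': 2}\n\n')
# events2 == ['data: {"b": 2}'], rest == ''
```

Keeping a single string buffer avoids the per-chunk list-join cost, and `chunk_size=None` lets the HTTP library hand over chunks exactly as the server flushed them rather than re-buffering to a fixed size.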
…urns None, example usage guard

- ResponsesClientSettings.store defaults to None (omitted from the request body, so the server decides), aligning with the JS SDK's store?: boolean
- _MAX_ID_LEN reverted to 1024 to align with the JS SDK constant
- _parse_content_part returns None for unknown types (forward compatibility, not ValueError); _parse_content filters out None entries
- examples/responses.py: guard the event.response.usage chain with getattr to avoid AttributeError if response or usage is absent
- Tests updated: store default tests, the too-long-id threshold (1025), request-body assertions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…onses-api-python-sdk
- Rename the model and variant Responses client factories to get_responses_client to match the existing get_*_client naming.
- Use FoundryLocalException for Responses API transport and parsing errors instead of exporting a dedicated ResponsesAPIError.
- Keep only the foundry-local-core version bump in requirements.txt and restore the existing ORT dependency markers/order.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace the SDK-native Responses client implementation with a focused Python sample and integration tests that use FoundryLocalManager for setup/model/server lifecycle and the official OpenAI Python client for /v1/responses calls. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…onses-api-python-sdk
Pivoted this PR to the web-service sample/test approach:
Local validation:
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Implements the OpenAI Responses API client for the Foundry Local Python SDK. HTTP-only, sync pattern matching the existing chat_client.py style.

New files
- src/openai/responses_types.py — ResponseObject (with output_text property), all streaming event dataclasses, parse_streaming_event factory, _to_dict serializer
- src/openai/responses_client.py — ResponsesClient, ResponsesClientSettings, ResponsesAPIError. Methods: create, create_streaming (SSE generator), get, delete, cancel, get_input_items, list
- examples/responses.py
- test/openai/test_responses_client.py
- test/openai/test_responses_integration.py (gated on FOUNDRY_INTEGRATION_TESTS=1)

Modified files
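An `output_text` convenience property like the one mentioned for ResponseObject can be sketched as below. The dataclass shapes follow the OpenAI Responses API layout (a list of output items, each with content parts), but the field set here is a simplified assumption, not the SDK's actual types:

```python
# Hedged sketch of ResponseObject.output_text: concatenate the text of all
# output_text content parts across message items, in order.
from dataclasses import dataclass, field


@dataclass
class ContentPart:
    type: str
    text: str = ""


@dataclass
class OutputItem:
    type: str
    content: list[ContentPart] = field(default_factory=list)


@dataclass
class ResponseObject:
    output: list[OutputItem] = field(default_factory=list)

    @property
    def output_text(self) -> str:
        """Join all output_text parts from message items into one string."""
        return "".join(
            part.text
            for item in self.output if item.type == "message"
            for part in item.content if part.type == "output_text"
        )


resp = ResponseObject(output=[
    OutputItem(type="message",
               content=[ContentPart(type="output_text", text="Hello, "),
                        ContentPart(type="output_text", text="world")]),
])
print(resp.output_text)
# → Hello, world
```

This mirrors the convenience the official OpenAI SDKs expose, saving callers from walking the nested output structure for the common text-only case.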
- foundry_local_manager.py — create_responses_client(model_id) factory
- imodel.py / detail/model.py / detail/model_variant.py — factory wired through the model hierarchy
- src/__init__.py / src/openai/__init__.py — all new public types exported

Test results

- qwen2.5-0.5b server
Closes #505 (the earlier C# Responses API PR predates this but covers a different SDK)