2 changes: 1 addition & 1 deletion samples/README.md
@@ -11,4 +11,4 @@ Explore complete working examples that demonstrate how to use Foundry Local —
| [**C#**](cs/) | 13 | .NET SDK samples including native chat, embeddings, audio transcription, tool calling, model management, web server, and tutorials. Uses WinML on Windows for hardware acceleration. |
| [**JavaScript**](js/) | 13 | Node.js SDK samples including native chat, embeddings, audio transcription, Electron desktop app, Copilot SDK integration, LangChain, tool calling, web server, and tutorials. |
| [**Python**](python/) | 10 | Python samples using the OpenAI-compatible API, including chat, embeddings, audio transcription, LangChain integration, tool calling, web server, and tutorials. |
| [**Rust**](rust/) | 9 | Rust SDK samples including native chat, embeddings, audio transcription, tool calling, web server, and tutorials. |
| [**Rust**](rust/) | 10 | Rust SDK samples including native chat, embeddings, audio transcription, tool calling, web server, Responses API, and tutorials. |
1 change: 1 addition & 0 deletions samples/rust/Cargo.toml
@@ -1,6 +1,7 @@
[workspace]
members = [
"foundry-local-webserver",
"web-server-responses",
"tool-calling-foundry-local",
"native-chat-completions",
"audio-transcription-example",
1 change: 1 addition & 0 deletions samples/rust/README.md
@@ -14,6 +14,7 @@ These samples demonstrate how to use the Rust binding for Foundry Local.
| [embeddings](embeddings/) | Generate single and batch text embeddings using the native embedding client. |
| [audio-transcription-example](audio-transcription-example/) | Audio transcription (non-streaming and streaming) using the Whisper model. |
| [foundry-local-webserver](foundry-local-webserver/) | Start a local OpenAI-compatible web server and call it with a standard HTTP client. |
| [web-server-responses](web-server-responses/) | Call a running local OpenAI-compatible web server with the Responses API, including streaming and tool calling. |
| [tool-calling-foundry-local](tool-calling-foundry-local/) | Tool calling with streaming responses, multi-turn conversation, and local tool execution. |
| [tutorial-chat-assistant](tutorial-chat-assistant/) | Build an interactive multi-turn chat assistant (tutorial). |
| [tutorial-document-summarizer](tutorial-document-summarizer/) | Summarize documents with AI (tutorial). |
14 changes: 14 additions & 0 deletions samples/rust/web-server-responses/Cargo.toml
@@ -0,0 +1,14 @@
[package]
name = "web-server-responses"
version = "0.1.0"
edition = "2021"
description = "Responses API sample using the Foundry Local OpenAI-compatible web service"

[dependencies]
foundry-local-sdk = { path = "../../../sdk/rust" }
tokio = { version = "1", features = ["rt-multi-thread", "macros"] }
serde_json = "1"
reqwest = { version = "0.12", features = ["json", "stream"] }

[target.'cfg(windows)'.dependencies]
foundry-local-sdk = { path = "../../../sdk/rust", features = ["winml"] }
67 changes: 67 additions & 0 deletions samples/rust/web-server-responses/README.md
@@ -0,0 +1,67 @@
# Responses API web-service sample

This sample starts the Foundry Local OpenAI-compatible web service with the Rust SDK, then calls the Responses API through raw HTTP requests to `/v1/responses`.

It demonstrates:

- Non-streaming Responses API calls
- Streaming Server-Sent Events (SSE) responses
- Function/tool calling with `previous_response_id`
- Local model load/unload and web-service cleanup

## Prerequisites

- Rust 1.70 or later
- Foundry Local runtime prerequisites for your platform
- Internet access on first run, so that dependencies, execution providers, and the sample model can be downloaded

No OpenAI API key is required. The sample talks to the local Foundry Local web service.

## What gets installed

Cargo restores the Rust crates declared in `Cargo.toml`:

| Dependency | Purpose |
|------------|---------|
| `foundry-local-sdk` | Initializes Foundry Local, downloads/registers execution providers, manages the model, and starts/stops the local web service. |
| `tokio` | Runs the async sample. |
| `reqwest` | Sends JSON requests and reads streaming SSE chunks from `/v1/responses`. |
| `serde_json` | Builds request payloads and reads response JSON. |

On Windows, the sample enables the SDK `winml` feature through the target-specific dependency in `Cargo.toml`.

At runtime, the sample also:

- Downloads and registers Foundry Local execution providers if needed.
- Downloads `qwen2.5-0.5b` if it is not already cached.
- Starts the local OpenAI-compatible web service and uses the dynamic URL returned by the SDK.

Downloaded models, native runtime files, and Cargo build outputs are local machine artifacts and should not be committed.

## Run the sample

From the Rust samples workspace:

```powershell
cd samples\rust
cargo run -p web-server-responses
```

Or from this sample directory:

```powershell
cd samples\rust\web-server-responses
cargo run
```

The sample prints progress for execution-provider/model setup, then runs:

1. A non-streaming Responses request.
2. A streaming Responses request that consumes `response.output_text.delta` events.
3. A function-calling request that asks the model to call `get_weather`, submits a `function_call_output`, and prints the final assistant response.

## Troubleshooting

If setup fails while resolving native Foundry Local symbols, verify that your locally installed Foundry Local runtime packages are compatible with the SDK version in this repository.

If model download is unavailable, pre-cache `qwen2.5-0.5b` with your normal Foundry Local workflow, then run the sample again.