2 changes: 1 addition & 1 deletion samples/README.md
@@ -11,4 +11,4 @@ Explore complete working examples that demonstrate how to use Foundry Local —
| [**C#**](cs/) | 13 | .NET SDK samples including native chat, embeddings, audio transcription, tool calling, model management, web server, and tutorials. Uses WinML on Windows for hardware acceleration. |
| [**JavaScript**](js/) | 13 | Node.js SDK samples including native chat, embeddings, audio transcription, Electron desktop app, Copilot SDK integration, LangChain, tool calling, web server, and tutorials. |
| [**Python**](python/) | 10 | Python samples using the OpenAI-compatible API, including chat, embeddings, audio transcription, LangChain integration, tool calling, web server, and tutorials. |
| [**Rust**](rust/) | 9 | Rust SDK samples including native chat, embeddings, audio transcription, tool calling, web server, and tutorials. |
| [**Rust**](rust/) | 10 | Rust SDK samples including native chat, embeddings, audio transcription, tool calling, web server, Responses API, and tutorials. |
1 change: 1 addition & 0 deletions samples/rust/Cargo.toml
@@ -1,6 +1,7 @@
[workspace]
members = [
"foundry-local-webserver",
"web-server-responses",
"tool-calling-foundry-local",
"native-chat-completions",
"audio-transcription-example",
1 change: 1 addition & 0 deletions samples/rust/README.md
@@ -14,6 +14,7 @@ These samples demonstrate how to use the Rust binding for Foundry Local.
| [embeddings](embeddings/) | Generate single and batch text embeddings using the native embedding client. |
| [audio-transcription-example](audio-transcription-example/) | Audio transcription (non-streaming and streaming) using the Whisper model. |
| [foundry-local-webserver](foundry-local-webserver/) | Start a local OpenAI-compatible web server and call it with a standard HTTP client. |
| [web-server-responses](web-server-responses/) | Call a running local OpenAI-compatible web server with the Responses API, including streaming and tool calling. |
| [tool-calling-foundry-local](tool-calling-foundry-local/) | Tool calling with streaming responses, multi-turn conversation, and local tool execution. |
| [tutorial-chat-assistant](tutorial-chat-assistant/) | Build an interactive multi-turn chat assistant (tutorial). |
| [tutorial-document-summarizer](tutorial-document-summarizer/) | Summarize documents with AI (tutorial). |
14 changes: 14 additions & 0 deletions samples/rust/web-server-responses/Cargo.toml
@@ -0,0 +1,14 @@
[package]
name = "web-server-responses"
version = "0.1.0"
edition = "2021"
description = "Responses API sample using the Foundry Local OpenAI-compatible web service"

[dependencies]
foundry-local-sdk = { path = "../../../sdk/rust" }
tokio = { version = "1", features = ["rt-multi-thread", "macros"] }
serde_json = "1"
reqwest = { version = "0.12", features = ["json", "stream"] }

[target.'cfg(windows)'.dependencies]
foundry-local-sdk = { path = "../../../sdk/rust", features = ["winml"] }
67 changes: 67 additions & 0 deletions samples/rust/web-server-responses/README.md
@@ -0,0 +1,67 @@
# Responses API web-service sample

This sample starts the Foundry Local OpenAI-compatible web service with the Rust SDK, then calls the Responses API through raw HTTP requests to `/v1/responses`.

It demonstrates:

- Non-streaming Responses API calls
- Streaming Server-Sent Events (SSE) responses
- Function/tool calling with `previous_response_id`
- Local model load/unload and web-service cleanup

## Prerequisites

- Rust 1.70 or later
- Foundry Local runtime prerequisites for your platform
- Internet access on first run, so that dependencies, execution providers, and the sample model can be downloaded

No OpenAI API key is required. The sample talks to the local Foundry Local web service.

## What gets installed

Cargo restores the Rust crates declared in `Cargo.toml`:

| Dependency | Purpose |
|------------|---------|
| `foundry-local-sdk` | Initializes Foundry Local, downloads/registers execution providers, manages the model, and starts/stops the local web service. |
| `tokio` | Runs the async sample. |
| `reqwest` | Sends JSON requests and reads streaming SSE chunks from `/v1/responses`. |
| `serde_json` | Builds request payloads and reads response JSON. |

On Windows, the sample enables the SDK `winml` feature through the target-specific dependency in `Cargo.toml`.

At runtime, the sample also:

- Downloads and registers Foundry Local execution providers if needed.
- Downloads `qwen2.5-0.5b` if it is not already cached.
- Starts the local OpenAI-compatible web service and uses the dynamic URL returned by the SDK.

Downloaded models, native runtime files, and Cargo build outputs are local machine artifacts and should not be committed.

## Run the sample

From the Rust samples workspace:

```powershell
cd samples\rust
cargo run -p web-server-responses
```

Or from this sample directory:

```powershell
cd samples\rust\web-server-responses
cargo run
```

The sample prints progress for execution-provider/model setup, then runs:

1. A non-streaming Responses request.
2. A streaming Responses request that consumes `response.output_text.delta` events.
3. A function-calling request that asks the model to call `get_weather`, submits a `function_call_output`, and prints the final assistant response.

## Troubleshooting

If setup fails while resolving native Foundry Local symbols, verify that your locally installed Foundry Local runtime packages are compatible with the SDK version in this repository.

If model download is unavailable, pre-cache `qwen2.5-0.5b` with your normal Foundry Local workflow, then run the sample again.