Merged
Changes from 32 commits
Commits
46 commits
3981cea
Add LangSmith tracing plugin for Temporal workflows
xumaple Mar 17, 2026
df9a55c
Refactor LangSmith interceptor: add ReplaySafeRunTree, reduce boilerp…
xumaple Mar 17, 2026
5d25abc
Fix import sorting and extract _get_current_run_safe helper
xumaple Mar 17, 2026
601d67a
Add Nexus integration test coverage
xumaple Mar 17, 2026
cdb5886
Apply ruff formatting to all langsmith files
xumaple Mar 17, 2026
941637f
Fix pydocstyle, pyright, and mypy lint errors
xumaple Mar 17, 2026
2fa9571
Fix basedpyright errors and add CLAUDE.md with CI lint docs
xumaple Mar 17, 2026
7623d43
Fix all basedpyright warnings (deprecated imports, unused params)
xumaple Mar 17, 2026
a3c0bee
Clean up unused env params: use type:ignore consistently
xumaple Mar 17, 2026
982d220
Address PR review feedback: defaults, naming, and header key
xumaple Mar 18, 2026
ad67096
Add replay safety and worker restart tests for LangSmith plugin
xumaple Mar 19, 2026
197a2d3
Implement background thread I/O for LangSmith workflow tracing
xumaple Mar 25, 2026
96c139b
Replace unnecessary Any type annotations with specific types
xumaple Mar 25, 2026
d1e66c4
Fix basedpyright warnings in test files
xumaple Mar 25, 2026
ee09c1f
Clean up types, dead code, and test assertions
xumaple Mar 25, 2026
4e70ef8
Fix formatting in test_integration.py
xumaple Mar 25, 2026
2b84421
Add @traceable to all activity definitions in integration tests
xumaple Mar 25, 2026
ded720b
tests
xumaple Mar 27, 2026
6ff9cc9
Fix context propagation bugs and remove handler suppression
xumaple Mar 27, 2026
d9fb85a
Skip LangSmith tracing for built-in Temporal queries
xumaple Mar 30, 2026
d271aba
Remove dead error gate from _safe_aio_to_thread
xumaple Mar 30, 2026
4f5d040
Fix pydoctor cross-refs and mock collector trace duplication
xumaple Mar 30, 2026
54d47a9
Address PR review feedback: comments, end() determinism, yield simpli…
xumaple Apr 1, 2026
c37bac8
Create per-worker LangSmith interceptors instead of sharing one acros…
xumaple Apr 3, 2026
a232d16
Remove unnecessary sandbox_unrestricted from post/patch in _ReplaySaf…
xumaple Apr 3, 2026
5bdc3f4
Rename _ContextBridgeRunTree to _RootReplaySafeRunTreeFactory
xumaple Apr 3, 2026
c6db234
Rename overloaded kwargs/ctx_kwargs variables in LangSmith interceptor
xumaple Apr 3, 2026
0fc2ab3
Clean up parent post-processing in LangSmith interceptor
xumaple Apr 3, 2026
2853299
Make StartFoo and RunFoo siblings instead of parent-child in LangSmit…
xumaple Apr 3, 2026
1b38567
Add README for LangSmith plugin
xumaple Apr 3, 2026
7d8c19a
Share one langsmith.Client across all interceptors
xumaple Apr 6, 2026
c2490b0
Add langsmith optional dependency and install instructions
xumaple Apr 6, 2026
5ca84e7
Delete duplicate test_constructor_requires_executor test
xumaple Apr 7, 2026
75115a5
Revert to single shared LangSmithInterceptor
xumaple Apr 7, 2026
cf01699
Pin langsmith dependency to 0.7.x
xumaple Apr 7, 2026
1928927
Improve README and rename plugin params to match interceptor API
xumaple Apr 7, 2026
71e24f6
Improve README and rename plugin params to match interceptor API
xumaple Apr 7, 2026
3ad85bd
Consolidate SimpleNexusWorkflow into TraceableActivityWorkflow
xumaple Apr 7, 2026
c456106
Consolidate test workflows and verify ValidateUpdate elision
xumaple Apr 7, 2026
5ac12f0
Extract find_traces helper for test trace filtering
xumaple Apr 7, 2026
a81e8d6
Add screenshots and polish README examples
xumaple Apr 7, 2026
22d9838
Merge remote-tracking branch 'origin/main' into maplexu/langsmith-plugin
xumaple Apr 8, 2026
d119c94
Use uv add in README install instructions
xumaple Apr 8, 2026
491b847
Merge branch 'main' into maplexu/langsmith-plugin
xumaple Apr 8, 2026
4027019
Simplify _poll_query to rely on pytest timeout
xumaple Apr 8, 2026
fd6e4d6
Merge branch 'maplexu/langsmith-plugin' of github.com:temporalio/sdk-…
xumaple Apr 8, 2026
1 change: 1 addition & 0 deletions .gitignore
@@ -8,6 +8,7 @@ temporalio/bridge/temporal_sdk_bridge*
/tests/helpers/golangworker/golangworker
/.idea
/sdk-python.iml
**/CLAUDE.md
/.zed
*.DS_Store
tags
2 changes: 2 additions & 0 deletions pyproject.toml
@@ -30,6 +30,7 @@ opentelemetry = ["opentelemetry-api>=1.11.1,<2", "opentelemetry-sdk>=1.11.1,<2"]
pydantic = ["pydantic>=2.0.0,<3"]
openai-agents = ["openai-agents>=0.3,<0.7", "mcp>=1.9.4, <2"]
google-adk = ["google-adk>=1.27.0,<2"]
langsmith = ["langsmith>=0.7.0"]
aioboto3 = [
"aioboto3>=10.4.0",
"types-aioboto3[s3]>=10.4.0",
@@ -69,6 +70,7 @@ dev = [
"googleapis-common-protos==1.70.0",
"pytest-rerunfailures>=16.1",
"moto[s3,server]>=5",
"langsmith>=0.7.0",
]

[tool.poe.tasks]
231 changes: 231 additions & 0 deletions temporalio/contrib/langsmith/README.md
@@ -0,0 +1,231 @@
# LangSmith Plugin for Temporal Python SDK

This Temporal [plugin](https://docs.temporal.io/develop/plugins-guide) makes [LangSmith](https://smith.langchain.com/) tracing replay-safe inside Temporal workflows and activities. It propagates trace context across worker boundaries so that `@traceable` calls, LLM invocations, and Temporal operations show up in a single connected trace, and it ensures that replay does not generate duplicate traces.

## Installation

```
pip install temporalio[langsmith]
```

## Quick Start

Register the plugin on your Temporal client. You need it on both the client (starter) side and the workers:

```python
from temporalio.client import Client
from temporalio.contrib.langsmith import LangSmithPlugin

client = await Client.connect(
    "localhost:7233",
    plugins=[LangSmithPlugin(project_name="my-project")],
)
```

Once that's set up, any `@traceable` function inside your workflows and activities will show up in LangSmith with correct parent-child relationships, even across worker boundaries.

## Example: AI Chatbot

A conversational chatbot using OpenAI, orchestrated by a Temporal workflow. The workflow stays alive waiting for user messages via signals, and dispatches each message to an activity that calls the LLM.

### Activity (wraps the LLM call)

```python
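import langsmith
from langsmith.wrappers import wrap_openai
from openai import AsyncOpenAI
from openai.types.responses import Response
from temporalio import activity

# OpenAIRequest is an application-defined dataclass (not shown here).

@langsmith.traceable(name="Call OpenAI", run_type="chain")
@activity.defn
async def call_openai(request: OpenAIRequest) -> Response:
    client = wrap_openai(AsyncOpenAI())  # wrap_openai returns a LangSmith-traced client
    return await client.responses.create(
        model=request.model,
        input=request.input,
        instructions=request.instructions,
    )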
```

### Workflow (orchestrates the conversation)

```python
@workflow.defn
class ChatbotWorkflow:
    def __init__(self) -> None:
        # Assumed initial state; the signal handlers that update it are omitted from this excerpt
        self._done = False
        self._pending_message: str | None = None
        self._last_response: str | None = None

    @workflow.run
    async def run(self) -> str:
        # @traceable works inside workflows and is fully replay-safe
        now = workflow.now().strftime("%b %d %H:%M")
        return await langsmith.traceable(
            name=f"Session {now}", run_type="chain",
        )(self._session)()

    async def _session(self) -> str:
        while not self._done:
            await workflow.wait_condition(
                lambda: self._pending_message is not None or self._done
            )
            if self._done:
                break

            message = self._pending_message
            self._pending_message = None

            @langsmith.traceable(name=f"Request: {message[:60]}", run_type="chain")
            async def _query(msg: str) -> str:
                response = await workflow.execute_activity(
                    call_openai,
                    OpenAIRequest(model="gpt-4o-mini", input=msg),
                    start_to_close_timeout=timedelta(seconds=60),
                )
                return response.output_text

            self._last_response = await _query(message)

        return "Session ended."
```

### Worker

```python
client = await Client.connect(
    "localhost:7233",
    plugins=[LangSmithPlugin(project_name="chatbot")],
)

worker = Worker(
    client,
    task_queue="chatbot",
    workflows=[ChatbotWorkflow],
    activities=[call_openai],
)
await worker.run()
```

### What you see in LangSmith

With the default configuration (`add_temporal_runs=False`), the trace only contains your application logic:

```
Session Apr 03 14:30
  Request: "What's the weather in NYC?"
    Call OpenAI
      openai.responses.create   (auto-traced by wrap_openai)
```

<!-- Screenshot: LangSmith trace tree with add_temporal_runs=False showing clean application-only hierarchy -->

## `add_temporal_runs` — Temporal Operation Visibility

By default, `add_temporal_runs` is `False` and only your `@traceable` application logic appears in traces. Setting it to `True` also adds Temporal operations (StartWorkflow, RunWorkflow, StartActivity, RunActivity, etc.):

```python
plugins=[LangSmithPlugin(project_name="my-project", add_temporal_runs=True)]
```

This adds Temporal operation nodes to the trace tree so that the orchestration layer is visible alongside your application logic. If the caller wraps `start_workflow` in a `@traceable` function, the full trace looks like:

```
Ask Chatbot                                # @traceable wrapper around client.start_workflow
  StartWorkflow:ChatbotWorkflow
  RunWorkflow:ChatbotWorkflow
    Session Apr 03 14:30
      Request: "What's the weather in NYC?"
        StartActivity:call_openai
        RunActivity:call_openai
          Call OpenAI
            openai.responses.create
```

Note: `StartFoo` and `RunFoo` appear as siblings. The start is the short-lived outbound RPC that completes immediately, and the run is the actual execution which may take much longer.

<!-- Screenshot: LangSmith trace tree with add_temporal_runs=True showing Temporal operation nodes -->

<!-- Screenshot: Temporal UI showing the corresponding workflow execution -->

## Migrating Existing LangSmith Code to Temporal

If you already have code with LangSmith tracing, you should be able to move it into a Temporal workflow and keep the same trace hierarchy. The plugin handles sandbox restrictions and context propagation behind the scenes, so anything that was traceable before should remain traceable after the move. More details below:

### Where `@traceable` works

The plugin allows `@traceable` to work inside Temporal's deterministic workflow sandbox, where it normally can't run:

| Location | Works? | Notes |
|----------|--------|-------|
| On `@activity.defn` functions | Yes | Stack `@traceable` on top of `@activity.defn` |
Contributor


So from what I read below, this fires on each retry? This could be worth noting briefly because I hadn't thought of that distinction. And a user who doesn't want that could use the wrapping approach you mention elsewhere.

Also the code I read uses @langsmith.traceable -- should we say that instead? Your table here uses the qualifier inconsistently, which made me briefly wonder if there was some important distinction.

Contributor Author


Fixed up the table and linked the section, hopefully makes more sense now

| On `@workflow.defn` class | No | Use `@traceable` inside `@workflow.run` instead |
Contributor


I would expect it to work on @workflow.run, not on @workflow.defn

  • Why doesn't it work there?
  • What happens if the user tries to put it there?

Contributor Author

@xumaple xumaple Apr 7, 2026


  • Why doesn't it work there?

    It doesn't work there because it's not within the hot path where interceptors can properly propagate the context and inject replay safety guardrails.

  • What happens if the user tries to put it there?

    Traces would be emitted with missing parent context tracing (and hence probably show up as root traces), and would be duplicated on replay.

| Inside workflow methods (sync or async) | Yes | Use `langsmith.traceable(name="...")(fn)()` |
| Inside activity methods (sync or async) | Yes | Regular `@traceable` decorator |
| Around `client.start_workflow` / `execute_workflow` | Yes | Wrap the caller to trace the entire workflow as one unit |
Contributor


What about logging signals and updates?

Contributor Author


Around the signal/query definitions wouldn't work, no.

| Around `execute_activity` calls | Yes | Wrap the dispatch to group related operations |

## Replay Safety

Temporal workflows are deterministic and get replayed from event history on recovery. The plugin accounts for this by injecting replay-safe data into your traceable runs:

- **No duplicate traces on replay.** Run IDs are derived deterministically from the workflow's random seed, so replayed operations produce the same IDs and LangSmith deduplicates them.
- **No non-deterministic calls.** The plugin injects metadata using `workflow.now()` for timestamps and `workflow.random()` for UUIDs instead of `datetime.now()` and `uuid4()`.
- **Background I/O stays outside the sandbox.** LangSmith HTTP calls to the server are submitted to a background thread pool that doesn't interfere with the deterministic workflow execution.

You don't need to do anything special for this. Your `@traceable` functions behave the same whether it's a fresh execution or a replay.
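
The first bullet above can be illustrated with a minimal, standard-library-only sketch. It is not the plugin's code: `deterministic_run_ids` is a hypothetical helper, and `random.Random(seed)` stands in for the seeded random source that a workflow exposes. The point is only that IDs drawn from a seeded generator come out identical on every replay, unlike `uuid.uuid4()`.

```python
import random
import uuid


def deterministic_run_ids(seed: int, count: int) -> list[uuid.UUID]:
    """Derive `count` UUIDs from a seeded PRNG instead of calling uuid.uuid4()."""
    rng = random.Random(seed)  # assumption: stand-in for the workflow's seeded random
    return [uuid.UUID(int=rng.getrandbits(128), version=4) for _ in range(count)]


first_execution = deterministic_run_ids(seed=1234, count=3)
replay = deterministic_run_ids(seed=1234, count=3)  # same seed, same order of draws
other_workflow = deterministic_run_ids(seed=9999, count=3)

print(first_execution == replay)          # identical IDs on replay
print(first_execution == other_workflow)  # a different seed yields different IDs
```

Because the IDs depend only on the seed and the order of draws, any code path that draws IDs in a different order on replay would break the guarantee, which is one reason workflow code has to stay deterministic.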

### Example: Worker crash mid-workflow

```
1. Workflow starts, executes Activity A -> trace appears in LangSmith
2. Worker crashes
3. New worker picks up the workflow
4. Workflow replays Activity A (skips execution) -> NO duplicate trace
5. Workflow executes Activity B (new work) -> new trace appears
```
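
The sequence above can be modelled with a toy sketch, under the assumption that the tracing backend treats a run ID as a key, so writing the same ID twice does not create a second entry. `trace_store` and `run_workflow` are hypothetical names for illustration and do not correspond to the plugin's internals.

```python
import random
import uuid

# Hypothetical stand-in for the tracing backend, keyed by run ID.
trace_store: dict[uuid.UUID, str] = {}


def run_workflow(seed: int, steps: list[str]) -> None:
    """Run (or replay) the steps, recording one trace per step under a seeded ID."""
    rng = random.Random(seed)
    for name in steps:
        run_id = uuid.UUID(int=rng.getrandbits(128), version=4)
        # Re-recording an existing ID overwrites the entry instead of duplicating it.
        trace_store[run_id] = name


# Steps 1-2: Activity A runs, then the worker crashes.
run_workflow(seed=42, steps=["Activity A"])
# Steps 3-5: a new worker replays Activity A, then executes Activity B.
run_workflow(seed=42, steps=["Activity A", "Activity B"])

print(sorted(trace_store.values()))  # ['Activity A', 'Activity B']
```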

<!-- Screenshot: LangSmith showing a workflow trace that survived a worker restart with no duplicate runs -->

### Example: Wrapping retriable steps in a trace

Since Temporal retries failed activities, you can use `@traceable` to group the attempts together:

```python
@langsmith.traceable(name="my_step", run_type="chain")
async def my_step(message: str) -> str:
    return await workflow.execute_activity(
        call_openai,
        ...
    )
```

This groups everything under one run:
```
my_step
  Call OpenAI                 # first attempt
    openai.responses.create
  Call OpenAI                 # retry
    openai.responses.create
```
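
As a rough mental model of that tree, the sketch below builds the same shape with plain Python objects. `Run`, `flaky_call`, and the local retry loop are hypothetical: in a real workflow the retries are driven by Temporal's retry policy, not by a loop in your code, and the runs are created by `@traceable`.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    """Hypothetical, simplified stand-in for a trace node."""

    name: str
    children: list["Run"] = field(default_factory=list)


def flaky_call(attempt: int) -> str:
    """Fail on the first attempt and succeed afterwards."""
    if attempt == 1:
        raise RuntimeError("transient failure")
    return "ok"


def my_step(max_attempts: int = 3) -> tuple[Run, str]:
    parent = Run("my_step")
    for attempt in range(1, max_attempts + 1):
        # Every attempt becomes a child of the same parent run.
        parent.children.append(Run(f"Call OpenAI (attempt {attempt})"))
        try:
            return parent, flaky_call(attempt)
        except RuntimeError:
            continue
    raise RuntimeError("all attempts failed")


parent, result = my_step()
for child in parent.children:
    print(f"{parent.name} -> {child.name}")
```

Without the enclosing parent, each attempt would have nothing to attach to except whatever run happened to be active at the call site, which is why wrapping the dispatch keeps the attempts together.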

## Context Propagation

The plugin propagates trace context across process boundaries (client -> workflow -> activity -> child workflow -> Nexus) via Temporal headers. You don't need to pass any context manually.

```
Client Process              Worker Process (Workflow)      Worker Process (Activity)
─────────────               ──────────────────────────     ─────────────────────────
@traceable("my workflow")
  start_workflow ──headers──> RunWorkflow
                                @traceable("session")
                                  execute_activity ──headers──> RunActivity
                                                                  @traceable("Call OpenAI")
                                                                    openai.create(...)
```
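
Header-based propagation in general can be sketched as follows. This is an illustration of the pattern, not the plugin's wire format: the header key `x-trace-context`, the JSON payload, and the `inject` / `extract` helpers are all assumptions made for the example.

```python
import json
import uuid
from typing import Optional

HEADER_KEY = "x-trace-context"  # hypothetical header name


def inject(headers: dict[str, bytes], trace_id: str, parent_run_id: str) -> dict[str, bytes]:
    """Sender side: return a copy of the headers with the trace context added."""
    payload = json.dumps({"trace_id": trace_id, "parent_run_id": parent_run_id})
    return {**headers, HEADER_KEY: payload.encode("utf-8")}


def extract(headers: dict[str, bytes]) -> Optional[dict[str, str]]:
    """Receiver side: read the trace context back, or None if it is absent."""
    raw = headers.get(HEADER_KEY)
    return json.loads(raw) if raw is not None else None


trace_id = str(uuid.uuid4())
client_run_id = str(uuid.uuid4())

outbound = inject({}, trace_id, client_run_id)  # attached when the call is sent
context = extract(outbound)                     # read where the call is handled

print(context == {"trace_id": trace_id, "parent_run_id": client_run_id})
print(extract({}))  # no header: the receiver has no parent to attach to
```

The receiving side uses the extracted parent ID to attach its own runs to the caller's trace. When the header is missing, there is no parent to attach to, which matches the earlier observation in the review thread that runs without propagated context tend to appear as root traces.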

## API Reference

### `LangSmithPlugin`

```python
LangSmithPlugin(
    client=None,              # langsmith.Client instance (auto-created if None)
    project_name=None,        # LangSmith project name
    add_temporal_runs=False,  # Show Temporal operation nodes in traces
    metadata=None,            # Default metadata for all runs
    tags=None,                # Default tags for all runs
)
```

We recommend registering the plugin on both the client and all workers. Strictly speaking, you only need it on the sides that produce traces, but adding it everywhere avoids surprises with context propagation. The client and worker don't need to share the same configuration — for example, they can use different `add_temporal_runs` settings.
14 changes: 14 additions & 0 deletions temporalio/contrib/langsmith/__init__.py
@@ -0,0 +1,14 @@
"""LangSmith integration for Temporal SDK.

This package provides LangSmith tracing integration for Temporal workflows,
activities, and other operations. It includes automatic run creation and
context propagation for distributed tracing in LangSmith.
"""

from temporalio.contrib.langsmith._interceptor import LangSmithInterceptor
from temporalio.contrib.langsmith._plugin import LangSmithPlugin

__all__ = [
    "LangSmithInterceptor",
    "LangSmithPlugin",
]