fix imports

712ebc9
Merged

feat(anthropic): Emit AI Client Spans for synchronous messages.stream() #5565

@sentry/warden / warden completed Mar 13, 2026 in 6m 40s

7 issues

High

ValueError: _collect_ai_data returns 4 values but only 3 are unpacked - `sentry_sdk/integrations/anthropic.py:422-431`

The _collect_ai_data function returns a 4-tuple (model, usage, content_blocks, response_id) but the code at lines 422-431 only unpacks 3 values. This will cause a ValueError: too many values to unpack (expected 3) at runtime when processing any MessageStartEvent, MessageDeltaEvent, or other handled event type. The streaming integration will crash when iterating over response events.

Also found at:

  • sentry_sdk/integrations/anthropic.py:483-492

ValueError at runtime - unpacking 3 values from a 4-tuple - `sentry_sdk/integrations/anthropic.py:422-431`

The new _wrap_synchronous_message_iterator (lines 393-452) and _wrap_asynchronous_message_iterator (lines 455-513) call _collect_ai_data but unpack only 3 values (model, usage, content_blocks). However, _collect_ai_data (line 149) returns a 4-tuple that also includes response_id, so the call raises ValueError: too many values to unpack when the code executes. The existing implementations at lines 516+ correctly unpack all 4 values.

Also found at:

  • sentry_sdk/integrations/anthropic.py:393-452
  • sentry_sdk/integrations/anthropic.py:442-452
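
The failure mode in the two findings above can be reproduced in isolation. The following is a minimal sketch, not the integration's code: `collect_ai_data_stub` is a hypothetical stand-in that takes no arguments and only mirrors the reported return shape (model, usage, content_blocks, response_id).

```python
from typing import Any, Dict, List, Optional, Tuple


def collect_ai_data_stub() -> Tuple[Optional[str], Dict[str, int], List[Any], Optional[str]]:
    # Hypothetical stand-in: returns four values, like the function in the report.
    return "model-name", {"input_tokens": 1}, [], "response-id"


def unpack_three() -> str:
    # Unpacking a 4-tuple into three names raises ValueError.
    try:
        model, usage, content_blocks = collect_ai_data_stub()
    except ValueError as exc:
        return str(exc)
    return "no error"


def unpack_four() -> Tuple[Optional[str], Dict[str, int], List[Any], Optional[str]]:
    # Naming all four values, including response_id, avoids the error.
    model, usage, content_blocks, response_id = collect_ai_data_stub()
    return model, usage, content_blocks, response_id
```

Under these assumptions, `unpack_three()` returns the "too many values to unpack" message, while `unpack_four()` returns the full tuple.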

Low

Test missing assertion for GEN_AI_SYSTEM attribute - `tests/integrations/anthropic/test_anthropic.py:417-420`

The new test_stream_messages test is missing the assertion assert span["data"][SPANDATA.GEN_AI_SYSTEM] == "anthropic" which is present in all similar tests (e.g., test_streaming_create_message at line 306, test_nonstreaming_create_message at line 120). This reduces test coverage for verifying that the gen_ai.system attribute is correctly set on the span.

Also found at:

  • tests/integrations/anthropic/test_anthropic.py:840-843
  • tests/integrations/anthropic/test_anthropic.py:1797-1800

Test missing assertion for GEN_AI_RESPONSE_ID attribute - `tests/integrations/anthropic/test_anthropic.py:438`

The new test_stream_messages test is missing the assertion for SPANDATA.GEN_AI_RESPONSE_ID which is present in other streaming tests like test_streaming_create_message (line 325). This reduces test coverage for verifying that the response ID is correctly captured from the streaming response.
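
The two missing assertions could look roughly like the sketch below. It is self-contained for illustration: the `SPANDATA` class here is a stand-in rather than the SDK's own constants, the `gen_ai.response.id` key string and the `msg_example` ID are assumptions, and `assert_common_span_data` is a hypothetical helper name.

```python
from typing import Any, Dict


class SPANDATA:
    # Stand-in constants for illustration only. "gen_ai.system" is named in the
    # report; the response ID key string is an assumption.
    GEN_AI_SYSTEM = "gen_ai.system"
    GEN_AI_RESPONSE_ID = "gen_ai.response.id"


def assert_common_span_data(span: Dict[str, Any], expected_response_id: str) -> None:
    # Hypothetical helper grouping the assertions the report says are missing.
    data = span["data"]
    assert data[SPANDATA.GEN_AI_SYSTEM] == "anthropic"
    assert data[SPANDATA.GEN_AI_RESPONSE_ID] == expected_response_id


# Example span payload with made-up values.
example_span = {
    "data": {
        "gen_ai.system": "anthropic",
        "gen_ai.response.id": "msg_example",
    }
}
assert_common_span_data(example_span, "msg_example")
```

In the real test, the span would come from the captured transaction event and the expected ID from the mocked streaming response.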

Incorrect comment references wrong test function name - `tests/integrations/anthropic/test_anthropic.py:2919`

The comment on line 2919 says # input_tokens should be total: 19 + 2846 = test_stream_messages_input_tokens_include_cache_read_streaming which appears to be a copy-paste error. The comment should explain the calculation (like the one on line 2985 that correctly says # input_tokens should be total: 19 + 2846 = 2865). This misleading comment could confuse future maintainers.

Span may not be closed if exception occurs after span.__enter__() - `sentry_sdk/integrations/anthropic.py:912-932`

In _wrap_message_stream_manager_enter, span.__enter__() is called at line 912 but there's no try/except/finally block to ensure the span is properly closed if an exception occurs in subsequent operations (lines 914-932). If _set_common_input_data() or the stream._iterator assignment raises an exception, the span will remain open, causing a potential resource leak. Other wrapper functions in this file (e.g., _sentry_patched_create_sync) use a finally block to close the span on error.
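
One way to guard against the leak described above is to close the span when setup fails and leave it open otherwise, so the stream can finish it later. This is a sketch under stated assumptions: `FakeSpan`, `enter_with_cleanup`, and the `setup` callback are hypothetical names, not the integration's code, and the report itself only notes that other wrappers use a finally block.

```python
import sys
from typing import Callable


class FakeSpan:
    """Hypothetical span that only records whether it was entered and closed."""

    def __init__(self) -> None:
        self.entered = False
        self.closed = False

    def __enter__(self) -> "FakeSpan":
        self.entered = True
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> bool:
        self.closed = True
        return False  # never suppress the exception


def enter_with_cleanup(span: FakeSpan, setup: Callable[[FakeSpan], None]) -> FakeSpan:
    # Enter the span manually, as the finding describes.
    span.__enter__()
    try:
        # Stand-in for the work done after entering (setting input data,
        # wrapping the iterator); it may raise.
        setup(span)
    except Exception:
        # Close the span with the active exception info, then re-raise.
        span.__exit__(*sys.exc_info())
        raise
    # On success the span stays open for the caller to close later.
    return span
```

With this shape, a failing `setup` leaves `span.closed` set to `True` and the exception still propagates to the caller.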

Test missing assertion for GEN_AI_SYSTEM span data - `tests/integrations/anthropic/test_anthropic.py:420-423`

The new test_stream_messages test does not assert that SPANDATA.GEN_AI_SYSTEM == "anthropic" is set on the span, unlike the similar test_streaming_create_message test (line 306). The implementation sets this via _set_common_input_data, but the test doesn't verify it, which could mask regressions.

Also found at:

  • tests/integrations/anthropic/test_anthropic.py:439

4 skills analyzed
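| Skill           | Findings | Duration | Cost  |
|-----------------|----------|----------|-------|
| code-review     | 4        | 4m 7s    | $1.99 |
| find-bugs       | 3        | 6m 32s   | $3.21 |
| skill-scanner   | 0        | 4m 18s   | $0.62 |
| security-review | 0        | 2m 43s   | $0.90 |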

Duration: 17m 40s · Tokens: 4.2M in / 51.7k out · Cost: $6.77 (+extraction: $0.01, +merge: $0.01, +fix_gate: $0.01, +dedup: $0.02)