Add agent team tracing support by sufjanfana · Pull Request #22 · Arize-ai/arize-claude-code-plugin

sufjanfana · 2026-04-06T12:38:13Z

Add agent team tracing support

Add full tracing support for Claude Code agent teams (lead, teammates, subagents) with dedup, lifecycle tracking, and deferred turn emission
Rewrite token counting from per-line jq loops to a single parse_transcript pipeline with requestId-based dedup
Correct OpenInference span kind values and attribute names
Add two new hooks: TeammateIdle and TaskCompleted (11 hooks total)

What changed

Team-aware Turn lifecycle (stop.sh, common.sh, user_prompt_submit.sh, session_end.sh)

When a team is active, stop.sh defers the Turn span instead of emitting it. close_active_turn() in common.sh handles emission when the next prompt arrives or the session ends.
last_trace_id / last_trace_span_id are preserved after Turn emission so late-arriving teammate hooks (SubagentStop, TaskCompleted) can find their parent trace.
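
The deferral rule above can be read as a small state machine. The following is a minimal sketch under stated assumptions, not the code in this PR: state lives in shell variables rather than the plugin's session state, and `on_stop`, `on_user_prompt_submit`, `on_session_end`, `emit_turn_span`, `team_active`, and `pending_turn` are hypothetical names. Only `close_active_turn` and `last_trace_id` come from the description.

```shell
# Hypothetical in-memory state; the plugin persists its state per session instead.
team_active=false
pending_turn=""     # trace id of a Turn whose span has been deferred
last_trace_id=""    # kept after emission so late teammate hooks can find a parent
emitted=""          # log of emitted Turn spans, for demonstration only

# Stand-in for building and sending the Turn span.
emit_turn_span() {
  emitted="${emitted}${1} "
  last_trace_id=$1
}

# Emit the deferred Turn, if any. Safe to call when nothing is pending.
close_active_turn() {
  if [ -n "$pending_turn" ]; then
    emit_turn_span "$pending_turn"
    pending_turn=""
  fi
}

# Stop hook: defer while a team is active, otherwise emit immediately.
on_stop() {
  if [ "$team_active" = true ]; then
    pending_turn=$1
  else
    emit_turn_span "$1"
  fi
}

# The next prompt or the end of the session flushes whatever was deferred.
on_user_prompt_submit() { close_active_turn; }
on_session_end() { close_active_turn; }

team_active=true
on_stop trace-1          # deferred: nothing emitted yet
on_user_prompt_submit    # the next prompt flushes trace-1
team_active=false
on_stop trace-2          # no team: emitted right away
printf 'emitted: %s\n' "$emitted"
```

Note that `last_trace_id` is deliberately not cleared after emission, which mirrors the point about late-arriving SubagentStop and TaskCompleted hooks.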

Agent grouping spans (post_tool_use.sh, subagent_stop.sh)

post_tool_use.sh lazy-inits an AGENT grouping span per agent_id and re-parents tool spans under it. It also handles TeamCreate/TeamDelete detection and shutdown_response grouping, and includes a race fix that preemptively marks agents active on Agent/SendMessage calls.
subagent_stop.sh reuses the pre-created grouping span when the agent used tools, or creates one if it didn't. It distinguishes teammates from subagents for span naming and start-time estimation, and tracks transcript offsets to avoid re-parsing for teammates with multiple work periods.
A sentinel (agent_<id>_shutdown_complete) prevents duplicate grouping spans between the two hooks.
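
The lazy-init and sentinel ideas can be sketched as follows. This is an illustration under assumptions: state is kept as files in a temporary directory rather than in the plugin's session state, and `ensure_agent_span` and `close_agent_span` are hypothetical helper names. Only the sentinel key pattern `agent_<id>_shutdown_complete` is taken from the description, and which hook sets it first is not specified there.

```shell
# Hypothetical file-backed state, one file or directory per key.
STATE_DIR=$(mktemp -d)

# Print the grouping span id for an agent, creating it on first use (lazy init).
ensure_agent_span() {
  span_file="$STATE_DIR/agent_${1}_span_id"
  if [ ! -f "$span_file" ]; then
    # 8 random bytes as 16 hex characters, the size of an OpenTelemetry span id.
    od -An -N8 -tx1 /dev/urandom | tr -d ' \n' > "$span_file"
  fi
  cat "$span_file"
}

# Close the grouping span at most once. mkdir is atomic, so only the first
# caller creates the sentinel; any later caller sees it and skips.
close_agent_span() {
  if mkdir "$STATE_DIR/agent_${1}_shutdown_complete" 2>/dev/null; then
    echo "emitted AGENT span $(ensure_agent_span "$1")"
  else
    echo "skipped"
  fi
}

first=$(ensure_agent_span a1)    # e.g. on the agent's first tool call
second=$(ensure_agent_span a1)   # later tool calls reuse the same span id
close_agent_span a1              # the first closer emits the grouping span
close_agent_span a1              # the second closer finds the sentinel and skips
```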

Team dedup (common.sh)

check_team_dedup(): the teammate marks an event first; the lead checks for the mark and skips. This prevents duplicate spans when both the lead and a teammate fire for the same event. Used in post_tool_use.sh and task_completed.sh.
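
A minimal sketch of the mark-then-check protocol, assuming marker files in a temporary directory. The function name matches the PR, but the signature (`<role> <event-key>`), the `emit`/`skip` output, and the event-key format are hypothetical, and the real function may store its marks differently.

```shell
# Hypothetical marker directory standing in for the plugin's session state.
DEDUP_DIR=$(mktemp -d)

# Usage: check_team_dedup <role> <event-key>
# Prints "emit" if the caller should send the span, "skip" otherwise.
check_team_dedup() {
  # Turn the event key into a safe file name.
  marker="$DEDUP_DIR/$(printf '%s' "$2" | tr -c 'A-Za-z0-9_.-' '_')"
  case $1 in
    teammate)
      : > "$marker"     # the teammate marks the event first
      echo emit ;;      # and always owns the span
    *)
      if [ -e "$marker" ]; then
        echo skip       # the lead defers to the teammate's span
      else
        echo emit
      fi ;;
  esac
}

check_team_dedup teammate "TaskCompleted:task-1"   # prints: emit
check_team_dedup lead "TaskCompleted:task-1"       # prints: skip
check_team_dedup lead "TaskCompleted:task-2"       # prints: emit
```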

New hooks

teammate_idle.sh — Caches team_name in session state (no span emitted)
task_completed.sh — Emits CHAIN span with task metadata, team context, and dedup
plugin.json — Registers both hooks, updates description to 11 hooks

Token counting rewrite (common.sh, stop.sh, subagent_stop.sh)

New parse_transcript() replaces per-line while loops with a single jq -rsc pipeline (O(N) jq invocations → O(1))
Deduplicates streaming token counts by requestId — only the final cumulative count per group is used
Includes cache_read_input_tokens and cache_creation_input_tokens in totals
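
The single-pass approach can be sketched like this. It is not the PR's `parse_transcript`: the record shape (`requestId` at the top level, usage under `message.usage`), the output keys, and the choice to fold both cache counters into the input total are assumptions made for illustration. It requires jq.

```shell
# One jq process reads the whole transcript (-s slurps all lines into an array).
parse_transcript() {
  jq -rsc '
    map(select(.requestId != null and .message.usage? != null))
    | group_by(.requestId)            # one group per API request
    | map(.[-1].message.usage)        # keep the last (cumulative) usage per group
    | {
        input_tokens: (map((.input_tokens // 0)
                         + (.cache_read_input_tokens // 0)
                         + (.cache_creation_input_tokens // 0)) | add // 0),
        output_tokens: (map(.output_tokens // 0) | add // 0)
      }
  ' "$1"
}

# Sample transcript: req_1 is streamed twice, and only its final count is used.
transcript=$(mktemp)
cat > "$transcript" <<'EOF'
{"requestId":"req_1","message":{"usage":{"input_tokens":100,"output_tokens":5}}}
{"requestId":"req_1","message":{"usage":{"input_tokens":100,"output_tokens":42}}}
{"type":"user","message":{"content":"hi"}}
{"requestId":"req_2","message":{"usage":{"input_tokens":10,"cache_read_input_tokens":200,"cache_creation_input_tokens":30,"output_tokens":7}}}
EOF
parse_transcript "$transcript"   # prints {"input_tokens":340,"output_tokens":49}
```

Summing every line naively would count req_1 twice (200 input, 47 output for that request alone), which is the overcount that requestId grouping avoids.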

Span kind / attribute corrections (breaking)

Corrected OpenInference span kinds: "tool" → "TOOL" and "chain" → "CHAIN" (uppercased), and "LLM" → "AGENT" (kind changed)
Renamed subagent attributes: subagent.id → agent.id, subagent.type → agent.name
Added agent.role (lead/teammate/subagent) and team.name to all span types
Phoenix REST path now reads span_kind from attributes instead of hardcoding "CHAIN"
Phoenix attribute conversion includes doubleValue

Performance improvements

get_timestamp_ms: tries date +%s%3N → perl -MTime::HiRes → python3 → date +%s000 (avoids python3 on most systems)
generate_trace_id / generate_span_id helpers replace repeated inline pipelines
Batched state reads in post_tool_use.sh (single jq call for 8+ fields)
del_states() replaces sequential del_state calls
get_or_set_state() performs an atomic check-then-set under a single lock
init_state idempotency guard
send_span accepts span name to skip jq parse for logging
get_state / del_state / inc_state use --arg instead of string interpolation
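
As an illustration of the get_timestamp_ms fallback chain listed above, here is a sketch that tries each source in the stated order and validates the result before accepting it. The validation helper `is_epoch_ms` is hypothetical, and the PR's function may detect failures differently.

```shell
# Succeeds only for a plausible epoch-milliseconds value: exactly 13 digits.
is_epoch_ms() {
  case $1 in
    '' | *[!0-9]*) return 1 ;;
  esac
  [ "${#1}" -eq 13 ]
}

get_timestamp_ms() {
  # 1. GNU date understands %3N; other implementations may print it literally,
  #    which the digit check rejects.
  ts=$(date +%s%3N 2>/dev/null)
  if is_epoch_ms "$ts"; then echo "$ts"; return; fi

  # 2. perl with the Time::HiRes module.
  ts=$(perl -MTime::HiRes=time -e 'printf "%d\n", time() * 1000' 2>/dev/null)
  if is_epoch_ms "$ts"; then echo "$ts"; return; fi

  # 3. python3, the slowest option to start.
  ts=$(python3 -c 'import time; print(int(time.time() * 1000))' 2>/dev/null)
  if is_epoch_ms "$ts"; then echo "$ts"; return; fi

  # 4. Last resort: whole seconds padded to millisecond width.
  date +%s000
}

get_timestamp_ms
```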

Other changes

build_span JSON-escapes span names (handles quotes, backslashes, tabs)
Output truncation raised from 5KB to 10KB; prompt capture raised from 1KB to 10KB
session_start.sh hard-resets state file to prevent stale state
New tool cases in post_tool_use: Agent, SendMessage, TaskCreate/TaskUpdate, ToolSearch
notification.sh filters idle_prompt notifications
jq -n → jq -nc across all hooks
jq error URL updated to platform-agnostic link
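
The span-name escaping mentioned above, and the switch to `--arg` in the state helpers, rest on the same jq behavior: a value bound with `--arg` is always treated as a JSON string, never spliced into the filter text. A small sketch, where `span_json` and its output shape are hypothetical stand-ins rather than the PR's build_span (requires jq):

```shell
# Hypothetical stand-in for the name-handling part of a span builder.
span_json() {
  # --arg binds each value as a JSON string, so jq escapes quotes,
  # backslashes, and tabs; -nc builds compact one-line output with no input.
  jq -nc --arg name "$1" --arg kind "$2" \
    '{name: $name, kind: $kind}'
}

# A name containing double quotes, a tab, and a backslash.
name=$(printf 'Bash: echo "hi"\tC:\\tmp')
span_json "$name" TOOL
```

Interpolating `$name` directly into the filter string would instead break on the embedded quotes, or let the name alter the filter.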

Breaking changes

Existing Arize/Phoenix dashboards that filter on openinference.span.kind values ("tool", "chain", "LLM") or subagent.* attributes will need to be updated to the new uppercase/renamed values.
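
For updating saved filters, the old-to-new mapping from this PR can be captured in two lookup helpers. The helper names are hypothetical. Note that the "LLM" → "AGENT" entry applies only to spans emitted by this plugin; filters that target LLM spans from other instrumentation should be left alone.

```shell
# Map a pre-PR span kind to its post-PR value; unknown kinds pass through.
migrate_span_kind() {
  case $1 in
    tool)  echo TOOL ;;
    chain) echo CHAIN ;;
    LLM)   echo AGENT ;;
    *)     echo "$1" ;;
  esac
}

# Map a pre-PR attribute name to its post-PR name; others pass through.
migrate_attribute() {
  case $1 in
    subagent.id)   echo agent.id ;;
    subagent.type) echo agent.name ;;
    *)             echo "$1" ;;
  esac
}

migrate_span_kind tool           # prints: TOOL
migrate_attribute subagent.type  # prints: agent.name
```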

Test plan

Single-agent session: verify Turn, tool, and subagent spans emit correctly (no regression)
Team session: verify lead Turn is deferred until next prompt or session end
Team session: verify teammate tool spans are deduped (no duplicates from lead)
Team session: verify AGENT grouping spans appear for each teammate
Verify TeammateIdle caches team_name without emitting a span
Verify TaskCompleted emits a CHAIN span with correct parent
Phoenix: verify span_kind reflects actual span kind (TOOL, AGENT, CHAIN) not hardcoded CHAIN
Dry run mode: ARIZE_DRY_RUN=true prints correct span structure
Python Agent SDK: lazy init still works (SessionStart doesn't fire)

🤖 Generated with Claude Code

- Team-aware Turn lifecycle: defer emission while team is active
- Agent grouping spans with shutdown_response sentinel
- check_team_dedup protocol for lead/teammate dedup
- New hooks: TeammateIdle, TaskCompleted (11 total)
- parse_transcript replaces per-line jq loops (O(N) → O(1))
- Uppercase OpenInference span kinds, rename subagent.* → agent.*
- Performance: batched state reads, del_states, get_or_set_state

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

sufjanfana requested review from AparnaDhinakaran, duncankmckinnon and gabe0912 as code owners April 6, 2026 12:38

sufjanfana wants to merge 1 commit into main from feat/team-agent-tracing
