Skip to content

Add agent team tracing support#22

Open
sufjanfana wants to merge 1 commit intomainfrom
feat/team-agent-tracing
Open

Add agent team tracing support#22
sufjanfana wants to merge 1 commit intomainfrom
feat/team-agent-tracing

Conversation

@sufjanfana
Copy link
Copy Markdown

Add agent team tracing support

  • Add full tracing support for Claude Code agent teams (lead, teammates, subagents) with dedup, lifecycle tracking, and deferred turn emission
  • Rewrite token counting from per-line jq loops to a single parse_transcript pipeline with requestId-based dedup
  • Correct OpenInference span kind values and attribute names
  • Add two new hooks: TeammateIdle and TaskCompleted (11 hooks total)

What changed

Team-aware Turn lifecycle (stop.sh, common.sh, user_prompt_submit.sh, session_end.sh)

  • When a team is active, stop.sh defers the Turn span instead of emitting it. close_active_turn() in common.sh handles emission when the next prompt arrives or the session ends.
  • last_trace_id / last_trace_span_id are preserved after Turn emission so late-arriving teammate hooks (SubagentStop, TaskCompleted) can find their parent trace.

Agent grouping spans (post_tool_use.sh, subagent_stop.sh)

  • post_tool_use.sh lazy-inits an AGENT grouping span per agent_id and re-parents tool spans under it. Handles TeamCreate/TeamDelete detection, shutdown_response grouping, and a race fix that preemptively marks agents active on Agent/SendMessage calls.
  • subagent_stop.sh reuses the pre-created grouping span when the agent used tools, or creates one if it didn't. Teammate vs subagent distinction for span naming and start-time estimation. Transcript offset tracking avoids re-parsing for teammates with multiple work periods.
  • A sentinel (agent_<id>_shutdown_complete) prevents duplicate grouping spans between the two hooks.

Team dedup (common.sh)

  • check_team_dedup(): teammate marks first, lead checks and skips. Prevents duplicate spans when both fire for the same event. Used in post_tool_use and task_completed.

New hooks

  • teammate_idle.sh — Caches team_name in session state (no span emitted)
  • task_completed.sh — Emits CHAIN span with task metadata, team context, and dedup
  • plugin.json — Registers both hooks, updates description to 11 hooks

Token counting rewrite (common.sh, stop.sh, subagent_stop.sh)

  • New parse_transcript() replaces per-line while loops with a single jq -rsc pipeline (O(N) jq invocations → O(1))
  • Deduplicates streaming token counts by requestId — only the final cumulative count per group is used
  • Includes cache_read_input_tokens and cache_creation_input_tokens in totals

Span kind / attribute corrections (breaking)

  • Uppercased OpenInference span kinds: "tool""TOOL", "chain""CHAIN", "LLM""AGENT"
  • Renamed subagent attributes: subagent.idagent.id, subagent.typeagent.name
  • Added agent.role (lead/teammate/subagent) and team.name to all span types
  • Phoenix REST path now reads span_kind from attributes instead of hardcoding "CHAIN"
  • Phoenix attribute conversion includes doubleValue

Performance improvements

  • get_timestamp_ms: tries date +%s%3Nperl -MTime::HiRespython3date +%s000 (avoids python3 on most systems)
  • generate_trace_id / generate_span_id helpers replace repeated inline pipelines
  • Batched state reads in post_tool_use.sh (single jq call for 8+ fields)
  • del_states() replaces sequential del_state calls
  • get_or_set_state() atomic check-then-set in one lock
  • init_state idempotency guard
  • send_span accepts span name to skip jq parse for logging
  • get_state / del_state / inc_state use --arg instead of string interpolation

Other changes

  • build_span JSON-escapes span names (handles quotes, backslashes, tabs)
  • Output truncation raised from 5KB to 10KB; prompt capture raised from 1KB to 10KB
  • session_start.sh hard-resets state file to prevent stale state
  • New tool cases in post_tool_use: Agent, SendMessage, TaskCreate/TaskUpdate, ToolSearch
  • notification.sh filters idle_prompt notifications
  • jq -njq -nc across all hooks
  • jq error URL updated to platform-agnostic link

Breaking changes

Existing Arize/Phoenix dashboards that filter on openinference.span.kind values ("tool", "chain", "LLM") or subagent.* attributes will need to be updated to the new uppercase/renamed values.

Test plan

  • Single-agent session: verify Turn, tool, and subagent spans emit correctly (no regression)
  • Team session: verify lead Turn is deferred until next prompt or session end
  • Team session: verify teammate tool spans are deduped (no duplicates from lead)
  • Team session: verify AGENT grouping spans appear for each teammate
  • Verify TeammateIdle caches team_name without emitting a span
  • Verify TaskCompleted emits a CHAIN span with correct parent
  • Phoenix: verify span_kind reflects actual span kind (TOOL, AGENT, CHAIN) not hardcoded CHAIN
  • Dry run mode: ARIZE_DRY_RUN=true prints correct span structure
  • Python Agent SDK: lazy init still works (SessionStart doesn't fire)

🤖 Generated with Claude Code

- Team-aware Turn lifecycle: defer emission while team is active
- Agent grouping spans with shutdown_response sentinel
- check_team_dedup protocol for lead/teammate dedup
- New hooks: TeammateIdle, TaskCompleted (11 total)
- parse_transcript replaces per-line jq loops (O(N) → O(1))
- Uppercase OpenInference span kinds, rename subagent.* → agent.*
- Performance: batched state reads, del_states, get_or_set_state

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant