feat: add ChatCompletionsTransport + wire all default paths #13805

Merged
teknium1 merged 1 commit into main from hermes/hermes-d8d6444e
Apr 22, 2026

@teknium1

Salvages #13447 with regression fixes and Kimi port.

Third transport: handles the default chat_completions api_mode used by ~16 OpenAI-compatible providers. Closes out PR 5, the main PR of the transport refactor series (issue #13473).

Changes vs #13447

  • Preserve tool_call.extra_content (Gemini thought_signature) via ToolCall.provider_data — the original shim stripped it, causing 400 errors on multi-turn Gemini 3 thinking.
  • Preserve reasoning_content distinctly from reasoning (DeepSeek/Moonshot) so the thinking-prefill retry check still triggers.
  • Port Kimi/Moonshot quirks that landed on main after the original PR (32000 max_tokens default, top-level reasoning_effort, extra_body.thinking).
  • Skip the SimpleNamespace shim in the main normalize loop — for chat_completions, response.choices[0].message is already the right shape.
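
To make the first two fixes concrete, here is a minimal sketch of a normalizer that keeps `extra_content` and `reasoning_content` instead of dropping them. Only `ToolCall.provider_data`, `extra_content`, `reasoning`, and `reasoning_content` come from the PR; the dataclass layout, the helper names `normalize_tool_call` and `split_reasoning`, and the sample payload are assumptions for illustration, not the merged implementation.

```python
from dataclasses import dataclass, field
from types import SimpleNamespace
from typing import Any, Optional


@dataclass
class ToolCall:
    """Hypothetical normalized tool call; only provider_data is named in the PR."""
    id: str
    name: str
    arguments: str
    provider_data: dict[str, Any] = field(default_factory=dict)


def normalize_tool_call(raw: Any) -> ToolCall:
    """Copy a provider tool call, keeping extra_content instead of stripping it."""
    provider_data: dict[str, Any] = {}
    extra = getattr(raw, "extra_content", None)
    if extra is not None:
        # Gemini carries its thought_signature here, so it is stored and can
        # be sent back on the next turn.
        provider_data["extra_content"] = extra
    return ToolCall(
        id=raw.id,
        name=raw.function.name,
        arguments=raw.function.arguments,
        provider_data=provider_data,
    )


def split_reasoning(message: Any) -> dict[str, Optional[str]]:
    """Keep reasoning and reasoning_content as two separate fields."""
    return {
        "reasoning": getattr(message, "reasoning", None),
        "reasoning_content": getattr(message, "reasoning_content", None),
    }


# Stand-in for an SDK tool-call object; the payload shape is made up.
raw_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="search", arguments="{}"),
    extra_content={"thought_signature": "sig"},
)
call = normalize_tool_call(raw_call)
```

Keeping the provider-specific payload in one opaque dictionary means the normalized type does not need a new field for every provider quirk.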

Impact

run_agent.py: -239 lines in _build_api_kwargs default branch.

Transport coverage

| api_mode | Transport | build_kwargs | normalize | validate |
|---|---|---|---|---|
| anthropic_messages | AnthropicTransport | | | |
| codex_responses | ResponsesApiTransport | | | |
| chat_completions | ChatCompletionsTransport | | | |
| bedrock_converse | none (PR #13467) | | | |
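
One way to picture the coverage table is a registry that maps each `api_mode` to an object exposing the three operations. The sketch below is an assumption about the general shape only: the `Transport` protocol, the `TRANSPORTS` dictionary, `get_transport`, and the toy method bodies are hypothetical, and the real `ChatCompletionsTransport` does considerably more.

```python
from types import SimpleNamespace
from typing import Any, Protocol


class Transport(Protocol):
    """Assumed interface: the three operations named in the coverage table."""

    def build_kwargs(self, model: str, messages: list[dict[str, Any]]) -> dict[str, Any]: ...
    def normalize(self, response: Any) -> Any: ...
    def validate(self, response: Any) -> bool: ...


class ChatCompletionsTransport:
    """Toy stand-in that shares only its name with the class in the PR."""

    def build_kwargs(self, model, messages):
        return {"model": model, "messages": messages}

    def normalize(self, response):
        # For chat_completions the message already has the expected shape.
        return response.choices[0].message

    def validate(self, response):
        return bool(getattr(response, "choices", None))


# Hypothetical registry keyed by api_mode; other modes would register here.
TRANSPORTS: dict[str, Transport] = {"chat_completions": ChatCompletionsTransport()}


def get_transport(api_mode: str) -> Transport:
    """Look up the transport for an api_mode, failing loudly if none exists."""
    try:
        return TRANSPORTS[api_mode]
    except KeyError:
        raise ValueError(f"no transport registered for api_mode={api_mode!r}") from None


transport = get_transport("chat_completions")
fake_response = SimpleNamespace(choices=[SimpleNamespace(message="hi")])
```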

Validation

| | Result |
|---|---|
| New transport tests | 39 pass (8 build_kwargs, 5 Kimi, 4 validate, 4 normalize, 3 cache, 3 basic) |
| tests/run_agent/ | 885/885 pass (+ 15 skipped; the single test_concurrent_interrupt failure is a pre-existing flake on origin/main) |
| E2E: Gemini extra_content | Live check with real openai.types.chat.ChatCompletionMessageToolCall: provider_data["extra_content"] preserved ✅ |
| E2E: Kimi build_kwargs | max_tokens=32000, reasoning_effort=high, extra_body.thinking={"type":"enabled"} ✅ |
| E2E: Kimi thinking-off | reasoning_effort omitted, thinking={"type":"disabled"} ✅ |
| E2E: reasoning_content | preserved separately in provider_data ✅ |

Closes #13447 (merging this credits @kshitijk4poor's original work).

Third concrete transport — handles the default 'chat_completions' api_mode used
by ~16 OpenAI-compatible providers (OpenRouter, Nous, NVIDIA, Qwen, Ollama,
DeepSeek, xAI, Kimi, custom, etc.). Wires build_kwargs + validate_response to
production paths.

Based on PR #13447 by @kshitijk4poor, with fixes:
- Preserve tool_call.extra_content (Gemini thought_signature) via
  ToolCall.provider_data — the original shim stripped it, causing 400 errors
  on multi-turn Gemini 3 thinking requests.
- Preserve reasoning_content distinctly from reasoning (DeepSeek/Moonshot) so
  the thinking-prefill retry check (_has_structured) still triggers.
- Port Kimi/Moonshot quirks (32000 max_tokens, top-level reasoning_effort,
  extra_body.thinking) that landed on main after the original PR was opened.
- Keep _qwen_prepare_chat_messages_inplace alive and call it through the
  transport when sanitization already deepcopied (avoids a second deepcopy).
- Skip the back-compat SimpleNamespace shim in the main normalize loop — for
  chat_completions, response.choices[0].message is already the right shape
  with .content/.tool_calls/.reasoning/.reasoning_content/.reasoning_details
  and per-tool-call .extra_content from the OpenAI SDK.
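
The Kimi/Moonshot quirks in the list above can be sketched as a small post-processing step on the request kwargs. The values (32000, top-level `reasoning_effort`, `extra_body.thinking` with `enabled`/`disabled`) are the ones quoted in this PR; the function name `apply_kimi_quirks`, its parameters, and the choice to treat 32000 as a default that an explicit `max_tokens` overrides are assumptions.

```python
from typing import Any, Optional


def apply_kimi_quirks(
    kwargs: dict[str, Any],
    thinking: bool,
    effort: Optional[str] = "high",
) -> dict[str, Any]:
    """Return a copy of kwargs with the Kimi-specific fields applied."""
    out = dict(kwargs)
    # Assumed: 32000 is only a default, so an explicit max_tokens is kept.
    out.setdefault("max_tokens", 32000)
    extra_body = dict(out.get("extra_body") or {})
    if thinking:
        if effort:
            # reasoning_effort sits at the top level, not inside extra_body.
            out["reasoning_effort"] = effort
        extra_body["thinking"] = {"type": "enabled"}
    else:
        # With thinking off, reasoning_effort is omitted entirely.
        out.pop("reasoning_effort", None)
        extra_body["thinking"] = {"type": "disabled"}
    out["extra_body"] = extra_body
    return out


base = {"model": "kimi-example", "messages": []}  # model name is a placeholder
on = apply_kimi_quirks(base, thinking=True)
off = apply_kimi_quirks(base, thinking=False)
```

The function copies its input rather than mutating it, so the caller's kwargs stay reusable across retries.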

run_agent.py: -239 lines in _build_api_kwargs default branch extracted to the
transport. build_kwargs now owns: codex-field sanitization, Qwen portal prep,
developer role swap, provider preferences, max_tokens resolution (ephemeral >
user > NVIDIA 16384 > Qwen 65536 > Kimi 32000 > anthropic_max_output), Kimi
reasoning_effort + extra_body.thinking, OpenRouter/Nous/GitHub reasoning,
Nous product attribution tags, Ollama num_ctx, custom-provider think=false,
Qwen vl_high_resolution_images, request_overrides.
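
The max_tokens precedence described above can be read as a first-non-None chain. The sketch below assumes a request belongs to exactly one provider, so the per-provider defaults collapse into a single lookup; `resolve_max_tokens`, `PROVIDER_DEFAULTS`, and the provider keys are hypothetical names, while the numbers are the ones quoted in the commit message.

```python
from typing import Optional

# Provider defaults quoted in the commit message; the keys are assumed labels.
PROVIDER_DEFAULTS = {"nvidia": 16384, "qwen": 65536, "kimi": 32000}


def resolve_max_tokens(
    provider: str,
    ephemeral: Optional[int] = None,
    user: Optional[int] = None,
    anthropic_max_output: Optional[int] = None,
) -> Optional[int]:
    """Pick the first value that is set, in the documented precedence order."""
    candidates = (
        ephemeral,                        # per-call override wins
        user,                             # then the user's configured value
        PROVIDER_DEFAULTS.get(provider),  # then the provider default, if any
        anthropic_max_output,             # finally the model's output limit
    )
    for value in candidates:
        if value is not None:
            return value
    return None  # nothing set: leave max_tokens out of the request
```

For example, `resolve_max_tokens("kimi", user=4096)` returns the user value, and `resolve_max_tokens("kimi")` falls back to the Kimi default.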

39 new transport tests (8 build_kwargs, 5 Kimi, 4 validate, 4 normalize
including extra_content regression, 3 cache stats, 3 basic). The tests/run_agent/
targeted suite passes (885/885 + 15 skipped; the 1 remaining failure is the
test_concurrent_interrupt flake present on origin/main).
@teknium1 teknium1 merged commit 83d86ce into main Apr 22, 2026
11 of 12 checks passed
@teknium1 teknium1 deleted the hermes/hermes-d8d6444e branch April 22, 2026 03:50
@alt-glitch added the type/feature (New feature or request), P2 (Medium, degraded but workaround exists), and comp/agent (Core agent loop, run_agent.py, prompt builder) labels Apr 22, 2026