fix: preserve tool_calls and tool_call_id through message processing #1024
Open
limey wants to merge 2 commits into Blaizzy:main from
Conversation
Multi-turn tool calling looped because two independent passes stripped
tool-calling metadata before the tokenizer's Jinja chat template ran:
1. server.py chat_completions_endpoint rebuilt each message as a plain
{role, content} dict, silently dropping tool_calls on assistant
messages and tool_call_id on tool-result messages.
2. prompt_utils.py apply_chat_template routed all dict messages through
_get_role_content → get_message_json, neither of which carries
tool_calls or tool_call_id, so a second stripping pass would occur
even if server.py were fixed independently.
Fix 1: accumulate into a local `msg` dict, then conditionally attach
tool_calls (serialised via model_dump if needed) and tool_call_id before
appending to processed_messages. Adds a previously-missing else branch
for unrecognised content types.
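A minimal sketch of the Fix 1 shape, assuming a standalone helper (the function name and the str() fallback for unrecognised content are illustrative; the real endpoint code differs):

```python
def rebuild_message(message: dict) -> dict:
    """Rebuild an incoming chat message while keeping tool metadata.

    Sketch of Fix 1: accumulate into a local `msg` dict, then
    conditionally attach tool_calls / tool_call_id before appending
    to processed_messages.
    """
    content = message.get("content")
    if isinstance(content, str):
        msg = {"role": message["role"], "content": content}
    elif content is None:
        # Assistant tool-call turns arrive with content: null.
        msg = {"role": message["role"], "content": ""}
    else:
        # Previously-missing else branch; coercion here is an assumption.
        msg = {"role": message["role"], "content": str(content)}

    tool_calls = message.get("tool_calls")
    if tool_calls:
        # Serialise pydantic objects via model_dump if needed.
        msg["tool_calls"] = [
            tc.model_dump() if hasattr(tc, "model_dump") else tc
            for tc in tool_calls
        ]
    if message.get("tool_call_id"):
        msg["tool_call_id"] = message["tool_call_id"]
    return msg
```

The key point is that the tool fields travel on the same dict the template later sees, rather than being rebuilt from a fixed {role, content} shape.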
Fix 2: insert a pass-through branch in the list-processing loop before
the _get_role_content branch — any dict with tool_calls or role=="tool"
is appended as-is, reaching apply_chat_template → tokenizer intact.
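The pass-through branch can be sketched as follows (hypothetical function name; the existing _get_role_content path is reduced to a placeholder here):

```python
def process_messages(messages: list) -> list:
    """Sketch of Fix 2's list-processing loop: any dict carrying tool
    metadata bypasses role/content normalisation and reaches the
    tokenizer's chat template intact."""
    processed = []
    for message in messages:
        if isinstance(message, dict) and (
            "tool_calls" in message or message.get("role") == "tool"
        ):
            # Pass-through: tool_calls / tool_call_id survive untouched.
            processed.append(message)
        else:
            # Placeholder for the existing _get_role_content branch.
            processed.append(
                {"role": message.get("role", "user"),
                 "content": message.get("content", "")}
            )
    return processed
```

Ordering matters: the new branch must come before the _get_role_content branch, since that path rebuilds messages without the tool fields.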
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…enizer
The OpenAI wire format stores function arguments as a JSON string.
Gemma 4's Jinja chat template expects a native object. When passed a
string, the template embeds it verbatim in the model context, so on the
next turn the model mirrors it back using <|"|> escapes around the JSON
fragments rather than around individual string values. The parser then
decodes those fragments literally, producing double-encoded arguments
(e.g. key '{"date"' with value '"2026-04-14"}').
Parse arguments from JSON string back to dict during tool_calls
serialisation in processed_messages. Falls back to the original string
if json.loads fails, so non-JSON argument strings are not silently lost.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
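The string-to-dict round-trip with fallback can be sketched as (helper name is hypothetical):

```python
import json


def normalise_arguments(arguments):
    """Sketch of Fix 3: the OpenAI wire format carries
    function.arguments as a JSON string, but the chat template expects
    a native object. Decode when possible; fall back to the original
    string so non-JSON argument strings are not silently lost."""
    if isinstance(arguments, str):
        try:
            return json.loads(arguments)
        except json.JSONDecodeError:
            return arguments
    return arguments
```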
Multi-turn tool calling loops because two independent passes strip tool metadata before the tokenizer's Jinja chat template runs, and a third bug causes double-encoded arguments on subsequent tool calls.

Bug 1 — `server.py` `chat_completions_endpoint` (~line 1068)

The message loop rebuilds each message as a plain `{role, content}` dict. For an assistant message with `content: null, tool_calls: [...]` this produces `{"role": "assistant", "content": ""}`; `tool_calls` is gone. For a `role: tool` message, `tool_call_id` is similarly dropped.

Bug 2 — `prompt_utils.py` `apply_chat_template` (~line 713)

All dict messages are routed through `_get_role_content` → `get_message_json`, neither of which carries `tool_calls` or `tool_call_id`. Even if Bug 1 were fixed in isolation, this pass would strip the fields again.

Bug 3 — `server.py`: `arguments` passed as JSON string instead of dict

The OpenAI wire format stores `function.arguments` as a JSON string. Gemma 4's Jinja chat template expects a native object. When passed a string, the template embeds it verbatim in the model context, so on the next turn the model mirrors it back using `<|"|>` escapes around the JSON fragments rather than around individual string values. The parser then decodes those fragments literally, producing double-encoded arguments — e.g. `JSON.parse` gives `{ '{"date"': '"2026-04-14"}' }` instead of `{ date: '2026-04-14' }`.

Fixes

- `server.py` (Bug 1): accumulate into a local `msg` dict, then conditionally attach `tool_calls` and `tool_call_id` before appending. Also adds a previously-missing `else` branch for unrecognised content types.
- `prompt_utils.py` (Bug 2): insert a pass-through branch before the `_get_role_content` branch — any dict with `tool_calls` or `role == "tool"` is appended as-is.
- `server.py` (Bug 3): parse `arguments` from JSON string back to dict during `tool_calls` serialisation. Falls back to the original string if `json.loads` fails.

All changes are non-breaking (no-tool conversations are unaffected).

Validation

Tested against `mlx-community/gemma-4-26b-a4b-it-4bit` with a multi-turn agentic workflow requiring three sequential tool calls (`get_time_entries` → `get_holidays` → `get_bookings`). Before: the model looped on the first tool call. After: all three tool calls resolved correctly with results accumulated in history.