fix: return finish_reason=tool_calls when tool calls detected #990
Open
eloe wants to merge 2 commits into Blaizzy:main from
Conversation
OpenAI-compatible clients check finish_reason to decide whether to enter the tool execution loop. Previously mlx-vlm always returned "stop" even when process_tool_calls found calls in the model output. Now both streaming and non-streaming /chat/completions responses return finish_reason="tool_calls" when tool calls are present, and "stop" otherwise. Adds 3 tests covering: stop without tools, tool_calls with tools, and stop with tools but no calls made.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove unused asyncio import and add a streaming test for finish_reason=tool_calls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
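As context for the commits above, here is a minimal sketch of the branch a strict OpenAI-compatible client takes on each /chat/completions response. The function and variable names are illustrative only (this is not mlx-vlm or any real client library's code); the response shape follows the OpenAI Chat Completions spec. Such a client enters its tool-execution loop only when finish_reason == "tool_calls", which is why a server that always returns "stop" silently disables tool execution:

```python
# Hypothetical sketch of a strict OpenAI-compatible client's branch on
# finish_reason. The dict layout mirrors an OpenAI Chat Completions
# response; run_turn is an illustrative name, not a real client API.

def run_turn(response: dict) -> str:
    """Handle one /chat/completions response."""
    choice = response["choices"][0]
    if choice["finish_reason"] == "tool_calls":
        # Enter the tool-execution loop: a real client would run each
        # requested tool and send the results back to the model in a
        # follow-up request.
        names = [call["function"]["name"]
                 for call in choice["message"]["tool_calls"]]
        return "execute tools: " + ", ".join(names)
    # finish_reason == "stop": the turn is final text, nothing to run.
    return choice["message"]["content"]
```

With the old behavior (finish_reason hardcoded to "stop"), the first branch is never taken even when message.tool_calls is populated, so frameworks branching this way never execute the requested tools.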
Summary
Per the OpenAI Chat Completions spec, finish_reason should be "tool_calls" when the model emits tool calls, not "stop". Strict OpenAI-compatible clients (including agent frameworks like LangChain, CrewAI, and others) use finish_reason == "tool_calls" as the branch condition for entering the tool-execution loop.

Currently both streaming and non-streaming paths hardcode "stop". This PR fixes both:

- Non-streaming (ChatChoice): finish_reason = "tool_calls" if tool_calls.get("calls") else "stop"
- Streaming (ChatStreamChoice): the final chunk uses the "tool_calls" finish_reason when tool calls are detected

Tests

4 tests covering all combinations:

- test_chat_completions_finish_reason_stop_no_tools: plain text, no tools defined
- test_chat_completions_finish_reason_tool_calls: non-streaming with tool calls
- test_chat_completions_finish_reason_stop_tools_no_calls: tools defined but the model doesn't call any
- test_chat_completions_streaming_finish_reason_tool_calls: streaming with tool calls

Related
This overlaps with #964 by @michaelstingl, which addresses the non-streaming path. This PR additionally covers the streaming path and includes 4 tests (streaming and non-streaming, with and without tool calls).
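For reference, the selection this PR applies on both paths reduces to the one-line conditional quoted in the summary, keyed on the calls list that process_tool_calls returns. A standalone sketch (the wrapper function name is hypothetical; only the conditional itself comes from the PR):

```python
def choose_finish_reason(tool_calls: dict) -> str:
    """Pick the OpenAI-spec finish_reason: "tool_calls" when
    process_tool_calls found any calls, "stop" otherwise."""
    # Empty list or missing key both mean no tool calls were made.
    return "tool_calls" if tool_calls.get("calls") else "stop"
```

The same selection is applied to the final chunk on the streaming path, so both response modes agree.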