fix: prevent wrong context lengths for OpenRouter models #13459
StefanIsMe wants to merge 2 commits into NousResearch:main from
When the main model has a large context window (e.g., 944K) but the auxiliary compression model has a smaller context (e.g., 128K), the threshold was being set to 100% of the aux model's context, leaving no room for system prompt, compression instructions, and summary output inside the auxiliary model's window.

Changes:
- Cap threshold at 85% of aux model context (safety margin)
- Two severity levels for the auto-correction:
  - Mismatch ratio <= 2x: silent (logger.info only, no user message)
  - Mismatch ratio > 2x: user-facing warning with actionable config fix
- Renamed 'Auto-lowered' to 'Auto-capped' for accuracy

This eliminates the startup warning spam for users whose compression model context is reasonably close to the threshold (within 2x), while still warning when there's a severe mismatch that needs config attention.
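The capping rule in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code in run_agent.py: the function name, the constants, and the warning text are hypothetical, and the definition of "mismatch ratio" as requested threshold divided by aux context is an assumption, since the commit message does not define it.

```python
import logging
from typing import Optional, Tuple

logger = logging.getLogger(__name__)

AUX_CONTEXT_SAFETY_PERCENT = 85  # from the commit message: cap at 85% of aux context
SEVERE_MISMATCH_RATIO = 2.0      # from the commit message: warn the user only above 2x


def cap_compression_threshold(
    threshold_tokens: int, aux_context_tokens: int
) -> Tuple[int, Optional[str]]:
    """Return (effective_threshold, user_warning_or_None). Hypothetical helper."""
    # Integer arithmetic avoids float rounding surprises near the cap.
    cap = aux_context_tokens * AUX_CONTEXT_SAFETY_PERCENT // 100
    if threshold_tokens <= cap:
        return threshold_tokens, None

    # Assumption: mismatch ratio = requested threshold / aux model context.
    ratio = threshold_tokens / aux_context_tokens
    if ratio <= SEVERE_MISMATCH_RATIO:
        # Mild mismatch: log quietly, no user-facing message.
        logger.info("Auto-capped compression threshold %d -> %d", threshold_tokens, cap)
        return cap, None

    # Severe mismatch: surface a warning the user can act on.
    warning = (
        f"Auto-capped compression threshold {threshold_tokens} -> {cap}: the "
        f"threshold is {ratio:.1f}x the auxiliary model's context. Consider a "
        "larger-context compression model or a lower threshold in your config."
    )
    logger.warning(warning)
    return cap, warning
```

With a 128K auxiliary model, a 200K threshold would be capped silently, while a 944K threshold would be capped and produce a warning.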
The previous fix checked BOTH provider name AND model name in the provider-aware step (step 6). This caused a regression for OpenRouter users: broad family keys like 'gemini', 'claude', 'grok' would match against model names (e.g. 'google/gemini-2.0-flash') and return the hardcoded family default instead of OpenRouter's live API metadata.

Now step 6 only checks DEFAULT_CONTEXT_LENGTHS keys against the provider name (not the model name). Model-name matching is deferred to step 8 (the no-provider fallback), which uses the same sorted substring matching but only runs when no provider is set.

This ensures:
- Direct providers (minimax, anthropic, etc.) get their hardcoded defaults
- OpenRouter models get OpenRouter's live metadata (step 7)
- No-provider models get model-name matching (step 8)
- No platform, network, or Python-version dependencies
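The ordering of steps 6, 7, and 8 described in this commit message could look roughly like the sketch below. It is an illustration under stated assumptions rather than the real get_model_context_length(): the table is a trimmed stand-in (only the minimax value appears in the review thread, the others are illustrative), `fetch_live` is a hypothetical callable standing in for the OpenRouter metadata fetch, the fallback value is invented, and sorting keys longest-first is a guess at what "sorted substring matching" means.

```python
from typing import Callable, Dict, Optional

# Trimmed, illustrative stand-in for the real table.
DEFAULT_CONTEXT_LENGTHS: Dict[str, int] = {
    "minimax": 204_800,    # value quoted in the review thread
    "gemini": 1_048_576,   # illustrative
    "claude": 200_000,     # illustrative
}

FALLBACK_CONTEXT_LENGTH = 128_000  # assumption, used when nothing matches


def _match_key(haystack: str) -> Optional[int]:
    """Substring-match table keys, longest key first (sort order is assumed)."""
    for key in sorted(DEFAULT_CONTEXT_LENGTHS, key=len, reverse=True):
        if key in haystack:
            return DEFAULT_CONTEXT_LENGTHS[key]
    return None


def resolve_context_length(
    model: str,
    provider: Optional[str],
    fetch_live: Callable[[str], Optional[int]],
) -> int:
    if provider:
        # Step 6: match table keys against the provider name only.
        hit = _match_key(provider.lower())
        if hit is not None:
            return hit
        # Step 7: fall through to live metadata (e.g. OpenRouter's API).
        live = fetch_live(model)
        return live if live is not None else FALLBACK_CONTEXT_LENGTH
    # Step 8: no provider set, so match against the model name instead.
    hit = _match_key(model.lower())
    return hit if hit is not None else FALLBACK_CONTEXT_LENGTH
```

Because "openrouter" contains none of the table keys, an OpenRouter request skips step 6 and reaches the live lookup, which is the behavior the commit message aims for.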
Thanks for the PR, Stefan. I appreciate the detailed writeup. I traced this against current

On the model_metadata.py change: The PR description describes fixing a step-6 check that supposedly had

The title says "OpenRouter models" but the change doesn't actually affect the OpenRouter path, when

What the diff actually does is insert a new lookup tier between models_dev and the OR fetch that returns a family default when a direct native provider name matches (minimax → 204800, kimi → 262144, etc.). That's a narrower improvement than described, and since direct native providers already hit step 5 (models_dev) first for the cases that matter, the real-world gain is small (primarily

On the run_agent.py change: The 85%-of-aux-context cap for the compression threshold is a reasonable idea, but it's unrelated to context-length lookup and belongs in its own PR with the title matching its content.

On process: Please verify claims against current

No action needed on your end. Closing.

Teknium
Summary
Fixes a bug where OpenRouter models receive incorrect (too low) context lengths, causing premature context compression and LLM performance degradation.
The Error
Before this fix
Step 6 in `get_model_context_length()` (the provider-aware `DEFAULT_CONTEXT_LENGTHS` lookup) checked both the provider name and the model name (`default_model in prov_lower or default_model in model_lower`). This caused two failure modes:
Bug 1 (v2): Unrelated providers match model names. When `effective_provider="minimax"` but `model="hailuo-mini2"`, `"mimo"` (in `DEFAULT_CONTEXT_LENGTHS` under `xiaomi/mimo-v2-pro`) matched `"minimo"` (because `"mimo" in "minimo"`). The lookup returned 4K instead of the 1M OpenRouter metadata.

Bug 2 (previous iteration): Broad family keys match model names via OpenRouter. When `effective_provider="openrouter"` with `model="google/gemini-2.0-flash-lite-001"`, the broad key `"gemini"` (from `"providers/gemini-2.0-flash:experimental"`) matched against the model name substring `"gemini-2.0-flash-lite-001"`. The lookup returned the hardcoded 1M default instead of the OpenRouter live API metadata for `google/gemini-2.0-flash-lite-001`.

This affected all model families with broad keys: claude, grok, gemini, mimo, etc.
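The Bug 2 scenario can be reproduced with two small predicates. This is a sketch only: the helper names are hypothetical, and the variable names `prov_lower` and `model_lower` are borrowed from the condition quoted elsewhere in this description, not copied from the actual source.

```python
def matches_before_fix(key: str, provider: str, model: str) -> bool:
    """Predicate as described for the old step 6: provider OR model name."""
    prov_lower = provider.lower()
    model_lower = model.lower()
    return key in prov_lower or key in model_lower


def matches_after_fix(key: str, provider: str) -> bool:
    """Predicate as described for the new step 6: provider name only."""
    prov_lower = provider.lower()
    return key in prov_lower


provider = "openrouter"
model = "google/gemini-2.0-flash-lite-001"

# The broad family key matches the model name, so the hardcoded default wins.
old_hit = matches_before_fix("gemini", provider, model)

# With provider-only matching the key no longer fires for OpenRouter,
# leaving the lookup free to continue to the live-metadata step.
new_hit = matches_after_fix("gemini", provider)

# A direct provider whose name contains the key still matches.
direct_hit = matches_after_fix("gemini", "gemini")
```

Here `old_hit` is true while `new_hit` is false, and `direct_hit` stays true, which mirrors the split between OpenRouter and direct providers that the fix is aiming for.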
The fix
Step 6 now checks `DEFAULT_CONTEXT_LENGTHS` keys against the provider name only (`default_model in prov_lower`).

The step requires:
- `effective_provider` to be set (gated by the outer `if effective_provider:`)
- a `DEFAULT_CONTEXT_LENGTHS` key to appear in the provider name string

Model-name matching is deferred to step 8 (the no-provider fallback), which uses the same sorted substring matching but only runs when no provider is set.
Before (broken) vs After (fixed)
- `minimax` provider + `hailuo-mini2` model
- `openrouter` + `claude-sonnet-4` model
- `openrouter` + `grok-3` model
- `openrouter` + `gemini-2.0-flash` model
- `anthropic` provider + any model
- `grok` provider + any model
- `gemini` provider + any model

Expected Outcome
Agnosticism efforts
The fix is deliberately minimal and provider-agnostic:
`default_model in prov_lower` instead of `default_model in prov_lower or default_model in model_lower`).

Test plan
- `tests/test_model_metadata.py`: context length lookup (18 tests)
- `tests/test_tool_executor.py`: tool execution (20 tests)