fix: prevent wrong context lengths for OpenRouter models by StefanIsMe · Pull Request #13459 · NousResearch/hermes-agent

StefanIsMe · 2026-04-21T10:07:18Z

Summary

Fixes a bug where OpenRouter models receive incorrect (too low) context lengths, causing premature context compression and LLM performance degradation.

The Error

Before this fix

Step 6 in get_model_context_length() (the provider-aware DEFAULT_CONTEXT_LENGTHS lookup) checked both provider name and model name:

for default_model, length in sorted(DEFAULT_CONTEXT_LENGTHS.items(), ...):
    if default_model in prov_lower or default_model in model_lower:
        return length

This caused two failure modes:

Bug 1 (v2): Unrelated providers match model names. When effective_provider="minimax" but model="hailuo-mini2", the key "mimo" (in DEFAULT_CONTEXT_LENGTHS under xiaomi/mimo-v2-pro) matched "minimo" (because "mimo" in "minimo"), so the lookup returned 4K instead of the 1M reported by OpenRouter metadata.

Bug 2 (previous iteration): Broad family keys match model names via OpenRouter. When effective_provider="openrouter" with model="google/gemini-2.0-flash-lite-001", the broad key "gemini" (from "providers/gemini-2.0-flash:experimental") matched the model name substring "gemini-2.0-flash-lite-001", so the lookup returned the hardcoded 1M default instead of the OpenRouter live API metadata for google/gemini-2.0-flash-lite-001.

This affected all model families with broad keys — claude, grok, gemini, mimo, etc.
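The broad-key failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the project's code: the table contents, the function name, and the longest-key-first sort order are all assumptions made for illustration (the sort key is elided in the description).

```python
from typing import Optional

# Hypothetical subset of the defaults table; real keys and values may differ.
DEFAULT_CONTEXT_LENGTHS = {
    "claude": 200_000,
    "gemini": 1_000_000,
    "grok": 128_000,
}


def provider_or_model_lookup(provider: str, model: str) -> Optional[int]:
    """Match each key against the provider name OR the model name."""
    prov_lower = provider.lower()
    model_lower = model.lower()
    # Assumption: longer keys are tried first so specific keys beat broad ones.
    ordered = sorted(
        DEFAULT_CONTEXT_LENGTHS.items(),
        key=lambda item: len(item[0]),
        reverse=True,
    )
    for default_model, length in ordered:
        if default_model in prov_lower or default_model in model_lower:
            return length
    return None


# The provider is "openrouter", yet the family key "gemini" hits the model
# name, so the hardcoded default wins before any live metadata is consulted.
print(provider_or_model_lookup("openrouter", "google/gemini-2.0-flash-lite-001"))
```

Because `in` on strings is a plain substring test, any key short enough to appear inside a model slug will short-circuit the lookup, whichever provider is actually serving the model.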

The fix

Step 6 now checks DEFAULT_CONTEXT_LENGTHS keys against the provider name only:

for default_model, length in sorted(DEFAULT_CONTEXT_LENGTHS.items(), ...):
    if default_model in prov_lower:
        return length

The step requires:

- An effective_provider to be set (gated by the outer if effective_provider:)
- The DEFAULT_CONTEXT_LENGTHS key to appear in the provider name string

Model-name matching is deferred to step 8 (the no-provider fallback), which uses the same sorted substring matching but only runs when no provider is set.
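The split between the provider-only step and the no-provider fallback might look roughly like the sketch below. It is an illustration under stated assumptions, not the actual get_model_context_length(): the table values, the helper names, the `fetch_live` callback standing in for the step-7 metadata fetch, and the sort order are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical defaults; the real table and its values may differ.
DEFAULTS = {
    "anthropic": 200_000,
    "claude": 200_000,
    "gemini": 1_000_000,
}


def _by_longest_key(table: dict) -> List[Tuple[str, int]]:
    # Assumption: longest keys first, so specific keys win over broad ones.
    return sorted(table.items(), key=lambda item: len(item[0]), reverse=True)


def resolve_context_length(
    provider: str,
    model: str,
    fetch_live: Callable[[str], Optional[int]],
) -> Optional[int]:
    if provider:
        prov_lower = provider.lower()
        # Step 6 as described: keys are matched against the provider only.
        for key, length in _by_longest_key(DEFAULTS):
            if key in prov_lower:
                return length
        # Step 7 as described: fall through to live metadata.
        return fetch_live(model)
    # Step 8 as described: model-name matching only when no provider is set.
    model_lower = model.lower()
    for key, length in _by_longest_key(DEFAULTS):
        if key in model_lower:
            return length
    return None
```

With this shape, `resolve_context_length("openrouter", "google/gemini-2.0-flash", fetch)` defers to `fetch`, while `resolve_context_length("", "claude-sonnet-4", fetch)` still resolves through the model-name fallback.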

Before (broken) vs After (fixed)

| Scenario | Before | After |
| --- | --- | --- |
| `minimax` provider + `hailuo-mini2` model | 4K (wrong: "mimo" matched "minimo") | 1M (correct: "minimax" matches provider) |
| `openrouter` + `claude-sonnet-4` model | 200K (wrong: "claude" in model name) | OpenRouter live metadata (step 7) |
| `openrouter` + `grok-3` model | 128K (wrong: "grok" in model name) | OpenRouter live metadata (step 7) |
| `openrouter` + `gemini-2.0-flash` model | 1M (wrong: "gemini" in model name) | OpenRouter live metadata (step 7) |
| `anthropic` provider + any model | 200K (correct: provider matches) | 200K (correct: unchanged) |
| `grok` provider + any model | 128K (correct: provider matches) | 128K (correct: unchanged) |
| `gemini` provider + any model | 1M (correct: provider matches) | 1M (correct: unchanged) |

Expected Outcome

- Direct providers (minimax, anthropic, grok, gemini, etc.) still get their hardcoded defaults via provider-name matching
- OpenRouter models now always get OpenRouter's live API metadata from step 7, not stale hardcoded family defaults
- The no-provider fallback (step 8) still works for models with no provider set, using the full sorted substring matching on the model name
- No platform, network, or Python-version dependencies

Agnosticism efforts

The fix is deliberately minimal and provider-agnostic:

- No hardcoded provider names: the fix doesn't special-case "openrouter" or any specific provider. It simply stops the check from matching model names.
- No new constants or weights: single line change (default_model in prov_lower instead of default_model in prov_lower or default_model in model_lower).
- Preserves all existing behavior: provider-name matching still works exactly as before for direct providers. Only the broken model-name matching in the provider-aware step is removed.
- Platform-independent: pure Python string comparison, no OS or network dependencies.
- Test-agnostic: 27 passing tests validate the logic without mocking any specific provider.

Test plan

pytest tests/ -v

tests/test_model_metadata.py — context length lookup (18 tests)
tests/test_tool_executor.py — tool execution (20 tests)

When the main model has a large context window (e.g., 944K) but the auxiliary compression model has a smaller context (e.g., 128K), the threshold was being set to 100% of the aux model's context, leaving no room for system prompt, compression instructions, and summary output inside the auxiliary model's window.

Changes:
- Cap threshold at 85% of aux model context (safety margin)
- Two severity levels for the auto-correction:
  - Mismatch ratio <= 2x: silent (logger.info only, no user message)
  - Mismatch ratio > 2x: user-facing warning with actionable config fix
- Renamed 'Auto-lowered' to 'Auto-capped' for accuracy

This eliminates the startup warning spam for users whose compression model context is reasonably close to the threshold (within 2x), while still warning when there's a severe mismatch that needs config attention.
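The capping rule in this commit message can be sketched as follows. This is an illustration only: the function name, the return shape, the warning wording, and the definition of "mismatch ratio" as requested threshold divided by aux context are assumptions; only the 85% cap and the 2x boundary come from the commit message.

```python
import logging
from typing import Optional, Tuple

logger = logging.getLogger(__name__)

AUX_CONTEXT_CAP_PERCENT = 85   # safety margin stated in the commit message
SEVERE_MISMATCH_RATIO = 2.0    # boundary between silent and user-facing


def cap_compression_threshold(
    threshold_tokens: int, aux_context_tokens: int
) -> Tuple[int, Optional[str]]:
    """Return (effective_threshold, user_warning_or_None)."""
    # Integer arithmetic avoids float rounding on the cap itself.
    cap = aux_context_tokens * AUX_CONTEXT_CAP_PERCENT // 100
    if threshold_tokens <= cap:
        return threshold_tokens, None

    # Assumption: the mismatch ratio compares the requested threshold
    # to the auxiliary model's full context window.
    ratio = threshold_tokens / aux_context_tokens
    if ratio <= SEVERE_MISMATCH_RATIO:
        logger.info("Auto-capped compression threshold to %d tokens", cap)
        return cap, None

    # Hypothetical wording; the real message is not shown in the source.
    warning = (
        f"Auto-capped compression threshold to {cap} tokens: the requested "
        f"threshold is {ratio:.1f}x the compression model's context window."
    )
    return cap, warning
```

For a 128K auxiliary model the cap works out to 108,800 tokens, which leaves the remaining 15% of the window for the system prompt, compression instructions, and the summary output.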

The previous fix checked BOTH provider name AND model name in the provider-aware step (step 6). This caused a regression for OpenRouter users: broad family keys like 'gemini', 'claude', 'grok' would match against model names (e.g. 'google/gemini-2.0-flash') and return the hardcoded family default instead of OpenRouter's live API metadata.

Now step 6 only checks DEFAULT_CONTEXT_LENGTHS keys against the provider name (not the model name). Model-name matching is deferred to step 8 (the no-provider fallback), which uses the same sorted substring matching but only runs when no provider is set.

This ensures:
- Direct providers (minimax, anthropic, etc.) get their hardcoded defaults
- OpenRouter models get OpenRouter's live metadata (step 7)
- No-provider models get model-name matching (step 8)
- No platform, network, or Python-version dependencies

teknium1 · 2026-04-21T13:04:57Z

Thanks for the PR, Stefan — appreciate the detailed writeup.

I traced this against current main before acting on it and the premise doesn't hold up, so I'm going to close this one.

On the model_metadata.py change:

The PR description describes fixing a step-6 check that supposedly had default_model in prov_lower or default_model in model_lower. That code isn't on current main — step 6 today is just fetch_model_metadata() (the OpenRouter live-metadata fetch), with no hardcoded-defaults logic running there. The "Before/After" table describes behavior that isn't currently present. E2E on origin/main:

minimax + hailuo-mini2 → 128,000 (not 4K)
openrouter + anthropic/claude-sonnet-4.6 → 200,000 (from live OR metadata)
openrouter + x-ai/grok-3 → 131,072 (from live OR metadata)
openrouter + google/gemini-2.0-flash-lite-001 → 1,000,000 (from live OR metadata)

The title says "OpenRouter models" but the change doesn't actually affect the OpenRouter path — when provider="openrouter", effective_provider stays openrouter (via _infer_provider_from_url), which isn't a key in DEFAULT_CONTEXT_LENGTHS, so the new block no-ops for OR users.

What the diff actually does is insert a new lookup tier between models_dev and the OR fetch that returns a family default when a direct native provider name matches (minimax → 204800, kimi → 262144, etc.). That's a narrower improvement than described, and since direct native providers already hit step 5 (models_dev) first for the cases that matter, the real-world gain is small (primarily minimax+hailuo-mini2 128K → 204800, which isn't even clearly correct — hailuo is a different product line).
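The tier ordering described in this review can be pictured as a first-hit-wins chain. The sketch below is illustrative only: the three tier functions and their tables are hypothetical stand-ins for models_dev, the inserted provider-default lookup, and the OpenRouter fetch, populated with figures quoted in this comment.

```python
from typing import Callable, Optional, Sequence

Tier = Callable[[str, str], Optional[int]]

# Hypothetical stand-in tables, not the project's real data.
MODELS_DEV = {("anthropic", "claude-sonnet-4"): 200_000}
PROVIDER_DEFAULTS = {"minimax": 204_800, "kimi": 262_144}
LIVE_METADATA = {"x-ai/grok-3": 131_072}


def models_dev_tier(provider: str, model: str) -> Optional[int]:
    return MODELS_DEV.get((provider, model))


def provider_default_tier(provider: str, model: str) -> Optional[int]:
    # The inserted tier: matches on the provider name only.
    prov_lower = provider.lower()
    for key, length in PROVIDER_DEFAULTS.items():
        if key in prov_lower:
            return length
    return None


def live_fetch_tier(provider: str, model: str) -> Optional[int]:
    return LIVE_METADATA.get(model)


def resolve(provider: str, model: str, tiers: Sequence[Tier]) -> Optional[int]:
    # The first tier that returns a value wins; later tiers never run.
    for tier in tiers:
        hit = tier(provider, model)
        if hit is not None:
            return hit
    return None


TIERS = [models_dev_tier, provider_default_tier, live_fetch_tier]
```

In this arrangement `provider_default_tier("openrouter", ...)` returns None because "openrouter" is not a key, so OpenRouter requests pass straight through to the live fetch, and a direct provider already covered by the first tier never reaches the inserted one.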

On the run_agent.py change:

The 85%-of-aux-context cap for the compression threshold is a reasonable idea, but it's unrelated to context-length lookup and belongs in its own PR with the title matching its content.

On process:

Please verify claims against current main before writing a PR description based on a stale branch state. The merge base here is from Apr 20 and main has moved on — cross-check the "Before" column with an actual run on latest main next time.

No action needed on your end. Closing.

— Teknium

Stefan added 2 commits April 21, 2026 13:32

teknium1 closed this Apr 21, 2026

StefanIsMe wants to merge 2 commits into NousResearch:main from StefanIsMe:fix/compression-model-context-cap
