fix(analyze): preserve sample indices and fix threshold truthiness by ryan-arman · Pull Request #2377 · oumi-ai/oumi

ryan-arman · 2026-04-17T22:17:42Z

Description

Part of the analyze v2 split of #2370 (Phase 3 of 4). Standalone fixes — not dependent on PRs #2375 / #2376.

Two related fixes to the analyze test engine:

Preserve original sample indices. TestEngine._extract_metric_values now returns (original_index, value) pairs so that sample_indices and all_affected_indices on the resulting TestResult point to real conversation positions. Previously, when a sample was missing the metric (filtered out), the reported indices were offsets into the filtered list, which no longer matched the dataset — the user was told to look at conversation 2 when the problem was really in conversation 3.
Fix threshold truthiness bug. threshold=test.max_percentage or test.min_percentage silently fell through to min_percentage when max_percentage was 0.0. The result's reported threshold would say "0% required" instead of the configured "0% allowed." Replaced with an explicit is not None check in both the in-memory TestEngine and the incremental BatchTestEngine.

Includes regression tests for both fixes in tests/unit/analyze/test_testing_engine.py.

Related issues

Linear Issue: OPE-1868
Towards OPE-1868

Before submitting

This PR only changes documentation. (You can ignore the following checks in that case)
Did you read the contributor guideline Pull Request guidelines?
Did you link the issue(s) related to this PR in the section above?
Did you add / update tests where needed?

🤖 Generated with Claude Code

Two related fixes to the analyze test engine: 1. ``_extract_metric_values`` now returns ``(original_index, value)`` pairs so that ``sample_indices`` and ``all_affected_indices`` on a ``TestResult`` point to real conversation positions. Previously, when a sample was missing the metric (filtered out), the reported indices were offsets into the filtered list, which no longer matched the dataset. 2. ``threshold=test.max_percentage or test.min_percentage`` silently fell through to ``min_percentage`` when ``max_percentage`` was ``0.0``. Use an explicit ``is not None`` check instead. Applied to both the in-memory ``TestEngine`` and the incremental ``BatchTestEngine``. Adds regression tests for both behaviors. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

gitar-bot · 2026-04-17T22:17:46Z

Reviewing your code

_Gitar

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(analyze): preserve sample indices and fix threshold truthiness#2377

fix(analyze): preserve sample indices and fix threshold truthiness#2377
ryan-arman wants to merge 1 commit intomainfrom
ryan-arman/analyze-test-engine-fixes

ryan-arman commented Apr 17, 2026

Uh oh!

gitar-bot bot commented Apr 17, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ryan-arman commented Apr 17, 2026

Description

Related issues

Before submitting

Uh oh!

gitar-bot bot commented Apr 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

gitar-bot bot commented Apr 17, 2026 •

edited

Loading