Skip to content

Fix false positive test discovery from substring matching#1974

Merged
Saga4 merged 4 commits intomainfrom
fix/false-positive-test-discovery
Apr 3, 2026
Merged

Fix false positive test discovery from substring matching#1974
Saga4 merged 4 commits intomainfrom
fix/false-positive-test-discovery

Conversation

@mohammedahmed18
Copy link
Copy Markdown
Contributor

@mohammedahmed18 mohammedahmed18 commented Apr 2, 2026

Bug 4: False Positive Test Discovery

Trace ID: 0b575a96-62a8-4910-b163-1ad10e60ba79

Issue

Test discovery incorrectly matched test files with source functions when the function name appeared anywhere in the test file, including in mocks, comments, or unrelated code. This caused Failed to instrument test file errors because Codeflash tried to instrument tests that don't actually test the target function.

Root Cause

In /opt/codeflash/codeflash/languages/javascript/support.py line 259, the condition:

if func.function_name in imported_names or func.function_name in source:

Used naive substring matching (func.function_name in source) which matched function names even when they were only mentioned in mocks:

vi.mock("./restart-request.js", () => ({
  parseRestartRequestParams: (params) => ({ sessionKey: undefined }),
}));

Example

  • Function: parseRestartRequestParams from restart-request.ts
  • Wrongly matched with: update.test.ts
  • Reason: Test file mocked it with vi.mock("./restart-request.js", () => ({ parseRestartRequestParams: ... }))
  • Result: Codeflash tried to instrument and failed: "Function 'parseRestartRequestParams' not found in test file"

Fix

Removed the substring matching condition. Now only matches functions that are explicitly imported from the source module. This is more reliable and avoids false positives.

Changes

  • Modified: codeflash/languages/javascript/support.py (line 257-266)
  • Added: tests/languages/javascript/test_false_positive_discovery.py (2 regression tests)

Testing

  • ✅ All 70 JavaScript tests pass
  • ✅ New regression tests pass (verify fix works)
  • ✅ Linting/type checks pass (uv run prek)

Impact

  • Eliminates false "Failed to instrument test file" warnings
  • Improves test discovery accuracy
  • Prevents wasted instrumentation attempts on irrelevant test files

Issue: Test discovery incorrectly matched test files with source functions
when the function name appeared anywhere in the test file, including in
mocks, comments, or unrelated code. This caused 'Failed to instrument test
file' errors.

Root cause: In javascript/support.py line 259, naive substring matching
(func.function_name in source) matched function names even when they were
only mentioned in mocks like:
  vi.mock('./file.js', () => ({ funcName: ... }))

Example: Function parseRestartRequestParams from restart-request.ts was
wrongly matched with update.test.ts because the test file mocked it.

Fix: Removed substring matching, now only matches explicitly imported
functions. This is more reliable and avoids false positives.

Trace ID: 0b575a96-62a8-4910-b163-1ad10e60ba79

Changes:
- Removed naive substring check in discover_tests()
- Only match functions that are explicitly imported
- Added regression tests (2 test cases)

Testing:
- All 70 JavaScript tests pass
- New tests verify fix works correctly
- Linting/type checks pass (uv run prek)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@CLAassistant
Copy link
Copy Markdown

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Codeflash Bot seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.

@claude
Copy link
Copy Markdown
Contributor

claude bot commented Apr 2, 2026

Claude finished @mohammedahmed18's task in 3m 33s —— View job


PR Review

  • Triage PR size — SMALL (30 lines production code changed in support.py, +109 test lines)
  • Lint and typecheck — prek passes (ruff only); mypy reports errors in the test file (see below)
  • Resolve stale review threads — none found
  • Code review
  • Duplicate detection
  • Bot PR check

Prek Checks

uv run prek run --from-ref origin/main passes cleanly (ruff check + format both pass). No auto-fixes needed.

mypy reports errors on the test file for invalid constructor kwargs — details below.


Code Review

codeflash/languages/javascript/support.py — The core fix is correct. Removing func.function_name in source eliminates the substring-matching false positives. The new handling for default imports, namespace imports (with member-access regex), aliases, and class name → qualified name mapping all look sensible.

tests/languages/javascript/test_false_positive_discovery.py — The test file passes at runtime (confirmed), but it uses several constructor arguments that don't exist on FunctionToOptimize:

source_function = FunctionToOptimize(
    qualified_name="parseRestartRequestParams",          # ❌ not a constructor param — it's a @property
    function_name="parseRestartRequestParams",
    file_path=test_root / "restart-request.ts",
    starting_line=1,
    ending_line=10,
    function_signature="",                               # ❌ not a constructor param
    code_position=CodePosition(line_no=1, col_no=0),    # ❌ not a constructor param
    file_path_relative_to_project_root="restart-request.ts",  # ❌ not a constructor param
)

Pydantic silently drops the unknown kwargs (no runtime error), but CodePosition is imported and never actually used. The correct instantiation only needs function_name and file_path:

source_function = FunctionToOptimize(
    function_name="parseRestartRequestParams",
    file_path=test_root / "restart-request.ts",
)

Fix this →


Duplicate Detection

No duplicates detected. The discover_tests changes are JS-specific and the logic in languages/python/support.py uses a completely different (Python AST-based) approach.


Bot PR Check

One open codeflash-ai[bot] PR found: #1954 ("⚡️ Speed up TestResults.total_passed_runtime by 20%"). CI is fully passing but the PR has merge conflicts. It was created 2026-04-01 (2 days ago) — leaving open per the <3-day policy.


Saga4
Saga4 previously approved these changes Apr 3, 2026
mohammedahmed18 and others added 2 commits April 3, 2026 13:32
…S test discovery

The index-only approach missed tests for class methods (imported by class name),
aliased imports (only alias was tracked), and namespace imports (e.g. math.calculate).
Adds class_name→methods index, tracks both original+alias names, and extracts
namespace member access via regex.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@mohammedahmed18 mohammedahmed18 requested a review from Saga4 April 3, 2026 12:10
@Saga4 Saga4 merged commit a756701 into main Apr 3, 2026
30 of 36 checks passed
@Saga4 Saga4 deleted the fix/false-positive-test-discovery branch April 3, 2026 12:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants