fix: reject Java optimizations with unused additions and unchanged target method by mashraf-222 · Pull Request #1947 · codeflash-ai/codeflash

mashraf-222 · 2026-04-01T15:54:41Z

Problem

The AI optimizer sometimes generates "optimizations" that add new fields or helper methods to a Java class without changing the target method at all. Because benchmark noise produces small timing variations, these fake optimizations pass the speedup critic and create PRs with no real improvement.

Example: 4 commons-lang PRs each added private static final Supplier<String> NULL_SUPPLIER = Suppliers.nul(); but the target methods (getJavaAwtHeadless, getJavaIoTmpdir, etc.) were never modified to use it — yet reported 7-151% speedups.

Root Cause

replace_function() in replacement.py accepts any optimization that changes the file, even if the target method body is identical to the original. The dedup check compares the entire candidate (function + helpers/fields), so adding a new field makes it "different" from the original, bypassing the identity check.

Fix

Added _has_unused_additions() in replacement.py that:

Compares the target method body before and after optimization (whitespace-normalized)
If unchanged AND new fields/helpers exist, checks if any addition identifiers are referenced in the target method body
Rejects the candidate if additions are unreferenced, returning the original source unchanged

This causes replace_function_definitions_for_language() to return False (no update), which skips the candidate.

Validation

Unit tests: 37/37 pass (5 new + 32 existing, zero regressions)
E2E: Fibonacci optimization ran without false rejections — no "unreferenced" warnings in logs
Session: /home/ubuntu/e2e-sessions/2026-04-01_15-45_cf1081-unused-additions/

Test Coverage

New TestUnusedAdditionsRejection class with 5 tests:

test_unchanged_method_with_unused_field_rejected — unchanged method + unused field → rejected
test_unchanged_method_with_unused_helper_rejected — unchanged method + unused helper → rejected
test_changed_method_with_used_field_accepted — changed method + used field → accepted
test_changed_method_without_additions_accepted — normal optimization → accepted
test_unchanged_method_with_used_helper_accepted — method uses new helper → accepted

Closes CF-1081

…rget method Adds a wiring check in replace_function() that detects when the AI generates "optimizations" adding fields/helpers that the target method never references. Previously these passed through because benchmark noise produced fake speedups. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

claude · 2026-04-01T15:55:31Z

Claude finished @mashraf-222's task in 2m 32s —— View job

PR Review Summary

Triage PR size — LARGE (71 lines production code, 237 lines tests)
Run lint/typecheck — clean (prek passed, mypy no issues)
Resolve stale threads — none found
Code review
Duplicate detection
Post summary

Prek Checks

All clean — ruff check, ruff format, and mypy pass with no issues.

Code Review

Bug: logger.info hardcodes "target" instead of the method name (replacement.py:207)

logger.info(
    "Rejecting optimization: target method '%s' is unchanged and new additions %s are unreferenced.",
    "target",       # ← always logs "target", not the actual method name
    addition_names,
)

The function _has_unused_additions doesn't receive func_name, so it can't log it. The logging is also misclassified — a rejection should be logger.warning, not logger.info (the call site at line 526–530 does use logger.warning, making the inner log misleading). Consider either passing func_name into _has_unused_additions or removing the inner log and relying solely on the outer logger.warning. Fix this →

Misleading test name (test_replacement.py)

test_unchanged_method_with_used_helper_accepted — the method in this test IS changed (return a + b → return fastAdd(a, b)). The test name says "unchanged method" but the method is changed. This could confuse future readers. The test itself is valid and exercises the right path (method changes + helper referenced), just the name is wrong.

modified_constructors not covered by the unused-additions check

_has_unused_additions only checks new_fields, helpers_before_target, and helpers_after_target. If the AI adds a constructor modification but leaves the target method unchanged, it won't be caught. This is probably an acceptable gap (constructor rewrites are less likely to be "fake"), but worth noting for completeness.

Logic correctness: the check is sound for the described cases

The identity check (whitespace-normalized comparison), the word-boundary regex search, and the find_fields/find_methods parsing via a dummy class wrapper are all correct. The placement of the check in replace_function — after target_method is found but before any mutations — is appropriate.

Duplicate Detection

No duplicates detected. Python and JavaScript language modules have no equivalent replacement.py — the unused-additions guard is Java-specific and appropriately placed in languages/java/replacement.py.

Test Coverage

All 5 new tests pass. The test suite covers the key matrix:

Scenario	Expected	Test
Unchanged method + unused field	Rejected	✅
Unchanged method + unused helper	Rejected	✅
Changed method + used field	Accepted	✅
Changed method, no additions	Accepted	✅
Changed method + referenced helper	Accepted	✅

One gap: no test for modified_constructors — if a constructor is modified with the target method unchanged, the check doesn't fire. Low priority but worth considering.

Last updated: 2026-04-01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: reject Java optimizations with unused additions and unchanged target method#1947

fix: reject Java optimizations with unused additions and unchanged target method#1947
mashraf-222 wants to merge 1 commit intomainfrom
cf-1081-reject-unused-additions

mashraf-222 commented Apr 1, 2026

Uh oh!

claude bot commented Apr 1, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mashraf-222 commented Apr 1, 2026

Problem

Root Cause

Fix

Validation

Test Coverage

Uh oh!

claude bot commented Apr 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Review Summary

Prek Checks

Code Review

Duplicate Detection

Test Coverage

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

claude bot commented Apr 1, 2026 •

edited

Loading