⚡️ Speed up method TestResults.timing_coefficient_of_variation by 338% in PR #1949 (cf-1082-benchmark-noise-floor) #1955

Merged
claude[bot] merged 2 commits into cf-1082-benchmark-noise-floor from codeflash/optimize-pr1949-2026-04-01T17.51.09
Apr 1, 2026

@codeflash-ai codeflash-ai bot commented Apr 1, 2026

⚡️ This pull request contains optimizations for PR #1949

If you approve this dependent PR, these changes will be merged into the original PR branch cf-1082-benchmark-noise-floor.

This PR will be automatically closed if the original PR is merged.


📄 338% (3.38x) speedup for TestResults.timing_coefficient_of_variation in codeflash/models/models.py

⏱️ Runtime: 245 microseconds → 55.8 microseconds (best of 250 runs)

📝 Explanation and details

The hot-path `timing_coefficient_of_variation()` was replaced with Welford's single-pass algorithm to compute sample standard deviation and mean in one traversal instead of calling `statistics.mean()` and `statistics.stdev()` separately (which each iterate the list). Line profiler shows the original's `statistics.stdev()` consumed 47.6% of function runtime; the new `_compute_sample_cv` cuts that to 16.2% by eliminating redundant passes and reducing overhead from Python's general-purpose statistics module. Overall runtime drops 77% (245 µs → 55.8 µs), a key speedup in `process_single_candidate` where this method gates candidate evaluation.
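A minimal sketch of the single-pass idea, for illustration only. The merged helper itself is not shown on this page, so the name `welford_sample_cv`, its signature, and the exact placement of the guards are assumptions based on the description above.

```python
import math


def welford_sample_cv(runtimes):
    """Hypothetical helper: sample coefficient of variation in one pass.

    Uses Welford's online update, so the mean and the sum of squared
    deviations are accumulated together instead of in separate traversals.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean

    for x in runtimes:
        n += 1
        delta = x - mean          # deviation from the old mean
        mean += delta / n         # update the running mean
        m2 += delta * (x - mean)  # old deviation times new deviation

    # A sample standard deviation needs at least two data points.
    if n < 2:
        return 0.0
    # Avoid dividing by a zero mean.
    if mean == 0.0:
        return 0.0

    sample_variance = m2 / (n - 1)  # Bessel's correction (n - 1)
    if sample_variance <= 0.0:
        return 0.0
    return math.sqrt(sample_variance) / mean
```

For example, `welford_sample_cv([100, 110, 90])` has mean 100 and sample standard deviation 10, giving a coefficient of variation of about 0.1.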

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 76 Passed |
| 🌀 Generated Regression Tests | 4 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 80.0% |

⚙️ Existing Unit Tests

| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| test_critic.py::test_timing_coefficient_of_variation | 85.3μs | 16.4μs | 419%✅ |
| test_critic.py::test_timing_cv_multi_test_case | 123μs | 16.3μs | 656%✅ |

🌀 Generated Regression Tests

# imports
from pathlib import Path

from codeflash.models.models import FunctionTestInvocation, InvocationId, TestResults
from codeflash.models.test_type import TestType


def test_timing_coefficient_of_variation_empty_test_results():
    test_results = TestResults()
    assert test_results.timing_coefficient_of_variation() == 0.0  # 1.57μs -> 1.57μs (0.000% faster)


def test_timing_coefficient_of_variation_only_none_runtimes():
    invocation_id1 = InvocationId(
        test_module_path="test_module",
        test_class_name=None,
        test_function_name="test_test1",
        function_getting_tested="func_under_test",
        iteration_id="1",
    )
    inv1 = FunctionTestInvocation(
        id=invocation_id1,
        loop_index=1,
        test_type=TestType.GENERATED_REGRESSION,
        verification_type="function_call",
        test_framework="pytest",
        runtime=None,
        did_pass=True,
        file_name=Path("test.py"),
        return_value=42,
        timed_out=False,
    )
    invocation_id2 = InvocationId(
        test_module_path="test_module",
        test_class_name=None,
        test_function_name="test_test2",
        function_getting_tested="func_under_test",
        iteration_id="1",
    )
    inv2 = FunctionTestInvocation(
        id=invocation_id2,
        loop_index=1,
        test_type=TestType.GENERATED_REGRESSION,
        verification_type="function_call",
        test_framework="pytest",
        runtime=None,
        did_pass=True,
        file_name=Path("test.py"),
        return_value=42,
        timed_out=False,
    )
    test_results = TestResults(test_results=[inv1, inv2])
    assert test_results.timing_coefficient_of_variation() == 0.0  # 17.1μs -> 17.3μs (0.875% slower)

To edit these changes, `git checkout codeflash/optimize-pr1949-2026-04-01T17.51.09` and push.


@codeflash-ai codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels Apr 1, 2026
claude bot commented Apr 1, 2026

Claude finished @codeflash-ai[bot]'s task in 5m 8s


PR Review Summary

Prek Checks

✅ Passes. No issues introduced by this PR. (14 pre-existing mypy errors in models.py unrelated to these changes.)

Code Review

Correctness: The Welford single-pass algorithm is implemented correctly:

  • `n < 2` guard mirrors the original `len(runtimes) < 2` check
  • `mean == 0.0` guard mirrors the original `mean == 0` check
  • Uses `m2 / (n - 1)` (sample variance), consistent with `statistics.stdev` (which also uses n-1)
  • The `sample_variance <= 0.0` guard correctly handles floating-point rounding artefacts
  • `statistics.median` is still used for the final aggregation (the local `import statistics` at line 992 is still needed)
  • `import math` added at top-level for `math.sqrt` (correct)
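
For reference, the guards listed above suggest a two-pass baseline roughly like the following. This is a reconstruction from the review notes, not the original code from `models.py`; the function name `two_pass_sample_cv` is hypothetical.

```python
import statistics


def two_pass_sample_cv(runtimes):
    """Hypothetical baseline: coefficient of variation via the statistics module.

    statistics.mean and statistics.stdev each traverse the data, which is
    the redundancy a single-pass replacement removes.
    """
    # statistics.stdev raises StatisticsError for fewer than two points.
    if len(runtimes) < 2:
        return 0.0
    mean = statistics.mean(runtimes)
    # Avoid dividing by a zero mean.
    if mean == 0:
        return 0.0
    # statistics.stdev uses the n - 1 (sample) denominator.
    return statistics.stdev(runtimes) / mean
```

Any single-pass replacement should agree with this baseline up to floating-point tolerance, which is what the existing unit tests in the report exercise.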

Style fix applied: _compute_sample_cv was renamed to compute_sample_cv per the project naming convention (CLAUDE.md: "NEVER use leading underscores"). Committed and pushed.

No bugs, security issues, or breaking API changes found.

Duplicate Detection

No duplicates detected. compute_sample_cv is a unique local helper.

Last updated: 2026-04-01T17:57Z

@claude claude bot merged commit fc9a6b2 into cf-1082-benchmark-noise-floor Apr 1, 2026
24 of 26 checks passed
@claude claude bot deleted the codeflash/optimize-pr1949-2026-04-01T17.51.09 branch April 1, 2026 18:12