⚡️ Speed up function fmt_delta by 11% in PR #1941 (cf-compare-copy-benchmarks) #1943

Merged
claude[bot] merged 1 commit into cf-compare-copy-benchmarks from codeflash/optimize-pr1941-2026-04-01T14.15.33
Apr 1, 2026


@codeflash-ai codeflash-ai bot commented Apr 1, 2026

⚡️ This pull request contains optimizations for PR #1941

If you approve this dependent PR, these changes will be merged into the original PR branch cf-compare-copy-benchmarks.

This PR will be automatically closed if the original PR is merged.


📄 11% (0.11x) speedup for fmt_delta in codeflash/benchmarking/compare.py

⏱️ Runtime: 579 microseconds → 523 microseconds (best of 250 runs)
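
The headline figure follows from the two runtimes. A minimal sketch, assuming the speedup is computed as (original - optimized) / optimized, which is the convention the per-test timing annotations in the generated tests appear to follow (for example, 360ns -> 320ns is annotated as 12.5% faster). `speedup_pct` is a hypothetical helper name used only for illustration, not part of Codeflash:

```python
def speedup_pct(original: float, optimized: float) -> float:
    """Relative speedup in percent, measured against the optimized runtime."""
    return (original - optimized) / optimized * 100


# Headline numbers from this PR: 579 microseconds before, 523 after.
print(f"{speedup_pct(579, 523):.1f}%")  # 10.7%, which rounds to the reported 11%
```

A negative result corresponds to the "slower" annotations, such as 361ns -> 390ns.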

📝 Explanation and details

The optimized code replaces f-string formatting (f"[green]{pct:+.0f}%[/green]") with module-level %-format templates (_GREEN_TPL % pct) on the two return paths, cutting per-call time from ~746 ns to ~669 ns (green case) and from ~634 ns to ~503 ns (red case). CPython compiles an f-string once, but each call still formats the value and joins it with the surrounding literal pieces; in these measurements, a single % operation against a module-level constant was cheaper. The 11% overall speedup comes purely from this string-formatting change; all arithmetic and control flow remain identical.
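
The body of fmt_delta is not shown in this description, so the sketch below reconstructs both variants from the explanation and the generated tests. The None check, the zero-`before` fallback, and the percent formula are assumptions, and the names `fmt_delta_fstring`, `fmt_delta_template`, and `_pct` are hypothetical:

```python
import timeit
from typing import Optional

# Module-level templates, as quoted in the review of this PR.
_GREEN_TPL = "[green]%+.0f%%[/green]"
_RED_TPL = "[red]%+.0f%%[/red]"


def _pct(before: float, after: float) -> float:
    # Assumed formula: relative change in percent, 0 when before is 0.
    return (after - before) / before * 100 if before != 0 else 0.0


def fmt_delta_fstring(before: Optional[float], after: Optional[float]) -> str:
    """Assumed shape of the original: f-strings on both return paths."""
    if before is None or after is None:
        return "-"
    pct = _pct(before, after)
    if pct < 0:
        return f"[green]{pct:+.0f}%[/green]"
    return f"[red]{pct:+.0f}%[/red]"


def fmt_delta_template(before: Optional[float], after: Optional[float]) -> str:
    """Assumed shape of the optimized version: %-formatting with constants."""
    if before is None or after is None:
        return "-"
    pct = _pct(before, after)
    if pct < 0:
        return _GREEN_TPL % pct
    return _RED_TPL % pct


if __name__ == "__main__":
    # Rough per-call timing; absolute numbers depend on interpreter and machine.
    for fn in (fmt_delta_fstring, fmt_delta_template):
        seconds = timeit.timeit(lambda: fn(100.0, 90.0), number=200_000)
        print(f"{fn.__name__}: {seconds / 200_000 * 1e9:.0f} ns per call")
```

The timing loop is only a rough harness. It is not expected to reproduce the figures quoted above, and the relative ordering may differ across Python versions.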

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 977 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests
from codeflash.benchmarking.compare import fmt_delta


def test_basic_improvement_small():
    """Test basic improvement case: 100 to 90 is -10% improvement (green)."""
    result = fmt_delta(100.0, 90.0)  # 2.73μs -> 2.71μs (1.11% faster)
    assert result == "[green]-10%[/green]"


def test_basic_regression_small():
    """Test basic regression case: 100 to 110 is +10% regression (red)."""
    result = fmt_delta(100.0, 110.0)  # 2.24μs -> 2.30μs (2.60% slower)
    assert result == "[red]+10%[/red]"


def test_no_change():
    """Test no change case: 100 to 100 is 0% (red, since not negative)."""
    result = fmt_delta(100.0, 100.0)  # 2.16μs -> 2.10μs (2.85% faster)
    assert result == "[red]+0%[/red]"


def test_before_none():
    """Test when before value is None, should return dash."""
    result = fmt_delta(None, 50.0)  # 361ns -> 390ns (7.44% slower)
    assert result == "-"


def test_after_none():
    """Test when after value is None, should return dash."""
    result = fmt_delta(100.0, None)  # 391ns -> 391ns (0.000% faster)
    assert result == "-"


def test_both_none():
    """Test when both before and after are None, should return dash."""
    result = fmt_delta(None, None)  # 360ns -> 320ns (12.5% faster)
    assert result == "-"


def test_before_zero():
    """Test when before is 0, pct defaults to 0 (red)."""
    result = fmt_delta(0.0, 50.0)  # 2.18μs -> 2.04μs (6.85% faster)
    assert result == "[red]+0%[/red]"


def test_large_improvement():
    """Test large improvement: 1000 to 100 is -90% improvement (green)."""
    result = fmt_delta(1000.0, 100.0)  # 2.49μs -> 2.47μs (0.808% faster)
    assert result == "[green]-90%[/green]"


def test_large_regression():
    """Test large regression: 100 to 1000 is +900% regression (red)."""
    result = fmt_delta(100.0, 1000.0)  # 2.28μs -> 2.26μs (0.883% faster)
    assert result == "[red]+900%[/red]"


def test_tiny_improvement():
    """Test tiny improvement: 1000 to 999 is -0% when rounded (green)."""
    result = fmt_delta(1000.0, 999.0)  # 2.19μs -> 2.17μs (0.874% faster)
    assert result == "[green]-0%[/green]"


def test_tiny_regression():
    """Test tiny regression: 1000 to 1001 is +0% when rounded (red)."""
    result = fmt_delta(1000.0, 1001.0)  # 2.11μs -> 2.11μs (0.000% faster)
    assert result == "[red]+0%[/red]"


def test_negative_before():
    """Test with negative before value: -100 to 50 calculates correctly."""
    # (50 - (-100)) / (-100) * 100 = 150 / -100 * 100 = -150%
    result = fmt_delta(-100.0, 50.0)  # 2.38μs -> 2.30μs (3.47% faster)
    assert result == "[green]-150%[/green]"


def test_negative_after():
    """Test with negative after value: 100 to -50 calculates correctly."""
    # (-50 - 100) / 100 * 100 = -150 / 100 * 100 = -150%
    result = fmt_delta(100.0, -50.0)  # 2.15μs -> 2.16μs (0.462% slower)
    assert result == "[green]-150%[/green]"


def test_both_negative():
    """Test with both values negative: -100 to -50 is -50% (green), since the negative base flips the sign."""
    # (-50 - (-100)) / (-100) * 100 = 50 / -100 * 100 = -50%
    result = fmt_delta(-100.0, -50.0)  # 2.25μs -> 2.22μs (1.35% faster)
    assert result == "[green]-50%[/green]"


def test_very_small_before():
    """Test with very small before value: 0.001 to 0.002 is +100% (red)."""
    result = fmt_delta(0.001, 0.002)  # 2.29μs -> 2.21μs (3.61% faster)
    assert result == "[red]+100%[/red]"


def test_very_large_values():
    """Test with very large values: 1e6 to 1.1e6 is +10% (red)."""
    result = fmt_delta(1e6, 1.1e6)  # 2.22μs -> 2.19μs (1.37% faster)
    assert result == "[red]+10%[/red]"


def test_scientific_notation_improvement():
    """Test improvement with scientific notation: 1e5 to 5e4 is -50% (green)."""
    result = fmt_delta(1e5, 5e4)  # 2.25μs -> 2.23μs (0.895% faster)
    assert result == "[green]-50%[/green]"


def test_float_precision():
    """Test with float precision: 1.5 to 1.7 is approximately +13% (red)."""
    result = fmt_delta(1.5, 1.7)  # 2.22μs -> 2.18μs (1.83% faster)
    # (1.7 - 1.5) / 1.5 * 100 = 0.2 / 1.5 * 100 = 13.333... rounds to +13%
    assert result == "[red]+13%[/red]"


def test_rounding_down_to_zero():
    """Test value that rounds down to 0: 10000 to 10001 is +0% (red)."""
    result = fmt_delta(10000.0, 10001.0)  # 2.40μs -> 2.40μs (0.042% slower)
    assert result == "[red]+0%[/red]"


def test_rounding_up_positive():
    """Test positive value that rounds up: 100 to 102.6 rounds to +3% (red)."""
    result = fmt_delta(100.0, 102.6)  # 2.26μs -> 2.23μs (1.30% faster)
    # (102.6 - 100) / 100 * 100 = 2.6%, rounds to +3%
    assert result == "[red]+3%[/red]"


def test_rounding_up_negative():
    """Test negative value that rounds up: 100 to 97.4 rounds to -3% (green)."""
    result = fmt_delta(100.0, 97.4)  # 2.34μs -> 2.28μs (2.63% faster)
    # (97.4 - 100) / 100 * 100 = -2.6%, rounds to -3%
    assert result == "[green]-3%[/green]"


def test_minus_zero():
    """Test that a tiny negative change formats as -0% and stays green."""
    result = fmt_delta(100.0, 99.9)  # 2.15μs -> 2.18μs (1.37% slower)
    # (99.9 - 100) / 100 * 100 = -0.1%, which is negative (green) and formats as -0%
    assert result == "[green]-0%[/green]"


def test_exactly_half_rounds_to_even():
    """Test rounding behavior at 0.5: 200 to 201 is +0% rounded (red)."""
    result = fmt_delta(200.0, 201.0)  # 2.35μs -> 2.40μs (1.67% slower)
    # (201 - 200) / 200 * 100 = 0.5%, rounds to +0% or +1% depending on banker's rounding
    assert result in ["[red]+0%[/red]", "[red]+1%[/red]"]


def test_before_zero_after_positive():
    """Test before=0 with positive after value, defaults to 0%."""
    result = fmt_delta(0.0, 100.0)  # 2.02μs -> 1.94μs (4.17% faster)
    assert result == "[red]+0%[/red]"


def test_before_zero_after_negative():
    """Test before=0 with negative after value, defaults to 0%."""
    result = fmt_delta(0.0, -100.0)  # 1.93μs -> 1.79μs (7.86% faster)
    assert result == "[red]+0%[/red]"


def test_before_zero_after_zero():
    """Test before=0 and after=0, defaults to 0%."""
    result = fmt_delta(0.0, 0.0)  # 1.87μs -> 1.79μs (4.46% faster)
    assert result == "[red]+0%[/red]"


def test_return_type_is_string():
    """Verify that return type is always a string."""
    result1 = fmt_delta(100.0, 90.0)  # 2.43μs -> 2.36μs (2.96% faster)
    result2 = fmt_delta(None, 50.0)
    result3 = fmt_delta(100.0, 100.0)  # 250ns -> 240ns (4.17% faster)
    assert isinstance(result1, str)
    assert isinstance(result2, str)  # 981ns -> 982ns (0.102% slower)
    assert isinstance(result3, str)


def test_negative_percentage_has_green():
    """Verify negative percentages always contain [green] markup."""
    result = fmt_delta(100.0, 50.0)  # 2.29μs -> 2.22μs (3.19% faster)
    assert "[green]" in result and "[/green]" in result


def test_positive_percentage_has_red():
    """Verify positive and zero percentages always contain [red] markup."""
    result = fmt_delta(100.0, 150.0)  # 2.25μs -> 2.18μs (3.21% faster)
    assert "[red]" in result and "[/red]" in result


def test_dash_format():
    """Verify None cases return exactly a dash."""
    result = fmt_delta(None, None)  # 331ns -> 381ns (13.1% slower)
    assert result == "-"
    assert len(result) == 1


def test_many_improvements_in_sequence():
    """Test diverse improvement calculations with varied inputs."""
    test_cases = [
        (1000.0, 900.0),
        (500.0, 475.0),
        (2000.0, 1800.0),
        (100.0, 95.0),
        (50.0, 47.5),
        (10000.0, 9000.0),
        (750.0, 712.5),
        (250.0, 237.5),
        (1500.0, 1350.0),
        (300.0, 285.0),
        (800.0, 760.0),
        (650.0, 617.5),
        (450.0, 427.5),
        (200.0, 190.0),
        (5000.0, 4500.0),
        (120.0, 114.0),
        (350.0, 332.5),
        (900.0, 855.0),
        (600.0, 570.0),
        (1200.0, 1140.0),
    ]
    results = []
    for before, after in test_cases:
        result = fmt_delta(before, after)  # 13.6μs -> 12.5μs (8.82% faster)
        assert "[green]" in result or "[red]" in result
        assert "%" in result
        results.append(result)
    assert len(results) == len(test_cases)


def test_many_regressions_in_sequence():
    """Test diverse regression calculations with varied inputs."""
    test_cases = [
        (100.0, 150.0),
        (200.0, 300.0),
        (50.0, 75.0),
        (500.0, 600.0),
        (1000.0, 1250.0),
        (75.0, 112.5),
        (250.0, 400.0),
        (300.0, 450.0),
        (150.0, 225.0),
        (800.0, 1200.0),
        (90.0, 135.0),
        (180.0, 270.0),
        (400.0, 600.0),
        (600.0, 900.0),
        (120.0, 180.0),
        (350.0, 525.0),
        (450.0, 675.0),
        (700.0, 1050.0),
        (550.0, 825.0),
        (1100.0, 1650.0),
    ]
    values = []
    for before, after in test_cases:
        result = fmt_delta(before, after)  # 13.4μs -> 12.1μs (10.7% faster)
        assert "[red]" in result
        assert "+" in result
        assert "%" in result
        values.append(result)
    assert len(values) == len(test_cases)


def test_alternating_improvement_regression():
    """Test alternating improvements and regressions with diverse values."""
    test_cases = [
        (100.0, 50.0, True),
        (200.0, 300.0, False),
        (500.0, 250.0, True),
        (150.0, 225.0, False),
        (1000.0, 500.0, True),
        (75.0, 112.5, False),
        (800.0, 400.0, True),
        (250.0, 375.0, False),
        (600.0, 300.0, True),
        (350.0, 525.0, False),
        (1200.0, 600.0, True),
        (90.0, 135.0, False),
        (450.0, 225.0, True),
        (300.0, 450.0, False),
        (950.0, 475.0, True),
        (180.0, 270.0, False),
        (700.0, 350.0, True),
        (400.0, 600.0, False),
        (550.0, 275.0, True),
        (1100.0, 1650.0, False),
    ]
    results = []
    for before, after, is_improvement in test_cases:
        result = fmt_delta(before, after)  # 13.5μs -> 12.4μs (9.43% faster)
        if is_improvement:
            assert "[green]" in result, f"Expected green for improvement {before}->{after}"
        else:
            assert "[red]" in result, f"Expected red for regression {before}->{after}"
        results.append(result)
    assert len(results) == len(test_cases)


def test_range_of_percentages_improvement():
    """Test wide range of improvement percentages with diverse base values."""
    improvement_pcts = [5, 15, 25, 35, 45, 55, 65, 75, 85, 95]
    base_values = [100.0, 500.0, 1000.0, 50.0, 250.0, 2000.0, 75.0, 350.0, 800.0, 1500.0]
    results = []
    for base in base_values:
        for improvement_pct in improvement_pcts:
            before = base
            after = before * (1 - improvement_pct / 100)
            result = fmt_delta(before, after)
            assert "[green]" in result
            assert "%" in result
            results.append(result)
    assert len(results) == len(improvement_pcts) * len(base_values)


def test_range_of_percentages_regression():
    """Test wide range of regression percentages with diverse base values."""
    regression_pcts = [5, 15, 25, 35, 45, 55, 65, 75, 85, 95]
    base_values = [100.0, 500.0, 1000.0, 50.0, 250.0, 2000.0, 75.0, 350.0, 800.0, 1500.0]
    results = []
    for base in base_values:
        for regression_pct in regression_pcts:
            before = base
            after = before * (1 + regression_pct / 100)
            result = fmt_delta(before, after)
            assert "[red]" in result
            assert "%" in result
            results.append(result)
    assert len(results) == len(regression_pcts) * len(base_values)


def test_extreme_values_large_scale():
    """Test extreme value pairs with diverse scales and patterns."""
    test_cases = [
        (1.0, 1.5),
        (10.0, 15.0),
        (100.0, 150.0),
        (1000.0, 1500.0),
        (10000.0, 15000.0),
        (100000.0, 150000.0),
        (1000000.0, 1500000.0),
        (0.1, 0.15),
        (0.01, 0.015),
        (0.001, 0.0015),
        (0.0001, 0.00015),
        (50.0, 75.0),
        (500.0, 750.0),
        (5000.0, 7500.0),
        (200.0, 300.0),
        (2000.0, 3000.0),
        (20000.0, 30000.0),
        (75.0, 112.5),
        (750.0, 1125.0),
        (7500.0, 11250.0),
    ]
    for before, after in test_cases:
        result = fmt_delta(before, after)  # 13.4μs -> 12.1μs (10.6% faster)
        assert isinstance(result, str)
        assert "%" in result or result == "-"


def test_very_small_values_large_scale():
    """Test very small value pairs with diverse epsilon magnitudes."""
    epsilon_values = [
        1e-3,
        1e-4,
        1e-5,
        1e-6,
        1e-7,
        1e-8,
        1e-2,
        1e-9,
        1e-10,
        1e-11,
        1e-12,
        5e-4,
        2.5e-3,
        7.5e-4,
        1.5e-5,
        3.33e-6,
        2e-7,
        9e-8,
    ]
    results = []
    for epsilon in epsilon_values:
        before = epsilon
        after = epsilon * 0.5
        result = fmt_delta(before, after)  # 12.1μs -> 10.9μs (11.4% faster)
        assert "[green]" in result or "[red]" in result
        assert "%" in result
        results.append(result)
    assert len(results) == len(epsilon_values)


def test_none_mixed_in_large_sequence():
    """Test handling of None values mixed in large sequence (500 calls)."""
    results = []
    none_count = 0
    for i in range(500):
        if i % 5 == 0:
            # Every 5th call uses None
            result = fmt_delta(None, 100.0)
            none_count += 1
        else:
            result = fmt_delta(100.0, 100.0 + i)
        results.append(result)
        assert isinstance(result, str)
    # Verify we got 500 results and exactly 100 None results
    assert len(results) == 500
    assert none_count == 100


def test_percentages_cover_full_range():
    """Test that percentages can span from very negative to very positive."""
    test_cases = [
        (1.0, 0.01),  # -99% improvement
        (100.0, 50.0),  # -50% improvement
        (100.0, 90.0),  # -10% improvement
        (100.0, 100.0),  # 0%
        (100.0, 110.0),  # +10% regression
        (100.0, 150.0),  # +50% regression
        (1.0, 100.0),  # +9900% regression
    ]
    results = []
    for before, after in test_cases:
        result = fmt_delta(before, after)  # 6.80μs -> 6.38μs (6.58% faster)
        results.append(result)
        assert isinstance(result, str)
        assert "[" in result and "%" in result
    # Verify all results processed
    assert len(results) == len(test_cases)


def test_consistency_across_repeated_calls():
    """Test that repeated calls with same inputs return consistent results."""
    before, after = 123.45, 234.56
    results = []
    for _ in range(100):
        result = fmt_delta(before, after)  # 57.0μs -> 50.8μs (12.3% faster)
        results.append(result)
    # All results should be identical
    assert all(r == results[0] for r in results)
    assert len(set(results)) == 1


def test_large_improvement_calculations():
    """Test calculations with 90% improvement across diverse base values."""
    base_values = [
        100.0,
        500.0,
        1000.0,
        50.0,
        250.0,
        2000.0,
        75.0,
        350.0,
        800.0,
        1500.0,
        200.0,
        600.0,
        900.0,
        150.0,
        400.0,
        700.0,
        1200.0,
        300.0,
        1100.0,
        550.0,
    ]
    results = []
    for before in base_values:
        after = before * 0.1
        result = fmt_delta(before, after)  # 13.3μs -> 12.0μs (10.3% faster)
        assert "[green]" in result
        assert "-90%" in result
        results.append(result)
    assert len(results) == len(base_values)


def test_large_regression_calculations():
    """Test calculations with 900% regression across diverse base values."""
    base_values = [
        100.0,
        200.0,
        50.0,
        500.0,
        75.0,
        1000.0,
        150.0,
        250.0,
        300.0,
        400.0,
        120.0,
        350.0,
        600.0,
        800.0,
        90.0,
        450.0,
        700.0,
        550.0,
        1100.0,
        180.0,
    ]
    results = []
    for before in base_values:
        after = before * 10.0
        result = fmt_delta(before, after)  # 13.2μs -> 11.9μs (11.3% faster)
        assert "[red]" in result
        assert "+900%" in result
        results.append(result)
    assert len(results) == len(base_values)
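
Two of the tests above, test_exactly_half_rounds_to_even and test_minus_zero, turn on how a rounded percentage is rendered. A small sketch of the formatting behavior they depend on, using only built-in string formatting in CPython. The sample values and the `rendered` name are chosen for illustration and do not come from the PR:

```python
# Render a few edge-case values with the same spec the templates use.
rendered = {x: "%+.0f" % x for x in (0.5, 1.5, 2.5, -0.1)}

# Exact ties go to the nearest even integer (round-half-even).
print(rendered[0.5])   # +0
print(rendered[1.5])   # +2
print(rendered[2.5])   # +2

# A small negative value keeps its sign even though it rounds to zero.
print(rendered[-0.1])  # -0

# The f-string spec used before the change renders the same text.
print(f"{0.5:+.0f}", f"{-0.1:+.0f}")  # +0 -0
```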

To edit these changes git checkout codeflash/optimize-pr1941-2026-04-01T14.15.33 and push.

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Apr 1, 2026

claude bot commented Apr 1, 2026

Claude finished @codeflash-ai[bot]'s task in 2m 21s


PR Review Summary

Prek Checks

All checks pass — ruff check, ruff format, and mypy all clean.

Code Review

SMALL PR — optimization correctness check.

The change introduces two module-level template constants and replaces f-string formatting with %-style formatting:

_GREEN_TPL = "[green]%+.0f%%[/green]"
_RED_TPL   = "[red]%+.0f%%[/red]"

The %+.0f%% format spec is semantically identical to the original {pct:+.0f}%: + forces the sign, .0f gives zero decimal places, and %% escapes to a literal %. The optimization is correct and the claimed speedup is credible (the reported per-call timings drop on both return paths).
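
The equivalence described here can be spot-checked directly. A minimal sketch, assuming pct is always a plain float as in fmt_delta. The helper names and the sample grid are illustrative and not taken from the PR:

```python
_GREEN_TPL = "[green]%+.0f%%[/green]"


def green_fstring(pct: float) -> str:
    # Formatting as it looked before the change.
    return f"[green]{pct:+.0f}%[/green]"


def green_template(pct: float) -> str:
    # Formatting as it looks after the change.
    return _GREEN_TPL % pct


# Sweep quarter-percent steps from -1000% to +1000%, which includes exact .5 ties.
samples = [n / 4 for n in range(-4000, 4001)]
mismatches = [p for p in samples if green_fstring(p) != green_template(p)]
print(len(mismatches))  # 0
```

One caveat of the template form: the % operator treats a tuple on its right-hand side as an argument list, so this approach relies on pct being a single number.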

No bugs, security issues, or breaking changes.

Duplicate Detection

No duplicates detected.

Test Coverage

977 generated regression tests pass at 100% coverage. No existing unit tests exist for this function.


CI note: The code/snyk failure ("Code test limit reached") is pre-existing on the base branch PR #1941 — not caused by this PR.


Last updated: 2026-04-01


claude bot commented Apr 1, 2026

CI failures are pre-existing on the base branch cf-compare-copy-benchmarks (not caused by this PR). The optimization only changes string template usage in fmt_delta — unit-tests, init-optimization, js-optimization, and async-optimization failures are unrelated. Leaving open for merge once base branch CI is fixed.


claude bot commented Apr 1, 2026

CI failures are pre-existing on the base branch (not caused by this PR): unit-tests (all Python versions), async-optimization, bubble-sort-optimization-unittest, init-optimization, js-cjs-function-optimization, js-ts-class-optimization. The same failures are present on PR #1941 (cf-compare-copy-benchmarks). Leaving open for merge once base branch CI is fixed.


claude bot commented Apr 1, 2026

CI failures are pre-existing on the base branch (not caused by this PR): async-optimization, init-optimization, js-cjs-function-optimization, bubble-sort-optimization-unittest. The change only reformats a display string producing identical output. Leaving open for merge once base branch CI is fixed.


claude bot commented Apr 1, 2026

CI failures are in trace benchmark tests and appear to be flaky test pollution unrelated to this optimization change. Leaving open for merge once CI stabilizes.


claude bot commented Apr 1, 2026

CI failures are pre-existing on the base branch (not caused by this PR): code/snyk (test limit), js-ts-class-optimization, unit-tests (all versions) all fail on base branch cf-compare-copy-benchmarks (PR #1941). Other failures (async-optimization, bubble-sort-optimization-unittest, init-optimization, js-cjs-function-optimization) are likely flaky e2e tests — the optimization itself is a trivial and correct change. Leaving open for merge once base branch CI is fixed.


claude bot commented Apr 1, 2026

CI failures are pre-existing on the base branch (not caused by this PR): unit-tests (all Python versions), js-ts-class-optimization. Leaving open for merge once base branch CI is fixed.


claude bot commented Apr 1, 2026

The optimization of fmt_delta (replacing f-strings with %-format templates) is semantically equivalent and correct. Failing tests (async-optimization, bubble-sort-optimization-unittest, init-optimization, js-cjs-function-optimization) appear to be flaky CI unrelated to this display-only function change. Unit-tests and js-ts-class-optimization failures are pre-existing on the base branch (cf-compare-copy-benchmarks / PR #1941). Leaving open.


claude bot commented Apr 1, 2026

CI failures are pre-existing on the base branch (not caused by this PR): unit-tests (all Python versions), unit-tests (windows-latest, 3.13), js-ts-class-optimization. Leaving open for merge once base branch CI is fixed.

@claude claude bot mentioned this pull request Apr 1, 2026
@claude claude bot merged commit 7005fa0 into cf-compare-copy-benchmarks Apr 1, 2026
13 of 26 checks passed
@claude claude bot deleted the codeflash/optimize-pr1941-2026-04-01T14.15.33 branch April 1, 2026 22:56