
perf(pm): add hyperfine --warmup 1 to phase-bench #2908

Merged
fireairforce merged 1 commit into next from chore/phases-bench-warmup on May 7, 2026

Conversation

@elrrrrrrr
Contributor

Why

The phases bench iterates PMs in fixed order (`utoo, utoo-next, utoo-npm, bun` per `PM_LIST`) within each phase. The first PM pays the cold-network tax — DNS resolver miss, TLS session ticket cold, npm CDN edge POP unpopulated — and `--runs 3` averages that cold iteration into the wall mean. PMs that run later inherit the warm state for free.

A standalone smoke test (`pm-bench-pcap.sh` from #2906, identical-binary run on `chore/pcap-install-phase`) confirmed the magnitude: the same utoo binary, run twice back-to-back, showed +3s wall and +267 MB pcap delta on `p3_install` between the first and second run — pure CDN/DNS/TLS warm-up effect, no code difference.

Folded through `--runs 3` averaging, that's roughly +1s on the first PM's p0 mean — the same magnitude as the per-PR deltas on #2903 / #2904 / #2905 (TTY-gate +0.98, streaming +0.49, zero-copy +1.18). Those means are currently indistinguishable from ordering bias.
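The folding arithmetic can be checked directly: a +3s penalty on one of three averaged runs shifts the mean by +1s. A minimal sketch (the warm-run wall time below is illustrative, not a number from the bench; only the +3s penalty comes from the smoke test):

```shell
# How a one-off cold run folds into a --runs 3 mean.
warm=6.0          # assumed warm-run wall time in seconds (illustrative)
cold_penalty=3.0  # first-run penalty observed in the pm-bench-pcap.sh smoke test

# Mean over (cold, warm, warm) vs. the true warm mean.
biased_mean=$(awk -v w="$warm" -v p="$cold_penalty" \
  'BEGIN { printf "%.2f", ((w + p) + w + w) / 3 }')
bias=$(awk -v b="$biased_mean" -v w="$warm" \
  'BEGIN { printf "%.2f", b - w }')
echo "biased mean ${biased_mean}s vs warm mean ${warm}s -> +${bias}s bias"
```

With `--runs 3`, the penalty divides by exactly the run count, which is why a +3s cold tax surfaces as roughly +1s on the reported mean.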

Change

```diff
 if ! hyperfine \
   --runs "$RUNS" \
+  --warmup 1 \
   --prepare "bash $prep_script" \
```

`hyperfine --warmup 1` runs one untimed iteration through `--prepare` before the timed runs start, so each PM enters its measurement window with DNS / TLS / CDN already warm. Cross-PM ordering bias in the timed window collapses.
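As a sanity check on the argument shape, the invocation can be assembled and dry-run printed without hyperfine installed (variable names mirror the script; the path and the benchmarked command string are illustrative, not the script's real values):

```shell
# Build the hyperfine argument vector the way the bench script does,
# then print it instead of executing it.
RUNS=3
prep_script="/tmp/prep.sh"   # illustrative path

cmd=(hyperfine --runs "$RUNS" --warmup 1 --prepare "bash $prep_script" 'utoo install')
printf '%s\n' "${cmd[*]}"
```

Note that `--warmup` iterations also run `--prepare` first, so the warmup exercises the same prepared state as the timed runs.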

Cost: one extra iteration per PM × phase (≈ 33% bench job wall), well within tolerance.

Smoke test

PR is benchmark-labeled. Both `utoo` (this branch) and `utoo-next` (origin/next) compile from the same Rust code on this PR — only the bench script differs — so any `utoo-vs-utoo-next` p0 mean delta in the resulting comment is the residual ordering bias post-warmup. Target is ~0s.
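Under that identical-binary setup, the residual could be checked post hoc with a simple delta-vs-noise-floor test. A sketch (the two means and the noise floor are illustrative assumptions, not thresholds defined by this PR):

```shell
# Flag a utoo vs utoo-next mean delta that exceeds an assumed noise floor.
utoo_mean=8.69        # illustrative p0 wall mean in seconds
utoo_next_mean=8.15   # illustrative p0 wall mean in seconds
noise_floor=0.60      # assumed acceptable residual, not defined by the PR

verdict=$(awk -v a="$utoo_mean" -v b="$utoo_next_mean" -v f="$noise_floor" 'BEGIN {
  d = a - b; if (d < 0) d = -d
  printf "delta %.2fs: %s", d, (d <= f ? "within noise" : "residual ordering bias")
}')
echo "$verdict"
```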

Companion

🤖 Generated with Claude Code

The bench iterates PMs in fixed order (`utoo, utoo-next, utoo-npm,
bun` per `PM_LIST`) within each phase. The first PM pays the
cold-network tax — DNS resolver miss, TLS session ticket cold, npm
CDN edge POP unpopulated — and `--runs 3` averages that cold
iteration into the wall mean. PMs that run later inherit the warm
state for free.

A standalone smoke test confirmed the magnitude: the same utoo
binary, run twice back-to-back through `pm-bench-pcap.sh`, showed
~3s wall and 270 MB pcap-size delta on `p3_install` between the
first and second run, attributed to CDN/DNS/TLS warm-up. Folded
through `--runs 3` averaging that's roughly +1s on the first PM's
mean — the same magnitude as the per-PR p0 deltas observed on
#2903 / #2904 / #2905, which makes those mean-deltas indistinguishable
from ordering bias.

`--warmup 1` makes hyperfine run one untimed iteration through
`--prepare` before the timed runs start, so each PM enters its
measurement window with the network path warmed independently.
The cross-PM ordering bias in the later timed measurements collapses
because every PM starts from a primed-network state (the first
PM's warmup primes the runner; subsequent PMs' warmups cost little
extra while keeping every PM's measurement window the same shape).

This costs one extra iteration per PM × phase (≈ 33% wall increase
on the bench job, well within tolerance), and is independent of
the script's PM-ordering question — that one we'll address
separately with round-robin if `--warmup` alone doesn't close the
gap.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
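
The round-robin fallback mentioned at the end of the commit message could be sketched like this (the rotation scheme is an assumption; only the `PM_LIST` names come from the script):

```shell
# Rotate PM_LIST by one position per phase so no PM always runs first.
PM_LIST=(utoo utoo-next utoo-npm bun)
for phase in 0 1 2 3; do
  rot=$(( phase % ${#PM_LIST[@]} ))
  order=( "${PM_LIST[@]:rot}" "${PM_LIST[@]:0:rot}" )
  echo "phase $phase: ${order[*]}"
done
```

Each PM then pays the first-slot position once across the four phases, so any residual warm-state advantage averages out instead of always favoring the same PM.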
@elrrrrrrr elrrrrrrr added benchmark Run pm-bench on PR A-Pkg Manager Area: Package Manager labels May 7, 2026

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a warmup phase to the hyperfine benchmarking command in bench/pm-bench-phases.sh to mitigate network latency bias during performance testing. Feedback suggests shortening the extensive inline comment for better readability, as the detailed rationale is already captured in the commit history.

Comment thread: bench/pm-bench-phases.sh

github-actions Bot commented May 7, 2026

📊 pm-bench-phases · 05b1c9a · linux (ubuntu-latest)

Workflow run — ant-design

PMs: utoo (this branch) · utoo-npm (latest published) · bun (latest)

npmjs.org

p0_full_cold

| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 9.40s | 0.41s | 10.06s | 10.02s | 740M | 339.3K |
| utoo-next | 8.15s | 0.49s | 10.76s | 12.23s | 1.32G | 185.9K |
| utoo-npm | 8.27s | 0.42s | 10.76s | 12.29s | 1.41G | 192.7K |
| utoo | 8.69s | 0.75s | 10.78s | 12.56s | 1.47G | 175.9K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 14.2K | 16.3K | 1.17G | 6M | 1.84G | 1.72G | 1M |
| utoo-next | 117.9K | 80.7K | 1.14G | 4M | 1.68G | 1.68G | 2M |
| utoo-npm | 124.5K | 85.7K | 1.14G | 4M | 1.68G | 1.68G | 2M |
| utoo | 133.4K | 87.0K | 1.14G | 5M | 1.68G | 1.68G | 2M |

p1_resolve

| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 3.31s | 2.53s | 4.00s | 1.02s | 505M | 171.7K |
| utoo-next | 3.17s | 0.13s | 5.38s | 1.85s | 600M | 78.8K |
| utoo-npm | 3.01s | 0.05s | 5.29s | 1.86s | 597M | 83.5K |
| utoo | 3.34s | 0.53s | 5.31s | 1.90s | 599M | 89.5K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 7.6K | 4.5K | 202M | 3M | 105M | - | 1M |
| utoo-next | 67.1K | 113.2K | 198M | 2M | 7M | 3M | 2M |
| utoo-npm | 65.7K | 104.9K | 198M | 2M | 7M | 3M | 2M |
| utoo | 65.6K | 105.1K | 198M | 2M | 7M | 3M | 2M |

p3_cold_install

| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 6.65s | 0.28s | 6.08s | 9.73s | 632M | 212.7K |
| utoo-next | 7.37s | 1.80s | 5.28s | 11.27s | 963M | 121.9K |
| utoo-npm | 7.84s | 2.93s | 5.25s | 11.25s | 801M | 114.9K |
| utoo | 5.97s | 0.14s | 5.18s | 10.88s | 898M | 124.3K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 4.2K | 6.0K | 994M | 3M | 1.74G | 1.74G | 1M |
| utoo-next | 108.5K | 70.4K | 964M | 3M | 1.67G | 1.67G | 2M |
| utoo-npm | 110.3K | 72.0K | 964M | 3M | 1.67G | 1.67G | 2M |
| utoo | 88.2K | 62.3K | 964M | 2M | 1.67G | 1.67G | 2M |

p4_warm_link

| PM | wall | ±σ | user | sys | RSS | pgMinor |
|---|---|---|---|---|---|---|
| bun | 3.28s | 0.09s | 0.20s | 2.31s | 135M | 31.7K |
| utoo-next | 2.19s | 0.21s | 0.51s | 3.73s | 80M | 18.5K |
| utoo-npm | 2.06s | 0.15s | 0.51s | 3.80s | 84M | 19.4K |
| utoo | 2.21s | 0.04s | 0.49s | 3.80s | 81M | 18.6K |

| PM | vCtx | iCtx | netRX | netTX | cache | node_mod | lock |
|---|---|---|---|---|---|---|---|
| bun | 303 | 23 | 5M | 14K | 1.88G | 1.72G | 1M |
| utoo-next | 40.3K | 18.0K | 320K | 7K | 1.68G | 1.68G | 2M |
| utoo-npm | 45.9K | 20.9K | 335K | 27K | 1.68G | 1.68G | 2M |
| utoo | 42.5K | 19.7K | 320K | 7K | 1.68G | 1.68G | 2M |

npmmirror.com: no output captured.

@fireairforce fireairforce merged commit 662e048 into next May 7, 2026
32 of 46 checks passed
@fireairforce fireairforce deleted the chore/phases-bench-warmup branch May 7, 2026 07:19