chore(bench): extend pcap to install phase + utoo-next baseline #2906
elrrrrrrr wants to merge 4 commits
Conversation
The pcap-only bench was previously a one-off that captured `p1_resolve` across `utoo` and `bun`, and assumed the project tree was already cloned by `pm-bench-phases.sh` running in the same job. That gave us metadata fan-out, but install-phase regressions (#2902 / #2903 / #2904 / #2905 σ widening on `p0_full_cold`) live in the tarball download path, not in resolve.

This commit makes the pcap bench self-contained and covers both phases for three PMs:

- Self-clone the project if `$PROJECT_DIR` is missing (mirrors `pm-bench-phases.sh`), so this script runs as a standalone CI job.
- Add a `<pm>-install` capture per PM: lock pre-existing, cache + `node_modules` wiped, then `<pm> install`. This is the cold-tarball-download phase where the σ-widening lives.
- Add `utoo-next` as a third PM: built upstream by `build-linux`'s bench-baseline step (now also gated on `pm-bench-pcap`), downloaded via the same artifact path as `bench-phases-linux`. Skipped in local runs where `$UTOO_NEXT_BIN` is unset.

Workflow change:

- `pm-bench-pcap-linux` now downloads the `utoo-next-linux-x64` artifact and exports `UTOO_NEXT_BIN` exactly like `bench-phases-linux` does.
- `Build next branch utoo` and `Upload utoo-next binary` steps in `build-linux` now also fire for `inputs.target == 'pm-bench-pcap'`, not only `pm-bench-phases`.

Outputs in `/tmp/pm-bench-pcap`:

    dns.txt
    utoo-{resolve,install}.{pcap,log}
    utoo-next-{resolve,install}.{pcap,log}   (when UTOO_NEXT_BIN set)
    bun-{resolve,install}.{pcap,log}

Drives the analysis of whether the install hot-path's increased concurrency (FuturesUnordered streaming, zero-copy tar, TTY-gate) saturates outbound TCP and starves the download path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
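For orientation, a hedged sketch of the per-PM lifecycle the commit describes. Helper and variable names (`run_pm_phases`, `capture_one`, `cache_dir`, `PCAP_DIR`, `REGISTRY`) are taken from the review summary and diff hunks below, not verified against the script:

```
# Sketch only: the real script branches per PM (bun resolves via `install --lockfile-only`,
# utoo via `deps`, as the diff hunks below show); names here are assumptions.
run_pm_phases() {
  local pm_name="$1" pm_bin="$2"
  local cache_dir="$PCAP_DIR/${pm_name}-cache"

  # Phase 1: metadata fan-out only, producing the lockfile.
  rm -rf "$cache_dir" node_modules
  capture_one "${pm_name}-resolve" \
    "$pm_bin" install --lockfile-only --registry="$REGISTRY"

  # Phase 2: lock pre-existing, cache + node_modules wiped, full install
  # (the cold tarball-download path where the σ-widening lives).
  rm -rf "$cache_dir" node_modules
  capture_one "${pm_name}-install" \
    "$pm_bin" install --registry="$REGISTRY"
}
```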
Code Review
This pull request refactors the bench/pm-bench-pcap.sh script to support capturing network traces for both the resolve and install phases across different package managers, including a new baseline option for utoo-next. It introduces a run_pm_phases function to manage the lifecycle of these phases and ensures a cold environment by clearing caches. The review feedback recommends using Bash arrays to store common command-line arguments for bun and utoo to improve maintainability and reduce duplication.
BUN_INSTALL_CACHE_DIR="$cache_dir" \
  capture_one "${pm_name}-resolve" \
    "$pm_bin" install --lockfile-only --registry="$REGISTRY"
rm -rf "$cache_dir" node_modules
BUN_INSTALL_CACHE_DIR="$cache_dir" \
  capture_one "${pm_name}-install" \
    "$pm_bin" install --registry="$REGISTRY"
The --registry argument and the BUN_INSTALL_CACHE_DIR environment variable prefix are duplicated for the bun commands. To improve maintainability and reduce redundancy, you could extract the common argument into an array. This will make future changes to arguments easier.
Suggested change:

local bun_args=(--registry="$REGISTRY")
BUN_INSTALL_CACHE_DIR="$cache_dir" \
  capture_one "${pm_name}-resolve" \
    "$pm_bin" install --lockfile-only "${bun_args[@]}"
rm -rf "$cache_dir" node_modules
BUN_INSTALL_CACHE_DIR="$cache_dir" \
  capture_one "${pm_name}-install" \
    "$pm_bin" install "${bun_args[@]}"
capture_one "${pm_name}-resolve" \
  "$pm_bin" deps --registry="$REGISTRY" --cache-dir="$cache_dir"
rm -rf "$cache_dir" node_modules
capture_one "${pm_name}-install" \
  "$pm_bin" install --registry="$REGISTRY" --cache-dir="$cache_dir"
The arguments --registry and --cache-dir are duplicated across the utoo commands. You can extract these common arguments into an array to improve maintainability and reduce redundancy. This makes it easier to modify arguments in the future.
Suggested change:

local utoo_args=(--registry="$REGISTRY" --cache-dir="$cache_dir")
capture_one "${pm_name}-resolve" \
  "$pm_bin" deps "${utoo_args[@]}"
rm -rf "$cache_dir" node_modules
capture_one "${pm_name}-install" \
  "$pm_bin" install "${utoo_args[@]}"
Building on the install-phase pcap capture from the previous commit,
post-process each .pcap with tshark to extract pre-TLS metrics that
directly probe the "install greediness starves download" hypothesis
without needing TLS session-key dumping:
zero_windows   — receive buffer full → server paused. Direct evidence that the app's tokio runtime is not draining the socket fast enough between extracts.
retransmits    — server resent because ACK was late. Indirect evidence of receive-side stall.
duplicate_acks — receiver re-sent ACK because it perceived a gap.
stream_gap_*   — inter-packet gap distribution per TCP stream (p50 / p99 / max in microseconds). p99 / max measure the longest pause an active connection experienced — if utoo shows multi-hundred-ms gaps where utoo-next shows tens of ms, install is freezing the runtime mid-download.
Per-capture summaries land at $PCAP_DIR/<name>.summary.json. They are
aggregated into a top-level summary.json via jq -s, so artifact
consumers can compare metrics across PMs without re-parsing the 100s
of MB of raw pcaps.
Single-pass tshark over the pcap with -T fields keeps cost bounded
to ~1 minute per 1 GB of capture; the full analysis pass runs after
all captures so it does not bleed into wall-clock measurement.
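As a rough illustration of the kind of tshark post-processing described above (not the script's exact invocation: the counters here use separate display-filter passes, whereas the commit says everything is folded into one `-T fields` pass; filter and field names are standard Wireshark ones, the rest is assumed):

```
pcap="$PCAP_DIR/$name.pcap"

# Back-pressure counters via standard TCP-analysis display filters.
zero_windows=$(tshark -r "$pcap" -Y tcp.analysis.zero_window     | wc -l)
retransmits=$(tshark  -r "$pcap" -Y tcp.analysis.retransmission  | wc -l)
duplicate_acks=$(tshark -r "$pcap" -Y tcp.analysis.duplicate_ack | wc -l)

# Per-stream inter-packet gaps in microseconds from a single field-extraction pass.
tshark -r "$pcap" -T fields -e tcp.stream -e frame.time_epoch \
  | awk '{ if ($1 in last) print ($2 - last[$1]) * 1e6; last[$1] = $2 }' \
  | sort -n > "$PCAP_DIR/$name.gaps.txt"

# Aggregate all per-capture summaries into one file, as the commit describes.
jq -s '.' "$PCAP_DIR"/*.summary.json > "$PCAP_DIR/summary.json"
```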
Workflow change:
Install pcap tools step now also installs tshark + jq, with
wireshark-common pre-seeded so tshark installs non-interactively
(we only read existing pcaps, no setuid dumpcap needed).
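For reference, the usual way to keep the tshark install non-interactive looks like this (an assumption about what the workflow step does, not a quote of it):

```
# Answer wireshark-common's setuid-dumpcap prompt up front, then install headlessly.
echo "wireshark-common wireshark-common/install-setuid boolean false" | sudo debconf-set-selections
sudo DEBIAN_FRONTEND=noninteractive apt-get install -y tshark jq
```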
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The analysis pass aborted the whole job on the first .pcap because the wall-time grep returned no match, and `set -eo pipefail` propagated that exit-1 through `local x=$(grep | awk)` (the multi-line `local x; x=$(...)` form does NOT mask the exit code, unlike `local x=$(...)` on one line — bash gotcha).

Two-part fix:

1. Drop into `set +e` / `set +o pipefail` for the analysis function body. The metrics are diagnostic — one tshark hiccup or an empty log line should not nuke a 25-minute capture run. Strict mode is restored at the end of the function so the rest of the script keeps its safety net.
2. Replace `grep -oE | awk` with awk-only. awk returns 0 even when no record matches, so empty-result log files no longer trip pipefail. Same parse, fewer pipes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
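The `local` gotcha is easy to reproduce in isolation; a minimal standalone demo (not taken from the script):

```
#!/usr/bin/env bash
set -eo pipefail

masked() {
  local x=$(false)   # exit status is that of `local` (0): the failure is swallowed
  echo "masked: still running"
}

strict() {
  local x
  x=$(false)         # exit status is that of the substitution (1): set -e aborts here
  echo "strict: never printed"
}

masked
strict
echo "never reached"
```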
…iagnosis
The TCP-level analysis (zero-copy retx=123 vs baseline 4-18) gave
strong evidence that utoo's receiver runtime is under back-pressure
during install, but it doesn't tell us *why*. The leading hypothesis
is disk IO saturation: rayon's parallel `fs::create + write_all` over
80k+ files in the ant-design tarball burst can outrun GitHub Actions
runners' Azure-disk IOPS budget, blocking write threads → tokio
threads back up → socket buffers fill.
This commit adds an iostat-x sampler to each capture:
capture_one() now spawns `iostat -x -y 1` in parallel with
tcpdump, writing per-second device samples to
$PCAP_DIR/<name>.iostat.txt. Both samplers are torn down with
the workload command.
analyze_pcap() parses the iostat log via column-position lookup
(sysstat header row → column index map) and extracts:
io_util_max_pct — peak disk-busy percentage
io_util_avg_pct — average disk-busy percentage
io_w_iops_max — peak write IOPS
io_w_kbs_max — peak write throughput (kB/s)
io_w_await_max_ms — peak write queue wait (ms)
io_samples — sample count for sanity check
These six fields land in summary.json alongside the TCP metrics, so
artifact consumers can directly cross-correlate disk pressure with
TCP back-pressure within the same capture window.
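To make the mechanism concrete, here is a hedged sketch of both halves (the tcpdump arguments, the device-name regex, and the variable names are assumptions, not the script's actual code):

```
# Inside capture_one: run tcpdump and iostat in parallel, tear both down after the workload.
sudo tcpdump -i any -w "$PCAP_DIR/$name.pcap" &
tcpdump_pid=$!
iostat -x -y 1 > "$PCAP_DIR/$name.iostat.txt" &
iostat_pid=$!
"$@"                                      # the package-manager command being measured
sudo kill "$tcpdump_pid" 2>/dev/null || true
kill "$iostat_pid" 2>/dev/null || true

# Inside analyze_pcap: build a header-name -> column-index map from the sysstat
# header row, then track peaks across the per-second device samples.
awk '
  $1 == "Device" { for (i = 1; i <= NF; i++) col[$i] = i; next }
  ("%util" in col) && $1 ~ /^(sd|nvme|xvd)/ {
    util = $col["%util"]; wkbs = $col["wkB/s"]
    if (util > umax) umax = util
    if (wkbs > wmax) wmax = wkbs
    n++
  }
  END { printf "{\"io_util_max_pct\": %.1f, \"io_w_kbs_max\": %.1f, \"io_samples\": %d}\n", umax, wmax, n }
' "$PCAP_DIR/$name.iostat.txt"
```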
The decision rule:
* If io_util_max_pct stays high (>80%) on the experiment branch
while baseline same-PM utoo-next stays low → install path is
saturating disk and that's the mechanism.
* If both branches show similar low %util, disk is not the
bottleneck and we keep looking (e.g. CPU contention).
Workflow: the apt install step adds `sysstat` (where iostat lives). It is
preinstalled on ubuntu-latest images today, but declaring the dependency
explicitly keeps the job resilient to future image rebuilds.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
pcap tooling rolled into the #2924 stack (ci(pcap): install/preload/manifest captures). Archived.
What
The pcap-only bench was previously a one-off that captured `p1_resolve` across `utoo` and `bun`, and assumed the project was already cloned by `pm-bench-phases.sh`. Install-phase regressions (#2902 / #2903 / #2904 / #2905 σ widening on `p0_full_cold`) live in the tarball download path — not in resolve. This PR makes the pcap bench self-contained and covers both phases for three PMs.

Script changes (`bench/pm-bench-pcap.sh`)

- Self-clone the project if `$PROJECT_DIR` is missing (mirrors `pm-bench-phases.sh`)
- Add a `<pm>-install` capture per PM: lock pre-existing, cache + `node_modules` wiped, then `<pm> install`. This is the cold-tarball-download phase where the σ-widening lives.
- Add `utoo-next` as a third PM, gated on `$UTOO_NEXT_BIN` so local runs still work without it.

Workflow changes (`.github/workflows/pm-e2e-bench.yml`)

- `pm-bench-pcap-linux` now downloads the `utoo-next-linux-x64` artifact and exports `UTOO_NEXT_BIN`, exactly like `bench-phases-linux` does.
- `Build next branch utoo` and `Upload utoo-next binary` steps in `build-linux` now also fire for `inputs.target == 'pm-bench-pcap'`, not only `pm-bench-phases`.

Outputs
```
/tmp/pm-bench-pcap/
dns.txt
utoo-{resolve,install}.{pcap,log}
utoo-next-{resolve,install}.{pcap,log} (when UTOO_NEXT_BIN set)
bun-{resolve,install}.{pcap,log}
```
Why
Drives the analysis of whether the install hot-path's increased concurrency (FuturesUnordered streaming, zero-copy tar, TTY-gate) saturates outbound TCP and starves the download path — the suspect mechanism for σ widening on `p0_full_cold` across the #2903 / #2904 / #2905 split experiments.

Smoke test
Dispatched on this chore branch first: the utoo and utoo-next binaries are built from the same code (only the bench scripts differ), so any utoo-vs-utoo-next install-phase delta on this run is pure environment noise, which establishes the floor before we run on the experiment branches.
🤖 Generated with Claude Code