Conversation
The default check is an O(snapshot) scan of every manifest entry, which dominated the commit hot path on busy tables and produced a multiplicative slowdown over time (T6692). Each parquet file path already carries a fresh uuid, so the check is redundant for this writer.
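The multiplicative slowdown can be sketched with a toy model (illustrative only, not the benchmark from this PR): if each commit adds one file, commit i rescans the i manifest entries already in the snapshot, so total dup-check work over n commits grows quadratically.

```go
package main

import "fmt"

// cumulativeScans models the dup check on a table that gains one file per
// commit: commit i rescans the i entries already in the snapshot, so total
// work over n commits is n*(n-1)/2, i.e. quadratic rather than linear.
func cumulativeScans(n int) int {
	total := 0
	for i := 0; i < n; i++ {
		total += i
	}
	return total
}

func main() {
	fmt.Println(cumulativeScans(100))  // 4950
	fmt.Println(cumulativeScans(1000)) // 499500: 10x the commits, ~100x the work
}
```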
Review: LGTM
gocritic's ruleguard wants gerund-form error wrapping; that is a pre-existing issue on main that is blocking unrelated CI runs.
Review: LGTM
Summary
`doCommit` now passes `table.WithoutDuplicateCheck()` alongside the existing `table.WithoutAutoNameMapping()` when calling `Transaction.AddDataFiles`. By default `iceberg-go` performs an O(snapshot) scan of every manifest entry in the current snapshot on every `AddDataFiles` call to detect path collisions with files being added. The writer in this package already stamps every parquet path with a fresh UUID, so the guarantee is held by the caller and the scan is wasted work.

This addresses ticket T6692, where the customer's Iceberg writer collapsed to approximately 40 messages/sec on a busy table, roughly 150× slower than the same workload on Kafka Connect (around 6,000 messages/sec). Profiling provided by the customer showed 88.6 GB of allocations (46% of total) inside `iceberg-go.(*ManifestReader).ReadEntry`, with the committer goroutine continuously blocked in `doCommit`. The dup-check scan is the exclusive driver of that hot path.

The check still runs in `iceberg-go` for any other caller that does not opt out, so this change is scoped strictly to this writer's commit path.

Measurements
Benchmark added in `internal/impl/iceberg/committer_test.go` (`BenchmarkAddDataFilesDupCheck`). Apple M3 Pro, `-benchtime=20x`, a single `Transaction.AddDataFiles` call against a table pre-seeded with `seed` separate single-file commits:

The `dup_check=on` cost scales linearly with the number of manifest entries in the current snapshot; the `dup_check=off` cost is effectively flat. The 150× figure at seed=1,000 lines up with the customer's observed RPCN-vs-Kafka-Connect throughput gap, supporting the diagnosis that this single check accounted for the dominant share of the regression.

Run locally with:
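The exact command was not captured above; a plausible invocation, assuming standard Go tooling and the benchmark name and `-benchtime` flag given earlier, would be:

```shell
# Hypothetical reconstruction; the original PR's exact flags were not captured.
go test ./internal/impl/iceberg/ -run '^$' \
  -bench 'BenchmarkAddDataFilesDupCheck' -benchtime=20x -benchmem
```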
Why this has not surfaced for other customers
The cost is O(manifest_entries_in_current_snapshot) × commits_per_second × concurrent_pods. For most workloads at least one of those terms is small enough to mask the dup-check overhead entirely. The reporting customer happened to maximise all three at once: `batch.count=10000`, `period=1s`, producing roughly one commit per second per pod. The dup-check runs per commit, so the cost per second scales linearly with commit rate. Customers who batch more aggressively amortise the scan across many more messages.

Test plan
- `TestCommitterSkipsDuplicateCheck` added in `internal/impl/iceberg/committer_test.go`. It commits the same path twice through the committer and asserts both calls succeed. Verified locally that the test fails with the expected `cannot add files that are already referenced by table` error when the option is removed.
- `go test -count=1 ./internal/impl/iceberg/...` — all packages pass.
- `bin/golangci-lint run ./internal/impl/iceberg/...` — clean.