bazel: Fix some DEBUG warnings#30364
Conversation
There was a problem hiding this comment.
Pull request overview
Updates Redpanda’s Bazel toolchain sysroot wiring to avoid Bazel 9 DEBUG warnings and reduce the number of individual sysroot files exposed to the build graph (improving caching/performance characteristics).
Changes:
- Removes per-file sysroot exposure via
http_archive+filegroup(glob(["*/**"]))from the non-module dependency repository setup. - Adds
@toolchains_llvm//toolchain:sysroot.bzlsysrootrepository rule instances forx86_64andaarch64(withexclude_patternsworkarounds). - Updates
llvm.sysroot(...)labels to reference the sysroot rule’s exported target.
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| bazel/repositories.bzl | Removes the previous http_archive-based sysroot repos that exposed all sysroot files individually. |
| MODULE.bazel | Defines sysroot repos via toolchains_llvm’s sysroot repo rule and updates toolchain sysroot labels accordingly. |
| MODULE.bazel.lock | Updates extension digest and removes generated repo specs for the old sysroot repos. |
3c66189 to
2f12ed9
Compare
0990f57 to
9649920
Compare
CI test resultstest results on build#83986
|
dotnwat
left a comment
There was a problem hiding this comment.
no red flags from me
bazel: Strip symlinks from sysroot
on the first commit i'm confused why the broken symlinks aren't causing havoc in everyone's build since presumably we all would be accessing the same files when building?
Consider using the
sysrootrepository rule in @toolchains_llvm//toolchain:sysroot.bzl which provides a single-file (directory) sysroot for more efficient builds.
to clarify, the second commit is effectively duplicating that referenced repository rule with our own flavor?
| llvm.sysroot( | ||
| name = "{version}_llvm_toolchain".format(version = major_version), | ||
| label = "@x86_64_sysroot//:sysroot", | ||
| label = "@x86_64_sysroot//sysroot:sysroot", |
There was a problem hiding this comment.
huh, i thought this was strictly an short hand / alias thing
There was a problem hiding this comment.
The error says:
system_module_map.bzl:88:18: WARNING: Sysroot @@toolchains_llvm++llvm+current_llvm_toolchain//:sysroot-components-x86_64-linux resolved to 1689 files. Consider using the `sysroot` repository rule in @toolchains_llvm//toolchain:sysroot.bzl which provides a single-file (directory) sysroot for more efficient builds.
which suggests using that existing @toolchains_llvm//toolchain:sysroot.bzl rule, but we created our own here?
That rule does exist and also seems to do that "expose a single dir" thing (for the same reason):
https://github.com/bazel-contrib/toolchains_llvm/blob/master/toolchain/sysroot.bzl
There was a problem hiding this comment.
Yes that's what an earlier version of this PR uses. That works fine to resolve the warning and works at build time but it's not enough for the "patching the loader" step which needs access to all the files by label again.
There was a problem hiding this comment.
Here's an alternative to making our own sysroot rule:
The upside is:
- We still use the upstream sysroot rule for the "sysroot" part, it's more complicated than our rule, may change in the future. Unsure how important this is. Less code on our end.
One downside is:
- The sysroot exists 2x on disk (but we only download it once) on machines that build.
WDYT?
There was a problem hiding this comment.
(I also pursued fixing the packaging itself to just do the per-file stuff that we are doing in .bzl now, but it doesn't work as the ["."] approach doesn't even materialize the files, I guess cc_rules has special logic to make this work for them)
There was a problem hiding this comment.
The sysroot exists 2x on disk (but we only download it once) on machines that build.
Yeah I had something like this as well. I don't care/mind. Both solutions are equally convoluted.
Be explicit about zstd compression level. Turns out previously we were using -19 for the amd64 tar but default (-3) for the aarch tar.
Transform absolute symlinks into relative symlinks in our sysroot. Absolute paths break when not putting the sysroot at / which is what bazel does. They would point outside the sandbox and hence break the build. Bump links to the regenerated sysroot.
Fixes: ``` DEBUG: /home/stephan/.cache/bazel/_bazel_stephan/3b3b4dc7de34ad57d2c1628a343bd876/external/toolchains_llvm+/toolchain/internal/system_module_map.bzl:88:18: WARNING: Sysroot @@toolchains_llvm++llvm+current_llvm_toolchain//:sysroot-components-x86_64-linux resolved to 1689 files. Consider using the `sysroot` repository rule in @toolchains_llvm//toolchain:sysroot.bzl which provides a single-file (directory) sysroot for more efficient builds. ``` which appeared after some changes in bazel 9. Stop exposing every single file in the sysroot. Instead expose a sysroot rule which supposedly is better for caching performance etc. We need separate runtime target which used by the packaging process to find and replace the loader in the RP binary.
9649920 to
72a757d
Compare
Bazel 9 logs a warning while configuring the cc_toolchain:
WARNING: Sysroot @@toolchains_llvm++llvm+current_llvm_toolchain//:sysroot-components-x86_64-linux
resolved to 1689 files. Consider using the `sysroot` repository rule
in @toolchains_llvm//toolchain:sysroot.bzl which provides a
single-file (directory) sysroot for more efficient builds.
The current setup downloads each sysroot tarball with `http_archive` and
exposes every file in a `glob(["*/**"])` filegroup. The cc_toolchain
ingests the whole list — 1689 individual files — which is what the
warning is about.
The cc_toolchain only really wants the sysroot as a single
source-directory artifact, which is exactly what
@toolchains_llvm//toolchain:sysroot.bzl produces. But the packaging
rules in //bazel/packaging still need individual file labels for the
glibc dynamic loader (set as the binaries' PT_INTERP via patchelf) and
the versioned shared libraries to ship in install_path/lib. Bazel only
materializes the contents of a `srcs = ["."]` source-directory in the
cc_toolchain's special path, not for arbitrary actions, so a single
filegroup can't serve both consumers.
Use the upstream `sysroot` rule for the cc_toolchain side and pull the
same tarball a second time via `http_archive` with a glob-based BUILD
that exposes the loader and shared libraries as a `:runtime` filegroup
for packaging. The download is sha256-deduped between the two repos,
so the only added cost is one extra extraction per arch on a clean
cache.
The packaging rules now consume `:runtime` via a new `sysroot_runtime`
attribute instead of walking `cc_toolchain.all_files`, since the latter
would only see the single source-directory entry under the upstream
rule. Bump the URLs/sha256 to regenerated tarballs whose absolute
symlinks have been rewritten to relative basename links — bazel rejects
absolute symlinks in directory artifacts and the upstream `sysroot`
rule doesn't rewrite them itself. The Dockerfile change that produces
those regenerated tarballs lives in redpanda-data#30364.
Fixes:
which appeared after some changes in bazel 9.
Stop exposing every single file in the sysroot. Instead expose a sysroot rule which supposedly is better for caching performance etc.
Backports Required
Release Notes