
bazel: silence merkle_cache sysroot warning, ship loader/libs separately #30449

Open
travisdowns wants to merge 3 commits into redpanda-data:dev from travisdowns:td-alt-sysroot-debug-fix

Member

travisdowns commented May 12, 2026

Bazel 9 logs a warning while configuring the cc_toolchain because the sysroot tarballs are exposed as ~1700 individual files. The upstream sysroot rule from toolchains_llvm solves this by exposing the sysroot as a single source-directory artifact, but our packaging rules still need individual file labels for the dynamic loader (set as PT_INTERP via patchelf) and the versioned shared libraries we ship in install_path/lib. A single filegroup can't serve both consumers because bazel only materializes a srcs = ["."] source-directory in the cc_toolchain's special path, not in arbitrary actions.

The commit switches the cc_toolchain side to the upstream sysroot rule and adds a parallel http_archive that pulls the same tarball with a glob-based BUILD exposing a :runtime filegroup for packaging. The download is sha256-deduped between the two repos, so the only added cost is one extra extraction per arch on a clean cache.
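The dedup claim can be pictured with a toy model of a content-addressed download cache. This is only a sketch under assumptions: the `RepositoryCache` class and the two repo names are hypothetical illustrations, not Bazel internals or the names used in this PR.

```python
import hashlib


class RepositoryCache:
    """Toy content-addressed cache: blobs are keyed by their sha256."""

    def __init__(self):
        self._blobs = {}
        self.downloads = 0

    def fetch(self, sha256, download):
        # Only hit the network when this digest has not been seen before.
        if sha256 not in self._blobs:
            data = download()
            if hashlib.sha256(data).hexdigest() != sha256:
                raise ValueError("checksum mismatch")
            self._blobs[sha256] = data
            self.downloads += 1
        return self._blobs[sha256]


tarball = b"pretend sysroot tarball"
digest = hashlib.sha256(tarball).hexdigest()

cache = RepositoryCache()
extractions = 0
# Hypothetical repo names: one for the toolchain, one for packaging.
for repo in ("x86_64_sysroot", "x86_64_sysroot_runtime"):
    cache.fetch(digest, lambda: tarball)
    extractions += 1  # each repo still unpacks its own copy

print(cache.downloads, extractions)
```

In this model both repos request the same digest, so the tarball is fetched once while each repo pays for its own extraction.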

The packaging rules now consume :runtime via a new sysroot_runtime attribute instead of walking cc_toolchain.all_files, since the latter would only see the single source-directory entry under the upstream rule. The sysroot URLs/sha256 are bumped to regenerated tarballs whose absolute symlinks have been rewritten to relative basename links — bazel rejects absolute symlinks in directory artifacts and the upstream sysroot rule doesn't rewrite them itself. The Dockerfile change that produces those regenerated tarballs lives in #30364.
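For illustration, the symlink rewrite could look roughly like the sketch below. It is not the Dockerfile change from #30364; the function name `relativize_symlinks` is hypothetical, and it uses `os.path.relpath`, which reduces to a plain basename link only when the target sits in the same directory as the link.

```python
import os
import tempfile


def relativize_symlinks(sysroot):
    """Rewrite absolute symlinks under `sysroot` so they resolve inside it.

    Returns the number of links rewritten.
    """
    rewritten = 0
    for dirpath, dirnames, filenames in os.walk(sysroot):
        # os.walk does not descend into symlinked directories by default,
        # but it still lists them in dirnames, so check both lists.
        for name in dirnames + filenames:
            link = os.path.join(dirpath, name)
            if not os.path.islink(link):
                continue
            target = os.readlink(link)
            if not os.path.isabs(target):
                continue
            # Interpret the absolute target as if the sysroot were mounted at /.
            inside = os.path.join(sysroot, target.lstrip("/"))
            relative = os.path.relpath(inside, start=dirpath)
            os.remove(link)
            os.symlink(relative, link)
            rewritten += 1
    return rewritten


with tempfile.TemporaryDirectory() as root:
    lib = os.path.join(root, "lib")
    os.makedirs(lib)
    open(os.path.join(lib, "libfoo.so.1"), "w").close()
    os.symlink("/lib/libfoo.so.1", os.path.join(lib, "libfoo.so"))
    count = relativize_symlinks(root)
    new_target = os.readlink(os.path.join(lib, "libfoo.so"))

print(count, new_target)
```

After the rewrite the link no longer depends on where the sysroot is unpacked, which is the property a directory artifact needs.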

This is an alternative to the last commit of #30364. End-to-end verified: //bazel/packaging:redpanda_tar builds and ships opt/redpanda/lib/{ld-linux-x86-64.so.2, libc.so.6, libpthread.so.0, libresolv.so.2, ...} — same set of files as before.
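The selection the packaging rule performs on the `:runtime` files can be sketched in plain Python. The `ld-linux-` prefix check mirrors the rule code quoted in the review below; the `split_runtime` name and the regex for "versioned shared library" are assumptions made for this example, not the actual implementation in bazel/packaging/packaging.bzl.

```python
import re

# Assumption: a versioned shared library is named like libfoo.so.<digits>[.<digits>...].
_VERSIONED_SO = re.compile(r"^lib.+\.so(\.\d+)+$")


def split_runtime(basenames):
    """Return (loader, versioned_libs) from a list of sysroot file basenames."""
    loaders = [b for b in basenames if b.startswith("ld-linux-")]
    if len(loaders) != 1:
        raise ValueError("expected exactly one ld-linux loader, got %r" % (loaders,))
    libs = sorted(b for b in basenames if _VERSIONED_SO.match(b))
    return loaders[0], libs


loader, libs = split_runtime([
    "ld-linux-x86-64.so.2",
    "libc.so.6",
    "libpthread.so.0",
    "libresolv.so.2",
    "libc.so",  # unversioned name, skipped
    "crt1.o",   # not a shared library, skipped
])
print(loader, libs)
```

The loader is kept separate because it is the file set as PT_INTERP, while the remaining libraries are simply copied into install_path/lib.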

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v25.3.x
  • v25.2.x
  • v25.1.x

Release Notes

  • none

Copilot AI review requested due to automatic review settings May 12, 2026 18:22
@travisdowns travisdowns mentioned this pull request May 12, 2026
7 tasks
Contributor

Copilot AI left a comment

Pull request overview

This PR updates Bazel sysroot handling to eliminate Bazel 9’s “sysroot resolved to N files” warning by switching the cc_toolchain to the upstream toolchains_llvm sysroot repository rule (single directory artifact), while introducing a parallel “runtime” sysroot archive that still exposes individual shared-library/loader files needed by Redpanda packaging.

Changes:

  • Switch sysroot repositories used by the toolchain to @toolchains_llvm//toolchain:sysroot.bzl%sysroot and update toolchain sysroot labels accordingly.
  • Add *_sysroot_runtime http_archive repos exposing a :runtime filegroup for packaging consumption.
  • Update packaging rules and targets to consume sysroot runtime artifacts via a new sysroot_runtime attribute instead of walking cc_toolchain.all_files.

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| MODULE.bazel.lock | Updates module extension lock metadata and switches sysroot repos to the upstream sysroot rule; adds runtime sysroot repos and bumps sysroot tarball URLs/sha256. |
| MODULE.bazel | Adds use_repo entries for *_sysroot_runtime and updates llvm.sysroot(...) labels to the upstream sysroot target location. |
| bazel/repositories.bzl | Defines sysroot repos via the upstream sysroot rule and adds parallel http_archive repos exposing a runtime filegroup for packaging. |
| bazel/packaging/packaging.bzl | Reworks sysroot library/loader collection to use sysroot_runtime attribute; removes C++ toolchain dependency from packaging rules. |
| bazel/packaging/BUILD | Adds a sysroot_runtime alias selected by CPU and wires it into packaging targets that ship sysroot libs. |

Comment on lines +56 to +59
    fail("include_sysroot_libs is True but sysroot_runtime is not set on", ctx.attr.name)
loaders = [f for f in ctx.files.sysroot_runtime if f.basename.startswith("ld-linux-")]
if len(loaders) != 1:
    fail("expected exactly one ld-linux loader in sysroot_runtime, got", [f.basename for f in loaders])
Member Author

Verified empirically: fail() in Starlark accepts variadic positional args joined by sep (default " "); the second positional is not treated as attr. Per the Bazel docs, the signature is fail(*args, msg=None, attr=None, sep=" ", stack_trace=True), where msg and attr are keyword-only and documented as deprecated.

Self-contained test:

def _impl(ctx):
    fail("first message", ctx.attr.name, ["a", "list", 42])
fail_rule = rule(implementation = _impl)

produces:

Error in fail: first message x ["a", "list", 42]

The codebase already uses this idiom elsewhere (e.g. fail("fips_module and fips_config must both be specified in", ctx.attr.name) in bazel/test.bzl). Leaving as-is.
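The joining behaviour described above can be imitated outside Bazel with a small Python model. This is a simplified emulation, not Bazel's implementation: the helper names are hypothetical, and only strings, lists, and values whose Python `str()` matches Starlark's (such as ints) are handled.

```python
def _starlark_repr(value):
    # Simplified: Starlark quotes strings with double quotes; escapes are ignored here.
    if isinstance(value, str):
        return '"' + value + '"'
    if isinstance(value, list):
        return "[" + ", ".join(_starlark_repr(v) for v in value) + "]"
    return str(value)


def _starlark_str(value):
    # Top-level strings are printed bare; everything else uses the repr form.
    return value if isinstance(value, str) else _starlark_repr(value)


def fail_message(*args, sep=" "):
    """Build the message text the way variadic fail() arguments are joined."""
    return sep.join(_starlark_str(a) for a in args)


message = fail_message("first message", "x", ["a", "list", 42])
print(message)  # first message x ["a", "list", 42]
```

Here "x" stands in for ctx.attr.name, matching the target name in the test output above.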

@travisdowns travisdowns force-pushed the td-alt-sysroot-debug-fix branch from d05f395 to 66ec7a3 Compare May 13, 2026 19:08
Member Author

Last push (66ec7a3) is a pure rebase onto latest dev — no code changes.

StephanDollberg and others added 3 commits May 13, 2026 15:10
Be explicit about zstd compression level.

Turns out previously we were using -19 for the amd64 tar but default
(-3) for the aarch tar.

Transform absolute symlinks into relative symlinks in our sysroot.
Absolute paths break when not putting the sysroot at / which is what
bazel does. They would point outside the sandbox and hence break the
build.

Bump links to the regenerated sysroot.

Bazel 9 logs a warning while configuring the cc_toolchain:

    WARNING: Sysroot @@toolchains_llvm++llvm+current_llvm_toolchain//:sysroot-components-x86_64-linux
    resolved to 1689 files. Consider using the `sysroot` repository rule
    in @toolchains_llvm//toolchain:sysroot.bzl which provides a
    single-file (directory) sysroot for more efficient builds.

The current setup downloads each sysroot tarball with `http_archive` and
exposes every file in a `glob(["*/**"])` filegroup. The cc_toolchain
ingests the whole list — 1689 individual files — which is what the
warning is about.

The cc_toolchain only really wants the sysroot as a single
source-directory artifact, which is exactly what
@toolchains_llvm//toolchain:sysroot.bzl produces. But the packaging
rules in //bazel/packaging still need individual file labels for the
glibc dynamic loader (set as the binaries' PT_INTERP via patchelf) and
the versioned shared libraries to ship in install_path/lib. Bazel only
materializes the contents of a `srcs = ["."]` source-directory in the
cc_toolchain's special path, not for arbitrary actions, so a single
filegroup can't serve both consumers.

Use the upstream `sysroot` rule for the cc_toolchain side and pull the
same tarball a second time via `http_archive` with a glob-based BUILD
that exposes the loader and shared libraries as a `:runtime` filegroup
for packaging. The download is sha256-deduped between the two repos,
so the only added cost is one extra extraction per arch on a clean
cache.

The packaging rules now consume `:runtime` via a new `sysroot_runtime`
attribute instead of walking `cc_toolchain.all_files`, since the latter
would only see the single source-directory entry under the upstream
rule. Bump the URLs/sha256 to regenerated tarballs whose absolute
symlinks have been rewritten to relative basename links — bazel rejects
absolute symlinks in directory artifacts and the upstream `sysroot`
rule doesn't rewrite them itself. The Dockerfile change that produces
those regenerated tarballs lives in redpanda-data#30364.
@travisdowns travisdowns force-pushed the td-alt-sysroot-debug-fix branch from 66ec7a3 to b555892 Compare May 13, 2026 19:12
Member Author

Cherry-picked the first two commits from #30364 (Stephan's PR) onto this branch as discussed — 21b0dd6 (zstd -19 in readme) and d8dcced (strip absolute symlinks from sysroot). My commit is now b555892 on top.


3 participants