Skip to content

Pin clang to version 17 in sanitizer CI jobs#3546

Merged
hpatro merged 1 commit intovalkey-io:unstablefrom
hanxizh9910:fix-clang-lto-linker-mismatch
Apr 23, 2026
Merged

Pin clang to version 17 in sanitizer CI jobs#3546
hpatro merged 1 commit intovalkey-io:unstablefrom
hanxizh9910:fix-clang-lto-linker-mismatch

Conversation

@hanxizh9910
Copy link
Copy Markdown
Contributor

@hanxizh9910 hanxizh9910 commented Apr 22, 2026

Analysis

The daily CI sanitizer jobs with clang are failing during the build step.

The ubuntu-latest runner now has clang 18, but the LLVM gold plugin
is still version 17. When the static Lua module is built with -flto,
the .o files contain LLVM 18 bitcode that the gold plugin (v17) cannot
read:
bfd plugin: LLVM gold plugin has failed to create LTO module: Unknown attribute kind (91) (Producer: 'LLVM18.1.3' Reader: 'LLVM 17.0.6')
Example failure: https://github.com/valkey-io/valkey/actions/runs/24753491944/job/72421581512

Fix

Pin the sanitizer jobs to clang-17 so the compiler and gold plugin
versions match.
Tested(successfully built): https://github.com/hanxizh9910/valkey/actions/runs/24859845008

Note

If clang-17 is removed from the ubuntu-latest image in the future,
we may need to either add an explicit install step

@Nikhil-Manglore Nikhil-Manglore added the run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP) label Apr 22, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 22, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 76.37%. Comparing base (03c2d4c) to head (bde9dbc).
⚠️ Report is 7 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #3546      +/-   ##
============================================
+ Coverage     76.21%   76.37%   +0.16%     
============================================
  Files           159      159              
  Lines         81672    80045    -1627     
============================================
- Hits          62243    61132    -1111     
+ Misses        19429    18913     -516     

see 126 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Nikhil-Manglore
Copy link
Copy Markdown
Member

Nikhil-Manglore commented Apr 22, 2026

I realized we don't run the sanitizer tests for the run-extra-tests label, but it looks like the local tests Hanxi ran passed

Copy link
Copy Markdown
Contributor

@sarthakaggarwal97 sarthakaggarwal97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would work fine. Another option is trying out if we can pin it to clang-17 for the sanitizer jobs. That is probably less invasive than a makefile change since it only changes daily.yml file

something like this?

      matrix:
        compiler: [gcc, clang-17]

@hpatro
Copy link
Copy Markdown
Contributor

hpatro commented Apr 23, 2026

@eifrah-aws Could you also take a look?

Comment thread src/Makefile Outdated
Comment on lines +737 to +740
# Strip LTO flags from OPTIMIZATION for the static Lua module build.
# When archiving into a .a, LTO bitcode objects cause linker failures
# if the system's LLVM gold plugin version doesn't match clang's version.
LUA_MODULE_OPTIMIZATION=$(subst -flto,,$(subst -ffat-lto-objects,,$(subst -flto=auto,,$(OPTIMIZATION))))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this degrade performance?

@hanxizh9910
Copy link
Copy Markdown
Contributor Author

Related commit: 0327c27

Signed-off-by: Hanxi Zhang <hanxizh@amazon.com>
@hanxizh9910 hanxizh9910 force-pushed the fix-clang-lto-linker-mismatch branch from bc80832 to bde9dbc Compare April 23, 2026 21:28
@hanxizh9910 hanxizh9910 changed the title Fix clang LTO build failure due to LLVM version mismatch Pin clang to version 17 in sanitizer CI jobs Apr 23, 2026
@hpatro
Copy link
Copy Markdown
Contributor

hpatro commented Apr 23, 2026

@hanxizh9910 curious why did we choose clang 17 instead of 18?

@roshkhatri
Copy link
Copy Markdown
Member

I think we can remove the LTO flag and it shouldn't affect the performance as the Lua module is compiled into a static archive and then it's linked into valkey-server using --whole-archive. So we are anyway taking every object in this lua archive and including it directly. LTO is optimizing across file boundaries at link time, but the module and valkey has a good boundary and are interacting with the valkey server with limited calls. So there's very little for LTO to optimize across that seam. The hot path is inside the Lua interpreter itself, and that still gets full -O2/-O3 optimization. I think there will be no meaningful performance impact.

@hanxizh9910
Copy link
Copy Markdown
Contributor Author

@hanxizh9910 curious why did we choose clang 17 instead of 18?

Hi, because the LLVM gold plugin on the runner is version 17. We need the compiler to match the gold plugin. Pinning to clang-18 wouldn't help

@roshkhatri
Copy link
Copy Markdown
Member

roshkhatri commented Apr 23, 2026

@hanxizh9910 curious why did we choose clang 17 instead of 18?

The reader, bfd plugin shipped with ubuntu latest is still on 17 and does not support reading LLVM 18 bitcode
Either we upgrade bfd plugin on the runners while setting up env. which is extra effort.
Or just select clang 17 for now

@sarthakaggarwal97
Copy link
Copy Markdown
Contributor

FYI: The current tests failures in CI should get fixes with #3545

@hpatro
Copy link
Copy Markdown
Contributor

hpatro commented Apr 23, 2026

Started failing after #3392 merge. Merging it for now to help stabilize the daily. However, take a look @eifrah-aws once you get time.

@hpatro hpatro merged commit 7db5b70 into valkey-io:unstable Apr 23, 2026
63 of 66 checks passed
@hanxizh9910
Copy link
Copy Markdown
Contributor Author

hanxizh9910 commented Apr 24, 2026

failed to make again: https://github.com/valkey-io/valkey/actions/runs/24865821147/job/72801509679. Will make a new PR using the old way to fix it

hpatro added a commit that referenced this pull request Apr 24, 2026
This reverts commit 7db5b70.

Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
hpatro added a commit that referenced this pull request Apr 24, 2026
Reverts #3546

This didn't help fix the build issue. Follow up PR is performed on
#3555

Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

run-extra-tests Run extra tests on this PR (Runs all tests from daily except valgrind and RESP)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants