6 changes: 6 additions & 0 deletions .github/workflows/build.yml
@@ -32,6 +32,12 @@ jobs:
- name: Build
run: ./build.sh "/opt/osquery-toolchain"

+- name: SSH for Debugging
+if: failure()
+uses: owenthereal/action-upterm@v1
+with:
+limit-access-to-actor: true # Restrict to the user who triggered the workflow
+
- name: Archive
run: |
cd /opt/osquery-toolchain/final
53 changes: 27 additions & 26 deletions README.md
@@ -1,16 +1,19 @@
# osquery-toolchain

The script in this repository is used to build the LLVM/Clang toolchain which is used in the osquery project to create portable binaries of it.
The procedure to build such a toolchain has been based on the build-anywhere project: https://github.com/theopolis/build-anywhere

Following the main goals of the toolchain:

- Obtain a LLVM/Clang toolchain which is portable and which doesn't depend from libstdc++ or libgcc.
- The toolchain is compiled with a specific glibc version, so that it runs on a wide range of distributions.
- The toolchain lives in a sysroot folder which should be self sufficient.
-- The toolchain should be able to produce binaries that are portable and run on libc >= 2.12.
+- The toolchain should be able to produce binaries that are portable and run on libc >= 2.17 on x86_64 and libc >= 2.27 on aarch64.
To do so, the output binary should depend only on shared libraries which are deeply connected with the environment they run on,
typically libc, libdl, librt, libpthread.

The rough steps used to achieve the above goals:

- Use crosstool-ng to compile a stage0 GCC static toolchain, which might be newer than the one available in the system.
- Compile an older libz/zlib which is compatible with the old glibc.
- Link all GCC binaries into the sysroot created by crosstool-ng, so that the sysroot can be used for the next steps
@@ -21,50 +24,45 @@ The rough steps used to achieve the above goals:
- Use the stage1 Clang to compile a stage1 libunwind (static only)
- Use the stage1 Clang, libunwind, libc++/c++abi, compiler-rt builtins, to build a final/full toolchain

-The version of crosstool-ng used is 1.24.0
-The version of the GCC compiler built by crosstool-ng is 8.3.0
-The version of the libc library built by crosstool-ng is 2.12.2
-The version of LLVM/Clang built by the script is 11.0.0
-The version of the zlib library built by the script is 1.2.11
+The version of crosstool-ng used is 1.28.0
+The version of the GCC compiler built by crosstool-ng is 13.4.0
+The version of the libc library built by crosstool-ng is 2.17 (x86_64) / 2.27 (aarch64)
+The version of LLVM/Clang built by the script is 18.1.8
+The version of the zlib library built by the script is 1.3.1

-Among other, the toolchain LLVM/Clang includes the clang static analyzer, scan-build, clang-format, clang-tidy.
+Among others, the toolchain LLVM/Clang includes the clang static analyzer, scan-build, clang-format, clang-tidy.

# How to build
-Using a recent distribution with GCC 8 is suggested to reduce the possible crashes and issues that may happen when compiling
-the portable GCC toolchain.
-For the instructions we will use Ubuntu 18.04.
+
+For the instructions we will use Ubuntu 22.04.

## Prerequisites

```
-sudo apt install g++-8 gcc-8 automake autoconf gettext bison flex unzip help2man libtool-bin libncurses-dev make ninja-build patch txinfo gawk wget git texinfo xz-utils python
-```
-Then use `update-alternatives` to tell the system that the version of GCC/G++ and CPP is the default we would like to use:
-```
-sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-8 20
-sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-8 20
-sudo update-alternatives --install /usr/bin/cpp cpp /usr/bin/cpp-8 20
-```
-Download and install CMake 3.17.5
-```
-wget https://github.com/Kitware/CMake/releases/download/v3.17.5/cmake-3.17.5-Linux-x86_64.tar.gz
-sudo tar xvf cmake-3.17.5-Linux-x86_64.tar.gz -C /usr/local --strip 1
+sudo apt install -y g++ gcc automake autoconf gettext bison flex unzip help2man libtool-bin libncurses-dev make ninja-build patch txinfo gawk wget git texinfo xz-utils python3 python3-setuptools bzip2 cmake pkg-config
```

## Customize the configuration

The default configuration is ready to go, though if customization is needed, there are two files that can be modified: config and crosstool-ng-config.
-- *config* contains global configuration values like, versions of llvm, zlib, how many parallel jobs to use, which build system to use and such.
-- *crosstool-ng-config* contains the configuration that is feed into the crosstool-ng tool, which compiles the portable GCC.
-It controls the GCC version built, libc and kernel headers version to build everything.
-This is config normally generated by another tool that crosstool-ng provides, but the config can be manually modified after being generated.
+
+- _config_ contains global configuration values like, versions of llvm, zlib, how many parallel jobs to use, which build system to use and such.
+- _crosstool-ng-config_ contains the configuration that is feed into the crosstool-ng tool, which compiles the portable GCC.
+It controls the GCC version built, libc and kernel headers version to build everything.
+This is config normally generated by another tool that crosstool-ng provides, but the config can be manually modified after being generated.

## Build

The script has to be run as a normal user and accepts one argument, which is the folder where the various stages and the final toolchain will be built:

```
./build.sh /opt/osquery-toolchain
```

This should output the sysroot under `/opt/osquery-toolchain/final` and the LLVM toolchain will be under `/opt/osquery-toolchain/final/sysroot/usr`

## Redistributing and usage

1. Enter inside the **final** folder within the destination path
2. Rename the **sysroot** folder to **osquery-toolchain**
3. Compress the folder with the following command: `tar -pcvJf osquery-toolchain-<VERSION>.tar.xz osquery-toolchain`
@@ -79,16 +77,19 @@ Sometimes explicitly adding `-ldl` and/or `-lrt` is needed, depending on what fu
The only other flag that's needed is `--sysroot=<sysroot path>`, so that the toolchain searches what it needs in the correct path.

## Troubleshooting

If the compilation stops at any point in time, just relaunching the script should restart it.
The script doesn't delete anything when it restarts the build, so if you want to start clean from some substep, you need to do that manually.

So for more advanced troubleshooting, following the build example in general we have:

- `/opt/osquery-toolchain/stage0`: here lives crosstool-ng source code, build, zlib source code, build and GCC/G++ toolchain compiled by crosstool-ng. The GCC/G++ toolchain folder is also copied on the next stages
- `/opt/osquery-toolchain/stage1`: here lives the intermediate LLVM/Clang toolchain, together with a copy of the previous step GCC/G++ toolchain and it's used to build the final toolchain
- `/opt/osquery-toolchain/final`: here lives the final sysroot containing only the LLVM/Clang toolchain we want to use
- `/opt/osquery-toolchain/llvm`: here lives the source code for the LLVM/Clang and the various build folders for the LLVM build substeps

The script decides it has to build one of the stages if it doesn't find a specific file in one of the install folders (stage0, stage1, final).

- For crosstool-ng is `/opt/osquery-toolchain/stage0/crosstool-ng/ct-ng`.
- For GCC is `/opt/osquery-toolchain/stage0/x86_64-osquery-linux-gnu/bin/x86_64-osquery-linux-gnu-gcc`
- For zlib is `/opt/osquery-toolchain/stage0/x86_64-osquery-linux-gnu/x86_64-osquery-linux-gnu/usr/lib/libz.a`
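
The marker-file check described in this README section can be illustrated with a small sketch. It is not taken from `build.sh`: the `build_stage` function, the `build_log` variable and the temporary marker path are hypothetical, and `touch` stands in for the real build step that produces the file.

```shell
# Hypothetical sketch: a stage is rebuilt only when its marker file is missing.
work_dir="$(mktemp -d)"
trap 'rm -rf "$work_dir"' EXIT
marker="$work_dir/stage0/crosstool-ng/ct-ng"   # stand-in for the real marker path
build_log=""

build_stage() {
  # $1: stage name, $2: file whose presence means "already built"
  if [ -e "$2" ]; then
    build_log="${build_log}skip $1;"
    return 0
  fi
  build_log="${build_log}build $1;"
  mkdir -p "$(dirname "$2")"
  touch "$2"   # stands in for the real build producing the file
}

build_stage crosstool-ng "$marker"   # marker missing: the stage is built
build_stage crosstool-ng "$marker"   # marker present: the stage is skipped
echo "$build_log"
```

Deleting the marker file by hand is therefore enough to force that one stage to rebuild on the next run, which matches the "start clean from some substep" advice above.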
67 changes: 40 additions & 27 deletions build.sh
@@ -10,10 +10,12 @@
# Version 1.0.0

function build_gcc() {
+echo "** Build GCC **"
+
# Clone and build CrosstoolNG.
if [[ ! -d $CURRENT_DIR/crosstool-ng ]]; then
( cd $CURRENT_DIR; \
-git clone https://github.com/crosstool-ng/crosstool-ng -b crosstool-ng-1.24.0 --single-branch )
+git clone https://github.com/crosstool-ng/crosstool-ng -b crosstool-ng-1.28.0 --single-branch )
fi

# Use our own config that sets a legacy glibc.
@@ -34,6 +36,7 @@ function build_gcc() {
}

function prepare_sysroot() {
+echo "** Prepare sysroot **"

if [[ ! -e $PREFIX/bin/gcc ]]; then
# Create symlinks in the new sysroot to GCC.
@@ -43,6 +46,8 @@ function prepare_sysroot() {
}

function build_zlib() {
+echo "** Build zlib **"
+
# Build a legacy zlib and install into the sysroot.
if [[ ! -d $CURRENT_DIR/zlib-${ZLIB_VER} ]]; then
( cd $CURRENT_DIR; \
@@ -64,6 +69,7 @@ function build_zlib() {
}

function build_llvm() {
+echo "** Build LLVM **"

if [[ ! -e ${install_dir}/bin/clang ]]; then

@@ -105,6 +111,7 @@ function build_llvm() {
}

function build_compiler-rt-builtins() {
+echo "** Build compiler-rt builtins **"

if [[ ! -e ${install_dir}/lib/linux/libclang_rt.builtins-$MACHINE.a ]]; then

@@ -145,6 +152,7 @@ function build_compiler-rt-builtins() {

#-DLIBCXX_ENABLE_STATIC_ABI_LIBRARY=ON \
function build_compiler_libs() {
+echo "** Build compiler runtime libraries **"

if [[ ! -e ${install_dir}/lib/libc++.a ]]; then
( cd $LLVM_SRC && \
@@ -159,19 +167,7 @@ function build_compiler_libs() {
-DCMAKE_EXE_LINKER_FLAGS="-Wl,--strip-all ${additional_linker_flags}" \
-DCMAKE_SHARED_LINKER_FLAGS="-Wl,--strip-all ${additional_linker_flags}" \
-DCMAKE_SYSROOT="${SYSROOT}" \
--DLLVM_REQUIRES_RTTI=ON \
--DLLVM_TARGETS_TO_BUILD=${targets_to_build} \
--DLLVM_ENABLE_PROJECTS="${llvm_projects}" \
--DLLVM_BUILD_LLVM_DYLIB=ON \
--DLLVM_LINK_LLVM_DYLIB=ON \
--DLLVM_ENABLE_EH=ON \
--DLLVM_ENABLE_RTTI=ON \
--DLLVM_INCLUDE_DOCS=OFF \
--DLLVM_INCLUDE_TESTS=OFF \
--DLLVM_INCLUDE_EXAMPLES=OFF \
--DLLVM_ENABLE_LIBXML2=OFF \
--DLLVM_ENABLE_PIC=ON \
--DLLVM_DEFAULT_TARGET_TRIPLE=${TUPLE} \
+-DLLVM_ENABLE_RUNTIMES="${llvm_projects}" \
-DLIBCXXABI_USE_LLVM_UNWINDER=ON \
-DLIBCXXABI_ENABLE_STATIC_UNWINDER=ON \
-DLIBCXXABI_USE_COMPILER_RT=ON \
@@ -185,8 +181,9 @@ function build_compiler_libs() {
-DLIBUNWIND_USE_COMPILER_RT=ON \
-DLIBUNWIND_ENABLE_STATIC=ON \
-DLIBUNWIND_ENABLE_SHARED=OFF \
+-DCMAKE_POSITION_INDEPENDENT_CODE=ON \
${additional_cmake} \
-../llvm && \
+../runtimes && \
cmake --build . --target cxx -j ${PARALLEL_JOBS} && \
cmake --build . --target install-cxx -j ${PARALLEL_JOBS} && \
cmake --build . --target install-cxxabi -j ${PARALLEL_JOBS} && \
@@ -258,7 +255,6 @@ mkdir -p $CURRENT_DIR
SYSROOT=$CURRENT_DIR/$TUPLE/$TUPLE/sysroot
PREFIX=$SYSROOT/usr


build_gcc
prepare_sysroot

@@ -275,6 +271,17 @@ if [[ ! -d $TOOLCHAIN_DIR/final ]]; then
cp -r $CURRENT_DIR/$TUPLE $TOOLCHAIN_DIR/final/
fi

+# Fix pkg-config .pc files copied from stage0. They contain hardcoded
+# stage0 absolute paths; without this fix, pkg-config returns -I flags
+# pointing into the stage0 sysroot, which take priority over --sysroot
+# and cause later builds (libbpf, bpftool) to pick up stage0 headers.
+STAGE0_SYSROOT_PREFIX=$CURRENT_DIR/$TUPLE/$TUPLE/sysroot/usr
+for dest in stage1 final; do
+DEST_PREFIX=$TOOLCHAIN_DIR/$dest/$TUPLE/$TUPLE/sysroot/usr
+find $DEST_PREFIX/lib/pkgconfig -name '*.pc' -exec \
+sed -i "s|${STAGE0_SYSROOT_PREFIX}|${DEST_PREFIX}|g" {} + 2>/dev/null || true
+done
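
To see what the prefix rewrite above does to a single `.pc` file, here is a minimal sketch run against a temporary directory. The paths and the contents of `zlib.pc` are made up for illustration. Only the `find ... -exec sed -i` substitution mirrors the added lines, and it assumes GNU `sed`.

```shell
# Hypothetical prefixes standing in for the stage0 and destination sysroots.
work_dir="$(mktemp -d)"
trap 'rm -rf "$work_dir"' EXIT
old_prefix="$work_dir/stage0/sysroot/usr"
new_prefix="$work_dir/final/sysroot/usr"
mkdir -p "$new_prefix/lib/pkgconfig"

# A made-up .pc file as it might look right after being copied from stage0.
cat > "$new_prefix/lib/pkgconfig/zlib.pc" <<EOF
prefix=$old_prefix
includedir=\${prefix}/include
Cflags: -I\${includedir}
EOF

# Same substitution as in the diff: swap the stage0 prefix for the destination one.
find "$new_prefix/lib/pkgconfig" -name '*.pc' -exec \
  sed -i "s|${old_prefix}|${new_prefix}|g" {} +

rewritten_prefix="$(grep '^prefix=' "$new_prefix/lib/pkgconfig/zlib.pc")"
echo "$rewritten_prefix"
```

Because the other variables in the file are derived from `prefix`, rewriting that one absolute path is enough for `Cflags` to resolve inside the destination sysroot.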

STAGE1_SYSROOT=$TOOLCHAIN_DIR/stage1/$TUPLE/$TUPLE/sysroot
if [[ ! -e $STAGE1_SYSROOT/usr/lib/gcc ]]; then
( cd $STAGE1_SYSROOT/usr/lib; \
@@ -313,6 +320,11 @@ if [[ ! -d ${LLVM_SRC} ]]; then
git clone https://github.com/llvm/llvm-project.git llvm -b llvmorg-$LLVM_VERSION --single-branch --depth 1
fi

+# Patch LLVM SmallVector.h to add missing #include <cstdint> required by newer GCC.
+if ! grep -q '#include <cstdint>' ${LLVM_SRC}/llvm/include/llvm/ADT/SmallVector.h; then
+sed -i '/#include <algorithm>/a #include <cstdint>' ${LLVM_SRC}/llvm/include/llvm/ADT/SmallVector.h
+fi
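
The `grep` guard is what makes this patch safe to re-run when the script is relaunched. The sketch below applies the same guard-then-append pattern to a throwaway header. The `patch_header` function and the file contents are hypothetical, and the one-line `a` form relies on GNU `sed`.

```shell
# Throwaway header standing in for SmallVector.h.
header="$(mktemp)"
trap 'rm -f "$header"' EXIT
printf '%s\n' '#include <algorithm>' '#include <cassert>' > "$header"

patch_header() {
  # Append the include after <algorithm> only if it is not already present.
  if ! grep -q '#include <cstdint>' "$1"; then
    sed -i '/#include <algorithm>/a #include <cstdint>' "$1"
  fi
}

patch_header "$header"
patch_header "$header"   # second call finds the include and changes nothing
include_count="$(grep -c '#include <cstdint>' "$header")"
cat "$header"
```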

LLVM_DISABLED_TOOLS="-DLLVM_TOOL_BUGPOINT_BUILD=OFF"
LLVM_DISABLED_TOOLS="${LLVM_DISABLED_TOOLS} -DLLVM_TOOL_BUGPOINT_PASSES_BUILD=OFF"
LLVM_DISABLED_TOOLS="${LLVM_DISABLED_TOOLS} -DLLVM_TOOL_DSYMUTIL_BUILD=OFF"
@@ -326,15 +338,15 @@ cxx_compiler="g++" \
install_dir="$PREFIX" \
llvm_projects='clang;lld' \
targets_to_build="$LLVM_MACHINE" \
-additional_linker_flags="" \
-additional_compiler_flags="-s" \
-additional_cmake="" \
+additional_linker_flags="-static-libstdc++ -static-libgcc" \
+additional_compiler_flags="" \
+additional_cmake="-DLLVM_BUILD_LLVM_DYLIB=OFF -DLLVM_LINK_LLVM_DYLIB=OFF" \
build_llvm

build_folder="build-compilerrt-builtins" \
cc_compiler="clang" \
cxx_compiler="clang++" \
-install_dir="$PREFIX/lib/clang/$LLVM_VERSION" \
+install_dir="$PREFIX/lib/clang/${LLVM_VERSION%%.*}" \
additional_linker_flags="" \
additional_cmake="" \
build_compiler-rt-builtins
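
The new `install_dir` uses the `${LLVM_VERSION%%.*}` parameter expansion, which strips the longest suffix matching `.*` and so keeps only the major version. The change is presumably needed because newer Clang releases name the resource directory after the major version alone. That is an assumption, since the diff does not state the reason. A short sketch of the expansion, with hypothetical variable names other than `LLVM_VERSION`:

```shell
LLVM_VERSION="18.1.8"

# %%.* removes the longest suffix that starts with a dot: major version only.
llvm_major="${LLVM_VERSION%%.*}"
# %.* removes the shortest such suffix: major.minor, shown for contrast.
llvm_major_minor="${LLVM_VERSION%.*}"

echo "lib/clang/${llvm_major}"    # prints lib/clang/18
echo "${llvm_major_minor}"        # prints 18.1
```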
@@ -344,19 +356,17 @@ cc_compiler="clang" \
cxx_compiler="clang++" \
install_dir="$PREFIX" \
llvm_projects='libcxx;libcxxabi;libunwind' \
-targets_to_build="$LLVM_MACHINE;BPF" \
additional_linker_flags="" \
-additional_cmake="" \
+additional_cmake="-DCMAKE_CXX_STANDARD=20" \
build_compiler_libs

build_folder="build-libcxx" \
cc_compiler="clang" \
cxx_compiler="clang++" \
install_dir="$TOOLCHAIN_DIR/final/$TUPLE/$TUPLE/sysroot/usr" \
llvm_projects='libcxx;libcxxabi;libunwind' \
-targets_to_build="$LLVM_MACHINE;BPF" \
additional_linker_flags="" \
-additional_cmake="" \
+additional_cmake="-DCMAKE_CXX_STANDARD=20" \
build_compiler_libs

# Remove the static libclang/liblld from the sysroot
@@ -373,9 +383,10 @@ llvm_additional_cmake="-DCOMPILER_RT_INSTALL_PATH=${PREFIX}"
llvm_additional_cmake="${llvm_additional_cmake} -DCLANG_DEFAULT_CXX_STDLIB=libc++"
llvm_additional_cmake="${llvm_additional_cmake} -DCLANG_DEFAULT_LINKER=lld"
llvm_additional_cmake="${llvm_additional_cmake} -DCLANG_DEFAULT_RTLIB=compiler-rt"
-llvm_additional_cmake="${llvm_additional_cmake} -DLLVM_USE_LINKER=lld"
llvm_additional_cmake="${llvm_additional_cmake} -DLLVM_ENABLE_LIBCXX=ON"
+llvm_additional_cmake="${llvm_additional_cmake} -DLLVM_USE_LINKER=lld"
llvm_additional_cmake="${llvm_additional_cmake} -DCOMPILER_RT_USE_BUILTINS_LIBRARY=ON"
+llvm_additional_cmake="${llvm_additional_cmake} -DCMAKE_CXX_STANDARD=20"

build_folder="build-llvm-final" \
cc_compiler="clang" \
@@ -400,7 +411,8 @@ PREFIX=$SYSROOT/usr
( cd $PREFIX/bin; \
rm -f gcc; \
rm -f g++; \
-rm -f gcc-${GCC_VERSION}; \
+rm -f gcc-*; \
+rm -f lto-dump; \
rm -f c++; \
rm -f cc; \
rm -f ld; \
@@ -413,7 +425,8 @@ PREFIX=$SYSROOT/usr
symlinks_to_transform=(
lib/gcc
bin/addr2line
-bin/ar bin/as
+bin/ar
+bin/as
bin/c++filt
bin/cpp
bin/elfedit
8 changes: 4 additions & 4 deletions config
@@ -4,12 +4,12 @@
# The vendor part has to be changed also in the crosstool-ng-config file
TUPLE=$MACHINE-osquery-linux-gnu

-ZLIB_VER="1.2.13"
+ZLIB_VER="1.3.1"
ZLIB_URL="https://zlib.net/fossils/zlib-${ZLIB_VER}.tar.gz"
-ZLIB_SHA="b3a24de97a8fdbc835b9833169501030b8977031bcb54b3b3ac13740f846ab30"
+ZLIB_SHA="9a93b2b7dfdac77ceba5a558a580e74667dd6fede4585b91eefb60f03b72df23"

-LLVM_VERSION="11.0.0"
-GCC_VERSION="8.3.0" # Notice: this has to match the same version that has been configured in crosstool-ng-config
+LLVM_VERSION="18.1.8"
+GCC_VERSION="13.4.0" # Notice: this has to match the same version that has been configured in crosstool-ng-config

PARALLEL_JOBS=$(( $(nproc)+1 ))
BUILD_GENERATOR="Ninja"
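
Since `ZLIB_VER` and `ZLIB_SHA` have to change together, a quick local check of a downloaded archive against the pinned digest can catch a mismatch early. How `build.sh` itself consumes `ZLIB_SHA` is not visible in this hunk, so the sketch below is only an illustration. The `verify_sha256` helper is hypothetical and is exercised on an empty temporary file, not on the zlib tarball. It assumes GNU coreutils `sha256sum`.

```shell
verify_sha256() {
  # $1: file to check, $2: expected lowercase hex SHA-256 digest.
  # sha256sum --check reads "<digest>  <file>" lines; --status suppresses output
  # and reports the result through the exit code only.
  printf '%s  %s\n' "$2" "$1" | sha256sum --check --status
}

sample="$(mktemp)"   # empty file used as a stand-in for a downloaded archive
trap 'rm -f "$sample"' EXIT
# Well-known SHA-256 digest of empty input.
empty_sha="e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

if verify_sha256 "$sample" "$empty_sha"; then
  echo "checksum ok"
else
  echo "checksum mismatch" >&2
fi
```

For the real archive the call would look like `verify_sha256 "zlib-${ZLIB_VER}.tar.gz" "$ZLIB_SHA"`, run from the directory that holds the download.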