26 changes: 25 additions & 1 deletion .github/workflows/cpp.yml
@@ -102,6 +102,14 @@ jobs:
runs-on: ubuntu-24.04-arm
title: ARM64 Ubuntu 22.04 C++
ubuntu: 22.04
- arch: riscv64
clang-tools: 18
image: ubuntu-cpp
llvm: 18
runs-on: ubuntu-24.04-riscv
title: RISCV64 Ubuntu 24.04 C++
ubuntu: 24.04
Comment on lines +105 to +112
Copilot AI Mar 22, 2026
The new riscv64 job will likely run with reduced filesystem test coverage: on riscv64, ci/scripts/install_minio.sh and ci/scripts/install_gcs_testbench.sh currently skip installation for unsupported architectures, and ci/scripts/cpp_test.sh then excludes arrow-s3fs-test / arrow-gcsfs-test when minio / storage-testbench aren’t present. If the intent is a “regular” C++ CI job, consider adding riscv64 support for these dependencies (or making the reduced coverage explicit in the workflow/job config).

Suggested change
- arch: riscv64
clang-tools: 18
image: ubuntu-cpp
llvm: 18
runs-on: ubuntu-24.04-riscv
title: RISCV64 Ubuntu 24.04 C++
ubuntu: 24.04
# NOTE: On riscv64, MinIO and the GCS storage testbench are not installed,
# so S3/GCS filesystem tests (arrow-s3fs-test / arrow-gcsfs-test) run with
# reduced coverage or are skipped. This job is intentionally marked as such.
- arch: riscv64
clang-tools: 18
image: ubuntu-cpp
llvm: 18
runs-on: ubuntu-24.04-riscv
title: RISCV64 Ubuntu 24.04 C++ (reduced filesystem tests)
ubuntu: 24.04
filesystem-tests: reduced
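The exclusion behavior Copilot describes can be sketched as a small helper. This is an illustration only: the test names come from the comment above, but the function and its availability-flag arguments are invented, not the actual variables used by ci/scripts/cpp_test.sh.

```shell
# Hypothetical sketch of the test-exclusion logic described above: given
# which backing services are available, compute the set of filesystem
# tests to skip.
compute_excluded_tests() {
  local have_minio="$1" have_testbench="$2" excluded=""
  if [ "$have_minio" != "yes" ]; then
    excluded="${excluded}|arrow-s3fs-test"    # no MinIO -> skip S3 tests
  fi
  if [ "$have_testbench" != "yes" ]; then
    excluded="${excluded}|arrow-gcsfs-test"   # no testbench -> skip GCS tests
  fi
  printf '%s\n' "${excluded#|}"               # strip the leading separator
}

compute_excluded_tests no no    # -> arrow-s3fs-test|arrow-gcsfs-test
compute_excluded_tests yes no   # -> arrow-gcsfs-test
```

On riscv64, where neither service installs, both tests would be excluded — exactly the reduced coverage the suggested comment makes explicit.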

buildkit: "0"
env:
ARCH: ${{ matrix.arch }}
ARROW_SIMD_LEVEL: ${{ matrix.simd-level }}
@@ -128,17 +136,33 @@ jobs:
python-version: 3
- name: Setup Python on self-hosted runner
if: |
contains(matrix.runs-on, 'self-hosted')
contains(matrix.runs-on, 'self-hosted') ||
matrix.runs-on == 'ubuntu-24.04-riscv'
run: |
sudo apt update
sudo apt install -y --no-install-recommends python3 python3-dev python3-pip
python3 -m pip install -U pip
echo "$HOME/.local/bin" >> $GITHUB_PATH
Comment on lines 143 to +146
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Read the relevant section of the cpp.yml workflow file
sed -n '125,165p' .github/workflows/cpp.yml

Repository: gounthar/arrow



🏁 Script executed:

# Check if there are any other references to Ubuntu 24.04 or Python management in the repo
rg -i "ubuntu-24|externally.managed|pep.668" --type yaml --type markdown



🌐 Web query:

Ubuntu 24.04 Python externally managed PEP 668

💡 Result:

In Ubuntu 24.04 LTS (Noble Numbat), the system Python 3.12 installation is marked as "externally managed" per PEP 668. This prevents pip from installing packages into the system Python environment, to avoid conflicts with the apt package manager. The marker file is located at /usr/lib/python3.12/EXTERNALLY-MANAGED; when pip detects it, it blocks installations (system-wide or --user) and displays an error recommending alternatives.

Recommended ways to install Python packages:

1. Use apt for available packages: sudo apt install python3-<package>
2. Use virtual environments: sudo apt install python3-full python3-venv; python3 -m venv /path/to/venv; source /path/to/venv/bin/activate; pip install <package>
3. For standalone apps: sudo apt install pipx; pipx install <app>
4. Override (not recommended): pip install --break-system-packages <package>

See /usr/share/doc/python3.12/README.venv for Ubuntu-specific venv guidance. This behavior started in Ubuntu 23.04 and continues in 24.04.


🏁 Script executed:

# Check if there are any discussions or issues about this Python setup in the codebase
rg -i "break-system-packages|venv|virtualenv|pip install" .github/workflows/ -A 2 -B 2 | head -80



🏁 Script executed:

# Check if there are any related GitHub runner or CI discussions in comments
rg -i "riscv|ubuntu-24|pip|venv" .github/workflows/cpp.yml -B 3 -A 3 | grep -A 5 -B 5 "riscv64"



Use a venv for the RISC-V Python bootstrap to avoid PEP 668 externally-managed Python restrictions.

Ubuntu 24.04 marks the system Python as externally managed (per PEP 668), which blocks pip from installing to the system environment. The current code will fail at line 144 (python3 -m pip install -U pip) and again at the "Setup Archery" step that follows. Create a job-local virtualenv and add its bin directory to GITHUB_PATH instead.

Suggested fix
-          sudo apt install -y --no-install-recommends python3 python3-dev python3-pip
-          python3 -m pip install -U pip
-          echo "$HOME/.local/bin" >> $GITHUB_PATH
+          sudo apt install -y --no-install-recommends python3 python3-dev python3-pip python3-venv
+          python3 -m venv .venv
+          echo "$GITHUB_WORKSPACE/.venv/bin" >> "$GITHUB_PATH"
+          . .venv/bin/activate
+          python -m pip install -U pip
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/cpp.yml around lines 142 - 145, The workflow currently
calls "python3 -m pip install -U pip" and adds "$HOME/.local/bin" to
GITHUB_PATH, which will fail under PEP 668; instead create a job-local virtual
environment (use python3 -m venv <envdir>), install/upgrade pip into that venv,
and add the venv's bin directory to GITHUB_PATH so subsequent steps (e.g.,
"Setup Archery") use the venv; update the steps surrounding the existing
"python3 -m pip install -U pip" and the "$HOME/.local/bin" echo to reference the
venv directory you create.
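The PEP 668 condition driving this suggestion can also be checked programmatically. A minimal sketch using only the standard library; the helper names are invented for illustration.

```python
# Minimal sketch: detect the PEP 668 "externally managed" marker and
# whether we are already inside a virtual environment. Per PEP 668 the
# marker file lives in the interpreter's stdlib directory (on Ubuntu
# 24.04 that is /usr/lib/python3.12/EXTERNALLY-MANAGED).
import os
import sys
import sysconfig

def is_externally_managed() -> bool:
    marker = os.path.join(sysconfig.get_path("stdlib"), "EXTERNALLY-MANAGED")
    return os.path.exists(marker)

def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the venv while sys.base_prefix
    # still points at the system installation.
    return sys.prefix != sys.base_prefix

if is_externally_managed() and not in_virtualenv():
    print("pip install would be blocked here; create a venv first")
```

This is why the suggested fix creates a venv and prepends its bin directory to GITHUB_PATH: inside the venv, pip targets the venv rather than the system interpreter, so the marker no longer applies.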

- name: Setup Docker on RISE runner
if: matrix.runs-on == 'ubuntu-24.04-riscv'
run: |
# Install Docker Compose plugin (not in Ubuntu riscv64 repos)
COMPOSE_VERSION=$(curl -s https://api.github.com/repos/docker/compose/releases/latest | grep -oP '"tag_name": "\K[^"]+')
sudo mkdir -p /usr/local/lib/docker/cli-plugins
sudo curl -SL "https://github.com/docker/compose/releases/download/${COMPOSE_VERSION}/docker-compose-linux-riscv64" \
-o /usr/local/lib/docker/cli-plugins/docker-compose
sudo chmod +x /usr/local/lib/docker/cli-plugins/docker-compose

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# Read the workflow file to understand the context
cat -n .github/workflows/cpp.yml



Don't install a floating Compose binary into /usr/local from the workflow.

This mutates shared host state on the self-hosted runner and resolves releases/latest on every run. On a reused runner, later jobs inherit whichever Compose version last ran. Please either provision Compose on the runner image, or install a pinned, checksum-verified version into a job-scoped location and clean it up after the job.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/cpp.yml around lines 149 - 154, The workflow installs an
unpinned Compose binary into /usr/local/lib/docker/cli-plugins using
COMPOSE_VERSION resolved from releases/latest which mutates the self-hosted
runner; instead, pin a specific Compose release (avoid using releases/latest),
verify the binary checksum before use, install it into a job-scoped directory
(e.g. $RUNNER_TEMP or another ephemeral path) rather than
/usr/local/lib/docker/cli-plugins, and remove the binary at the end of the job;
update the steps that set COMPOSE_VERSION and the target path
(/usr/local/lib/docker/cli-plugins/docker-compose) to use the pinned version,
checksum verification, ephemeral install location, and a cleanup step, or
alternatively provision Compose into the runner image and remove the install
logic entirely.
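The pin-and-verify half of that suggestion can be sketched as follows. The helper name is invented, the checksum values in the demo are stand-ins, and the actual curl of the pinned Compose release into "$RUNNER_TEMP" is elided.

```shell
# Hypothetical helper: refuse to install a downloaded binary unless it
# matches a checksum recorded alongside the pinned version. A real step
# would download the pinned docker-compose release first and remove the
# plugin again in an always() cleanup step.
verify_and_install() {
  local file="$1" expected_sha256="$2" dest_dir="$3"
  echo "${expected_sha256}  ${file}" | sha256sum -c - > /dev/null 2>&1 || return 1
  mkdir -p "$dest_dir"
  install -m 0755 "$file" "$dest_dir/docker-compose"
}

# Demo with a stand-in file instead of a real Compose binary.
tmp="$(mktemp -d)"
printf 'fake compose binary\n' > "$tmp/compose"
good_sha="$(sha256sum "$tmp/compose" | cut -d' ' -f1)"
bad_sha="0000000000000000000000000000000000000000000000000000000000000000"
verify_and_install "$tmp/compose" "$good_sha" "$tmp/plugins" && echo "installed"
verify_and_install "$tmp/compose" "$bad_sha" "$tmp/plugins" || echo "rejected"
```

Running the demo prints "installed" then "rejected": the mismatched checksum aborts before anything is copied, which is the property the review asks for.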

docker compose version
# Debug Docker setup
docker info 2>&1 | head -30
docker buildx ls 2>&1
- name: Setup Archery
run: python3 -m pip install -e dev/archery[docker]
- name: Execute Docker Build
env:
ARCHERY_DOCKER_USER: ${{ secrets.DOCKERHUB_USER }}
ARCHERY_DOCKER_PASSWORD: ${{ secrets.DOCKERHUB_TOKEN }}
DOCKER_BUILDKIT: ${{ matrix.buildkit || '1' }}
run: |
# GH-40558: reduce ASLR to avoid ASAN/LSAN crashes
sudo sysctl -w vm.mmap_rnd_bits=28