[Bug] Build image cbdb-build-rocky9-latest: dnf fails on Rocky minor-version transitions due to pinned 9.7 baseurls #1709

@talmacschen-arch

Apache Cloudberry version

Cloudberry 2.1.x / 2.0.x

What happened

Summary

The CI build image apache/incubator-cloudberry:cbdb-build-rocky9-latest
ships dnf repository files with baseurl paths pinned to a specific
Rocky Linux minor version (currently 9.7). Whenever Rocky publishes
the next point release (now 9.8), upstream mirrors stop serving the
old minor, and the previous tree is moved to vault.rockylinux.org. From
that moment, every dnf install inside the image fails with hundreds of
404s on *-modules.yaml.xz, *-primary.xml.gz, *-filelists.xml.gz,
etc. This breaks any downstream CI that does runtime package installs
on top of this image, including apache/cloudberry-backup.

What you think should happen instead

Root cause

/etc/yum.repos.d/rocky*.repo (or the mirrorlist= URLs they generate)
embed the minor version 9.7 instead of the rolling major 9 (or
$releasever mapped to 9). Mirror network behavior on Rocky point
releases has been consistent for years: only the current minor is on
live mirrors; older minors move to vault.rockylinux.org.
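To confirm where the pin lives, something like the following could be run inside the image. It is only a sketch: the function name `find_pinned_minor` is made up for illustration, and it assumes the minor version appears literally in a `baseurl=` or `mirrorlist=` line. If the repo files use `$releasever` instead, this finds nothing and the pin would have to come from the variable's value.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list repo-file lines that hard-code a Rocky
# minor version such as 9.7 in a baseurl= or mirrorlist= URL.
find_pinned_minor() {
  # Default to the standard dnf repo directory; a different
  # directory can be passed as the first argument.
  local repo_dir="${1:-/etc/yum.repos.d}"
  # Match "/rocky/<major>.<minor>/"; -H prints the file name and
  # -n the line number. Exit status is 1 when nothing is pinned.
  grep -HnE '^(baseurl|mirrorlist)=.*/rocky/[0-9]+\.[0-9]+/' "$repo_dir"/*.repo
}
```

Called with no argument it scans /etc/yum.repos.d; each output line has the form `file:line:text`.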

Proposed fixes (any one is sufficient)

  1. Rolling major in baseurl (preferred). Change repo baseurl
    from .../rocky/9.7/... to .../rocky/9/... (or rely on the
    mirrorlist= form that Rocky's official rocky.repo uses, which
    resolves the current point release automatically). This makes the
    image resilient across future 9.x → 9.y transitions without any
    maintenance.
  2. Rebuild image on Rocky 9.8 base. Short-term unblock, but the
    same problem will recur at 9.9.
  3. Add vault.rockylinux.org as a fallback baseurl. Keeps the
    pinned-minor model working after the transition. Slower than live
    mirrors but always available.
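For option 1, the change could be as small as one sed pass over the repo files at image build time. This is a rough sketch, not the actual build step: `unpin_minor` is a hypothetical name, and it assumes the pin is a literal `/rocky/<major>.<minor>/` segment in uncommented `baseurl=` lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite ".../rocky/<major>.<minor>/..." to
# ".../rocky/<major>/..." in baseurl= lines of every .repo file.
unpin_minor() {
  local repo_dir="${1:-/etc/yum.repos.d}"
  local f
  for f in "$repo_dir"/*.repo; do
    # Skip the unexpanded glob when the directory has no .repo files.
    [ -e "$f" ] || continue
    # -i.bak edits in place and leaves a backup next to the file;
    # this spelling is accepted by both GNU and BSD sed.
    sed -E -i.bak 's#^(baseurl=.*/rocky/[0-9]+)\.[0-9]+/#\1/#' "$f"
  done
}
```

The `.bak` backups are left in place so the edit can be reverted; they can be deleted once the rewritten files are verified.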

In addition, it would help downstream CIs if commonly required locale
packages were pre-installed in the image (e.g.
glibc-langpack-de, glibc-langpack-en), so that a transient mirror
outage upstream doesn't cascade into every Cloudberry-related test job.

Impact

Until the image is republished, every CI run against
apache/cloudberry-backup (and likely any other repo using this image
plus a runtime dnf install) will fail at the package-install step,
regardless of code changes in the PR. We are currently working around
it by rewriting the repo files to point at vault.rockylinux.org in
the workflow itself, but that is brittle and not appropriate as a
long-term fix.
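For reference, the workaround is roughly of the following shape. Treat it as a sketch under assumptions: `repoint_baseurl` is a hypothetical name, the two URL prefixes are supplied by the caller because the real paths depend on what the image's repo files contain, and the prefixes are assumed to be plain URLs without `,`, `&`, or `$` characters (the old prefix is used as a regular expression, so a `.` in it matches any character).

```shell
#!/usr/bin/env bash
# Hypothetical helper: disable mirrorlist= lines and point baseurl=
# lines at a different URL prefix (for example a vault mirror).
# Usage: repoint_baseurl REPO_DIR OLD_PREFIX NEW_PREFIX
repoint_baseurl() {
  local repo_dir="$1" old_prefix="$2" new_prefix="$3"
  local f
  for f in "$repo_dir"/*.repo; do
    # Skip the unexpanded glob when the directory has no .repo files.
    [ -e "$f" ] || continue
    # First expression: comment out mirrorlist= so dnf falls back to baseurl=.
    # Second expression: uncomment baseurl= if needed and swap the prefix.
    sed -E -i.bak \
      -e 's,^mirrorlist=,#mirrorlist=,' \
      -e "s,^#?baseurl=${old_prefix},baseurl=${new_prefix}," "$f"
  done
}
```

Because this runs in every workflow and depends on the exact layout of the repo files, it breaks silently whenever the image changes them, which is why it is not a good long-term fix.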

Happy to send a PR against whichever Dockerfile produces this image if
the maintainers can point me at it.

How to reproduce

Reproduction

Inside a fresh container of apache/incubator-cloudberry:cbdb-build-rocky9-latest:

dnf install -y glibc-langpack-de

fails with:

Errors during downloading metadata for repository 'appstream':
  - Status code: 404 for http://.../rocky/9.7/AppStream/x86_64/os/repodata/...
Error: Failed to download metadata for repo 'appstream'

(All ~40 mirrors in the mirrorlist 404 because none of them carry 9.7/
on the live tree anymore.)

Real CI failure for reference:

- Repo: apache/cloudberry-backup
- Workflow: cloudberry-backup-ci.yml, step Setup Locale for Integration Tests
- Symptom: dnf install -y glibc-langpack-de fails with the 404 storm above

Operating System

Rocky Linux 9.x

Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/cloudberry/blob/main/CODE_OF_CONDUCT.md).

Labels: type: Bug
