
fold 'kola testiso' into 'kola run'#4377

Open
nikita-dubrovskii wants to merge 27 commits into coreos:main from nikita-dubrovskii:fold_testiso

@nikita-dubrovskii
Contributor

This huge PR folds all testiso tests into 'kola run':

$ kola run iso.* --parallel 4 --inst-insecure
=== RUN   iso.as-disk.uefi
=== RUN   iso.miniso-install.nm
=== RUN   iso.miniso-install.4k.uefi
=== RUN   iso.pxe-offline-install.rootfs-appended
=== RUN   iso.live-login.uefi
=== RUN   iso.offline-install-iscsi.ibft.uefi
=== RUN   iso.as-disk.uefi-secure
=== RUN   iso.fips.uefi
=== RUN   iso.pxe-online-install
=== RUN   iso.live-login
=== RUN   iso.miniso-install.4k.nm.uefi
=== RUN   iso.install
=== RUN   iso.offline-install-fromram.4k.uefi
=== RUN   iso.offline-install-iscsi.ibft-with-mpath
=== RUN   iso.offline-install-iscsi.manual
=== RUN   iso.miniso-install
=== RUN   iso.offline-install
=== RUN   iso.pxe-online-install.4k.uefi
=== RUN   iso.live-login.uefi-secure
=== RUN   iso.as-disk
=== RUN   iso.pxe-offline-install.4k.uefi
=== RUN   iso.offline-install.mpath
--- PASS: iso.as-disk.uefi (37.11s)
--- PASS: iso.live-login.uefi-secure (39.32s)
--- PASS: iso.as-disk.uefi-secure (37.72s)
--- PASS: iso.offline-install-fromram.4k.uefi (92.64s)
--- PASS: iso.install (89.74s)
--- PASS: iso.offline-install.mpath (129.49s)
--- PASS: iso.live-login (55.92s)
--- PASS: iso.fips.uefi (49.65s)
--- PASS: iso.miniso-install.4k.nm.uefi (112.01s)
--- PASS: iso.as-disk (36.91s)
--- PASS: iso.pxe-online-install (92.90s)
--- PASS: iso.pxe-offline-install.4k.uefi (97.06s)
--- PASS: iso.miniso-install (94.49s)
Warning: Cannot announce submounts, client does not support it
--- PASS: iso.pxe-online-install.4k.uefi (82.69s)
--- PASS: iso.offline-install (100.28s)
--- PASS: iso.live-login.uefi (24.93s)
Warning: Cannot announce submounts, client does not support it
--- PASS: iso.pxe-offline-install.rootfs-appended (120.32s)
--- PASS: iso.miniso-install.4k.uefi (82.57s)
--- PASS: iso.offline-install-iscsi.ibft.uefi (174.86s)
--- PASS: iso.miniso-install.nm (94.50s)
--- PASS: iso.offline-install-iscsi.ibft-with-mpath (166.62s)
--- PASS: iso.offline-install-iscsi.manual (151.04s)
PASS, output in tmp/kola/qemu-2025-11-24-1353-1209

#3989

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request is a large but valuable refactoring that folds the kola testiso command into kola run. This improves the consistency and maintainability of the test suite. The implementation moves the ISO test logic into a new mantle/kola/tests/iso package and refactors the QEMU cluster platform API to be more extensible and modular, which is a great improvement.

I've found a few issues that could be improved:

  • There are a couple of places where panic() is used inside a goroutine for error handling, which can crash the entire test runner. It would be better to handle these errors more gracefully, for example by logging them.
  • There is a potential goroutine leak in the awaitCompletion helper function due to a time.Sleep that doesn't respect context cancellation.
  • A file path is constructed using strings.Replace, which is fragile. Using the path/filepath package would be more robust.

Overall, this is a solid refactoring. My comments are focused on improving robustness and resource management.

@nikita-dubrovskii
Contributor Author

It's not 100% ready yet: I'm reworking the metal PXE and ISO installation, so those tests still rely on code from testiso. It would be great to get some early feedback on the work done so far.

nikita-dubrovskii force-pushed the fold_testiso branch 3 times, most recently from 9499239 to 9816db6 (November 28, 2025 17:16)
nikita-dubrovskii changed the title from "WIP: fold 'kola testiso' into 'kola run'" to "fold 'kola testiso' into 'kola run'" (Nov 28, 2025)
@nikita-dubrovskii
Contributor Author

nikita-dubrovskii commented Nov 28, 2025

Finished the metal PXE and ISO installation. Almost ready for review.
Tested on FCOS & RHCOS:

[coreos-assembler]$ ./kola run iso.* --inst-insecure --parallel 3
⏭️  Skipping kola test pattern "ext.config.version.rhaos-pkgs-match-openshift":
  👉 https://issues.redhat.com/browse/OCPBUGS-42688
⏭️  Skipping kola test pattern "*kdump*":
  👉 https://gitlab.com/redhat/centos-stream/containers/bootc/-/issues/1169
=== RUN   iso.offline-install.mpath
=== RUN   iso.as-disk.uefi-secure
=== RUN   iso.pxe-online-install.4k.uefi
=== RUN   iso.install
=== RUN   iso.miniso-install.4k.nm.uefi
=== RUN   iso.as-disk.uefi
=== RUN   iso.live-login.uefi-secure
=== RUN   iso.live-login.uefi
=== RUN   iso.offline-install-iscsi.ibft.uefi
=== RUN   iso.as-disk
=== RUN   iso.offline-install-iscsi.manual
=== RUN   iso.pxe-offline-install.rootfs-appended
=== RUN   iso.offline-install
=== RUN   iso.fips.uefi
=== RUN   iso.live-login
=== RUN   iso.pxe-online-install
=== RUN   iso.miniso-install
=== RUN   iso.offline-install-iscsi.ibft-with-mpath
=== RUN   iso.miniso-install.nm
=== RUN   iso.miniso-install.4k.uefi
=== RUN   iso.pxe-offline-install.4k.uefi
=== RUN   iso.offline-install-fromram.4k.uefi
--- PASS: iso.miniso-install.nm (93.84s)
--- PASS: iso.offline-install.mpath (104.39s)
--- PASS: iso.offline-install (104.41s)
--- PASS: iso.live-login.uefi-secure (25.18s)
--- PASS: iso.as-disk (24.35s)
Warning: Cannot announce submounts, client does not support it
--- PASS: iso.pxe-offline-install.rootfs-appended (136.66s)
--- PASS: iso.offline-install-iscsi.manual (150.40s)
--- PASS: iso.live-login.uefi (27.27s)
--- PASS: iso.as-disk.uefi (25.60s)
--- PASS: iso.offline-install-iscsi.ibft.uefi (156.81s)
--- PASS: iso.offline-install-fromram.4k.uefi (89.36s)
--- PASS: iso.install (77.80s)
--- PASS: iso.pxe-online-install (87.25s)
Warning: Cannot announce submounts, client does not support it
--- PASS: iso.miniso-install.4k.nm.uefi (96.41s)
--- PASS: iso.pxe-online-install.4k.uefi (80.17s)
--- PASS: iso.live-login (38.12s)
--- PASS: iso.as-disk.uefi-secure (35.28s)
--- PASS: iso.offline-install-iscsi.ibft-with-mpath (154.64s)
--- PASS: iso.miniso-install (90.46s)
--- PASS: iso.fips.uefi (48.88s)
--- PASS: iso.miniso-install.4k.uefi (81.84s)
--- PASS: iso.pxe-offline-install.4k.uefi (92.00s)
PASS, output in tmp/kola/qemu-2025-11-28-1739-46

@nikita-dubrovskii
Contributor Author

s390x run:

$ cosa kola run iso.* --parallel 2 --rerun --allow-rerun-success=tags=all --inst-insecure
⏭️   Skipping kola test pattern "basic.uefi-secure":
  👉 https://github.com/openshift/os/issues/1237
⏭️   Skipping kola test pattern "iso-live-login.uefi-secure":
  👉 https://github.com/openshift/os/issues/1237
⏭️   Skipping kola test pattern "iso-as-disk.uefi-secure":
  👉 https://github.com/openshift/os/issues/1237
⏭️   Skipping kola test pattern "ext.config.shared.security.lockdown":
  👉 https://github.com/openshift/os/issues/1237
⏭️   Skipping kola test pattern "ostree.sync":
  👉 https://github.com/openshift/os/issues/1720
⏭️   Skipping kola test pattern "ostree.sync":
  👉 https://github.com/openshift/os/issues/1751
=== RUN   iso.miniso-install.nm
=== RUN   iso.offline-install.4k
=== RUN   iso.live-login
=== RUN   iso.pxe-online-install.rootfs-appended
=== RUN   iso.offline-install
=== RUN   iso.miniso-install.4k.nm
=== RUN   iso.miniso-install
=== RUN   iso.offline-install.mpath
=== RUN   iso.pxe-offline-install
--- PASS: iso.miniso-install.nm (130.02s)
--- PASS: iso.miniso-install.4k.nm (132.90s)
--- PASS: iso.pxe-online-install.rootfs-appended (138.85s)
--- PASS: iso.pxe-offline-install (147.06s)
--- PASS: iso.live-login (44.08s)
qemu-system-s390x: -drive if=none,id=mpath10,format=raw,file=nbd:unix:/var/tmp/mantle-qemu17767552/disk2458675877.socket,media=disk,auto-read-only=off,cache=unsafe: Failed to connect to '/var/tmp/mantle-qemu17767552/disk2458675877.socket': No such file or directory
--- FAIL: iso.offline-install.mpath (31.69s)
        live-iso.go:217: unable to create test machine: failed to establish qmp connection: dial unix /var/tmp/mantle-qemu17767552/qmp-1764689554212648964.sock: connect: connection refused
--- PASS: iso.offline-install (135.35s)
--- PASS: iso.miniso-install (133.21s)
--- PASS: iso.offline-install.4k (148.88s)
FAIL, output in tmp/kola


======== Re-running failed tests (flake detection) ========

⏭️   Skipping kola test pattern "basic.uefi-secure":
  👉 https://github.com/openshift/os/issues/1237
⏭️   Skipping kola test pattern "iso-live-login.uefi-secure":
  👉 https://github.com/openshift/os/issues/1237
⏭️   Skipping kola test pattern "iso-as-disk.uefi-secure":
  👉 https://github.com/openshift/os/issues/1237
⏭️   Skipping kola test pattern "ext.config.shared.security.lockdown":
  👉 https://github.com/openshift/os/issues/1237
⏭️   Skipping kola test pattern "ostree.sync":
  👉 https://github.com/openshift/os/issues/1720
⏭️   Skipping kola test pattern "ostree.sync":
  👉 https://github.com/openshift/os/issues/1751
=== RUN   iso.offline-install.mpath
--- PASS: iso.offline-install.mpath (152.11s)
PASS, output in tmp/kola/rerun
+ rc=0
+ set +x

@dustymabe
Member

@nikita-dubrovskii this is an incredible amount of work. Thank you for taking this on.

This does make it hard to grok all in one go, though. I'm still struggling a bit with the first commit even. As I go I'm questioning old code that we have and I want to challenge if some of it is necessary.

I broke out some initial commits that challenge the original code (and also a few commits from this PR) into #4486

I also rebased the commits from this PR on top of that in https://github.com/dustymabe/coreos-assembler/commits/dusty-kola-testiso-fold/ if you're interested in a rebased version of this PR.

Let me know what you think.

@nikita-dubrovskii
Contributor Author

@dustymabe Thx, checked your PR. Looks sane to me, and many thanks for better commit messages, describing purpose of changes. The only thing holding me back is dealing with rebase&merge conflicts, but not a big deal.

@dustymabe
Member

@dustymabe Thx, checked your PR. Looks sane to me, and many thanks for better commit messages, describing purpose of changes.

Thanks!

The only thing holding me back is dealing with rebase&merge conflicts, but not a big deal.

Those should be handled already over in https://github.com/dustymabe/coreos-assembler/commits/dusty-kola-testiso-fold/

Once #4486 merges (if it gets approved) I can push that rebase over here if you like.

@nikita-dubrovskii
Contributor Author

Once #4486 merges (if it gets approved) I can push that rebase over here if you like.

Yes, please.

@dustymabe
Member

I did some more looking at this today. The overall strategy, not just the first commit. I'm still forming opinions and not really ready to share anything yet.

'metal.go' is only used by testiso and contains code and data structures
similar to those provided by 'machines/qemu/*.go'.

These checks are already performed by harness.go:CheckConsole.

… live-login) tests

The iso.* tests use systemd units that report success or
failure via virtio channels and always power off the machine.
Because of this, we do not need to SSH into the instance.