
ITS: staggering [DO NOT MERGE]#15188

Open
f3sch wants to merge 56 commits into AliceO2Group:dev from f3sch:its/trk/stag

@f3sch
Collaborator

@f3sch f3sch commented Mar 18, 2026

This is to run the CI and possibly to run tests at P2.

f3sch and others added 25 commits March 17, 2026 17:37
Signed-off-by: Felix Schlepper <felix.schlepper@cern.ch>
Adapt ITS/MFT CTF machinery to staggered data
Fix compilation of ALICE3 tracking with staggering
@github-actions
Contributor

REQUEST FOR PRODUCTION RELEASES:
To request your PR to be included in production software, please add the corresponding labels called "async-" to your PR. Add the labels directly (if you have the permissions) or add a comment of the form (note that labels are separated by a ",")

+async-label <label1>, <label2>, !<label3> ...

This will add <label1> and <label2> and remove <label3>.

The following labels are available
async-2023-pbpb-apass4
async-2023-pp-apass4
async-2024-pp-apass1
async-2022-pp-apass7
async-2024-pp-cpass0
async-2024-PbPb-apass1
async-2024-ppRef-apass1
async-2024-PbPb-apass2
async-2023-PbPb-apass5

Signed-off-by: Felix Schlepper <felix.schlepper@cern.ch>
@f3sch f3sch marked this pull request as ready for review March 18, 2026 11:26
@f3sch f3sch requested review from sawenzel and shahor02 as code owners March 18, 2026 11:26
f3sch added 5 commits March 19, 2026 13:18
Signed-off-by: Felix Schlepper <felix.schlepper@cern.ch>
@f3sch
Collaborator Author

f3sch commented Mar 19, 2026

Tested workflow option for staggering with:

#!/usr/bin/env bash
set -euo pipefail

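# DPL options shared by every workflow in the pipeline.
GLOSET=" -b --shm-segment-size 12000000000 --timeframes-rate-limit 6 --timeframes-rate-limit-ipcid $RANDOM  "

# Staggered run: per-layer readout-frame lengths (99 BC for layers 0-2, 198 BC for layers 3-6) with zero per-layer delays.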
ITSAPT=";ITSAlpideParam.roFrameLayerDelayInBC[0]=0;ITSAlpideParam.roFrameLayerDelayInBC[1]=0;ITSAlpideParam.roFrameLayerDelayInBC[2]=0;ITSAlpideParam.roFrameLayerDelayInBC[3]=0;ITSAlpideParam.roFrameLayerDelayInBC[4]=0;ITSAlpideParam.roFrameLayerDelayInBC[5]=0;ITSAlpideParam.roFrameLayerDelayInBC[6]=0;ITSAlpideParam.roFrameLayerLengthInBC[0]=99;ITSAlpideParam.roFrameLayerLengthInBC[1]=99;ITSAlpideParam.roFrameLayerLengthInBC[2]=99;ITSAlpideParam.roFrameLayerLengthInBC[3]=198;ITSAlpideParam.roFrameLayerLengthInBC[4]=198;ITSAlpideParam.roFrameLayerLengthInBC[5]=198;ITSAlpideParam.roFrameLayerLengthInBC[6]=198;"
ITSCLS=";ITSClustererParam.maxBCDiffToMaskBias=-10;ITSClustererParam.maxBCDiffToSquashBiasLayer[0]=100;ITSClustererParam.maxBCDiffToSquashBiasLayer[1]=100;ITSClustererParam.maxBCDiffToSquashBiasLayer[2]=100;ITSClustererParam.maxBCDiffToSquashBiasLayer[3]=10;ITSClustererParam.maxBCDiffToSquashBiasLayer[4]=10;ITSClustererParam.maxBCDiffToSquashBiasLayer[5]=10;ITSClustererParam.maxBCDiffToSquashBiasLayer[6]=10;"
ITSTRK=";ITSVertexerParam.phiCut=0.5;ITSVertexerParam.clusterContributorsCut=3;ITSVertexerParam.tanLambdaCut=0.2;ITSVertexerParam.nThreads=3;ITSCATrackerParam.nThreads=3;;;ITSCATrackerParam.sysErrY2[0]=100e-8;ITSCATrackerParam.sysErrZ2[0]=100e-8;ITSCATrackerParam.sysErrY2[1]=100e-8;ITSCATrackerParam.sysErrZ2[1]=100e-8;ITSCATrackerParam.sysErrY2[2]=100e-8;ITSCATrackerParam.sysErrZ2[2]=100e-8;ITSCATrackerParam.sysErrY2[3]=100e-8;ITSCATrackerParam.sysErrZ2[3]=100e-8;ITSCATrackerParam.sysErrY2[4]=100e-8;ITSCATrackerParam.sysErrZ2[4]=100e-8;ITSCATrackerParam.sysErrY2[5]=100e-8;ITSCATrackerParam.sysErrZ2[5]=100e-8;ITSCATrackerParam.sysErrY2[6]=100e-8;ITSCATrackerParam.sysErrZ2[6]=100e-8;;;ITSClustererParam.maxBCDiffToMaskBias=-10;ITSClustererParam.maxBCDiffToSquashBias=10;ITSVertexerParam.phiCut=0.5;ITSVertexerParam.clusterContributorsCut=3;ITSVertexerParam.tanLambdaCut=0.2;;ITSCATrackerParam.startLayerMask[0]=127;ITSCATrackerParam.startLayerMask[1]=127;ITSCATrackerParam.startLayerMask[2]=127;;ITSCATrackerParam.minPtIterLgt[0]=0.05;ITSCATrackerParam.minPtIterLgt[1]=0.05;ITSCATrackerParam.minPtIterLgt[2]=0.05;ITSCATrackerParam.minPtIterLgt[3]=0.05;ITSCATrackerParam.minPtIterLgt[4]=0.05;ITSCATrackerParam.minPtIterLgt[5]=0.05;ITSCATrackerParam.minPtIterLgt[6]=0.05;ITSCATrackerParam.minPtIterLgt[7]=0.05;ITSCATrackerParam.minPtIterLgt[8]=0.05;ITSCATrackerParam.minPtIterLgt[9]=0.09;ITSCATrackerParam.minPtIterLgt[10]=0.167;ITSCATrackerParam.minPtIterLgt[11]=0.125;ITSCATrackerParam.trackingMode=1;"
o2-raw-tf-reader-workflow --onlyDet ITS --input-data o2_rawtf_run00569789_tf00105629_epn121.tf $GLOSET |
        o2-itsmft-stf-decoder-workflow $GLOSET --configKeyValues=";$ITSAPT;$ITSCLS" --decoder-verbosity 0 --enable-its-staggering |
        o2-its-reco-workflow --disable-mc --clusters-from-upstream --configKeyValues=";$ITSAPT;$ITSCLS;$ITSTRK;" $GLOSET --enable-its-staggering

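# Reference run without staggering: one readout-frame length for all layers; $ITSTRK is reused from the staggered run above.
ITSAPT=";ITSAlpideParam.roFrameLengthInBC=198;"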
ITSCLS=";ITSClustererParam.maxBCDiffToMaskBias=-10;ITSClustererParam.maxBCDiffToSquashBias=10;"
o2-raw-tf-reader-workflow --onlyDet ITS --input-data o2_rawtf_run00562272_tf01594929_epn117.tf $GLOSET |
        o2-itsmft-stf-decoder-workflow $GLOSET --configKeyValues=";$ITSAPT;$ITSCLS" --decoder-verbosity 0 |
        o2-its-reco-workflow --disable-mc --clusters-from-upstream --configKeyValues=";$ITSAPT;$ITSCLS;$ITSTRK;" $GLOSET

f3sch added 3 commits March 19, 2026 13:54
Fixes in adding staggering options + propagate to dpl-workflow.sh
Signed-off-by: Felix Schlepper <felix.schlepper@cern.ch>
@alibuild
Collaborator

alibuild commented Mar 19, 2026

Error while checking build/O2/fullCI_slc9 for 6a27708 at 2026-03-20 09:28:

## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
++ grep -v clang-diagnostic-error error-log.txt
++ grep ' error:'
grep: error-log.txt: binary file matches
/sw/SOURCES/O2/15188-slc9_x86-64/0/Common/DCAFitter/GPU/cuda/GPUInterface.cu:55:15: error: use '= default' to define a trivial destructor [modernize-use-equals-default]
++ [[ 0 == 0 ]]
++ exit 1
--

Full log here.

shahor02 and others added 8 commits March 20, 2026 14:26
To activate ITS or MFT staggering in the topology generation, export ITSSTAGGERED=1
or MFTSTAGGERED=1, respectively.
Signed-off-by: Felix Schlepper <felix.schlepper@cern.ch>
Needed to pass the sim-challenge test: without this option, even <workflow> -h crashes.
Strictly speaking, DPLAlpideParamInitializer::isITSStaggeringEnabled
and DPLAlpideParamInitializer::isMFTStaggeringEnabled could first check
ic.options().hasOption(stagITSOpt) and ic.options().hasOption(stagMFTOpt) before reading
the option itself, but it is better to detect a missing staggering option explicitly.
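The trade-off in this commit message can be illustrated with a small stand-alone sketch. `MockOptions` is a hypothetical stand-in for a workflow's option registry (it is not the real type behind `ic.options()`), the two free functions are illustrative names rather than code from this PR, and the option name is borrowed from the `--enable-its-staggering` flag used in the test script. The lenient variant quietly treats an undeclared option as "not staggered", while the strict variant reports the missing declaration.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an option registry; NOT the real DPL API.
struct MockOptions {
  std::map<std::string, bool> values;

  // True if the workflow declared an option with this name.
  bool hasOption(const std::string& name) const
  {
    assert(!name.empty() && "option name must not be empty");
    return values.count(name) != 0;
  }

  // Reading an undeclared option is an error, mimicking a hard failure.
  bool get(const std::string& name) const
  {
    auto it = values.find(name);
    if (it == values.end()) {
      throw std::out_of_range("option not declared: " + name);
    }
    return it->second;
  }
};

// Lenient variant: an undeclared option silently means "staggering off".
inline bool isStaggeringEnabledLenient(const MockOptions& opts, const std::string& name)
{
  return opts.hasOption(name) && opts.get(name);
}

// Strict variant: an undeclared option is reported instead of being hidden.
inline bool isStaggeringEnabledStrict(const MockOptions& opts, const std::string& name)
{
  if (!opts.hasOption(name)) {
    throw std::runtime_error("workflow does not declare option '" + name + "'");
  }
  return opts.get(name);
}
```

In this sketch the lenient helper would let a workflow that forgot to declare the option run as non-staggered without any warning, which is the situation the commit message prefers to surface explicitly.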
Make ITS vertex messageable + other fixes
@f3sch f3sch requested review from bazinski and wille10 as code owners March 20, 2026 13:58
Signed-off-by: Felix Schlepper <felix.schlepper@cern.ch>
@alibuild
Collaborator

alibuild commented Mar 20, 2026

Error while checking build/O2/fullCI_slc9 for 1bcddd0 at 2026-03-24 11:25:

## sw/BUILD/O2-full-system-test-latest/log
command /sw/slc9_x86-64/O2/15188-slc9_x86-64-local2/prodtests/full-system-test/dpl-workflow.sh had nonzero exit code 1
[9944:BadMapCalibSpec]: [11:25:03][ERROR] Insufficient statistics: 1 entries in lowE histo, do nothing
[10344:qc-task-ITS-ITSTrackTask]: [11:25:45][ERROR] Exception while running: Inconsistent type and payload size at Vertices(DS/tracks1/0): type size 48  payload size 364. Rethrowing.
[10344:qc-task-ITS-ITSTrackTask]: [11:25:45][FATAL] Unhandled o2::framework::runtime_error reached the top of main of o2-qc, device shutting down. Reason: Inconsistent type and payload size at Vertices(DS/tracks1/0): type size 48  payload size 364
[ERROR] Workflow crashed - PID 10344 (qc-task-ITS-ITSTrackTask) did not exit correctly however it's not clear why. Exit code forced to 128.
[ERROR]  - Device qc-task-ITS-ITSTrackTask: pid 10344 (exit 128)
[INFO]    - First error: [11:25:45][FATAL] Unhandled o2::framework::runtime_error reached the top of main of o2-qc, device shutting down. Reason: Inconsistent type and payload size at Vertices(DS/tracks1/0): type size 48  payload size 364
[ERROR] SEVERE: Device qc-task-ITS-ITSTrackTask (10344) had at least one message above severity 7: Unhandled o2::framework::runtime_error reached the top of main of o2-qc, device shutting down. Reason: Inconsistent type and payload size at Vertices(DS/tracks1/0): type size 48  payload size 364


## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
++ grep -v clang-diagnostic-error error-log.txt
++ grep ' error:'
grep: error-log.txt: binary file matches
++ GRERR=1
++ [[ 1 == 0 ]]
++ mkdir -p /sw/INSTALLROOT/6706a09508860fd1d81447db33e8f7ca556e3c6a/slc9_x86-64/o2checkcode/1.0-local328/etc/modulefiles
++ cat
--

Full log here.

f3sch added 3 commits March 24, 2026 11:04
Remove leftover NROFs configurable from dpl-workflow.sh
Signed-off-by: Felix Schlepper <felix.schlepper@cern.ch>
Comparing the output of dev and this PR, I saw plenty of cases where
the system of equations was fully degenerate and, due to different
floating-point instructions and compiler optimizations, produced slightly
different results. The solution is to discard the vertex candidate if the LSE
becomes degenerate, so as not to produce nonsense solutions.
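A minimal sketch of the "discard if degenerate" idea, not the vertexer code from this PR: the 3x3 size, the use of Cramer's rule, the scaling by the product of row norms (Hadamard's bound on the determinant) and the `relTol` default are all assumptions chosen for illustration. When the determinant is negligible compared with that bound, any computed solution is dominated by rounding, so the candidate is rejected instead of being solved.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;

// Determinant of a 3x3 matrix by cofactor expansion along the first row.
inline double det3(const Mat3& m)
{
  return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1]) -
         m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0]) +
         m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
}

// Solves a * x = b with Cramer's rule.
// Returns false and leaves x untouched when the system is (numerically) degenerate.
// relTol is an illustrative threshold, not a value taken from the PR.
inline bool solveOrDiscard(const Mat3& a, const Vec3& b, Vec3& x, double relTol = 1e-9)
{
  assert(relTol > 0.0);
  // |det| can never exceed the product of the row norms, so this product is a
  // natural scale for deciding whether the determinant is "effectively zero".
  double scale = 1.0;
  for (const auto& row : a) {
    scale *= std::sqrt(row[0] * row[0] + row[1] * row[1] + row[2] * row[2]);
  }
  const double det = det3(a);
  // Negated comparison so that NaN and the all-zero matrix are rejected too.
  if (!(std::abs(det) > relTol * scale)) {
    return false;
  }
  Vec3 result{};
  for (std::size_t col = 0; col < 3; ++col) {
    Mat3 replaced = a;
    for (std::size_t row = 0; row < 3; ++row) {
      replaced[row][col] = b[row]; // swap column `col` for the right-hand side
    }
    result[col] = det3(replaced) / det;
  }
  x = result;
  return true;
}
```

A caller would simply skip the candidate when `solveOrDiscard` returns false, so that differences in instruction selection or optimization level cannot turn a degenerate system into slightly different, equally meaningless positions.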

Signed-off-by: Felix Schlepper <felix.schlepper@cern.ch>
@alibuild
Collaborator

Error while checking build/O2/fullCI_slc9 for f13ffb4 at 2026-03-24 18:10:

## sw/BUILD/o2codechecker-latest/log
100% tests passed, 0 tests failed out of 1


## sw/BUILD/O2-full-system-test-latest/log
command /sw/slc9_x86-64/O2/15188-slc9_x86-64-local2/prodtests/full-system-test/dpl-workflow.sh had nonzero exit code 1
[9952:BadMapCalibSpec]: [18:09:42][ERROR] Insufficient statistics: 1 entries in lowE histo, do nothing
[10371:qc-task-ITS-ITSTrackTask]: [18:10:13][ERROR] Exception while running: Inconsistent type and payload size at Vertices(DS/tracks1/0): type size 48  payload size 312. Rethrowing.
[10371:qc-task-ITS-ITSTrackTask]: [18:10:13][FATAL] Unhandled o2::framework::runtime_error reached the top of main of o2-qc, device shutting down. Reason: Inconsistent type and payload size at Vertices(DS/tracks1/0): type size 48  payload size 312
[ERROR] Workflow crashed - PID 10371 (qc-task-ITS-ITSTrackTask) did not exit correctly however it's not clear why. Exit code forced to 128.
[ERROR]  - Device qc-task-ITS-ITSTrackTask: pid 10371 (exit 128)
[INFO]    - First error: [18:10:13][FATAL] Unhandled o2::framework::runtime_error reached the top of main of o2-qc, device shutting down. Reason: Inconsistent type and payload size at Vertices(DS/tracks1/0): type size 48  payload size 312
[ERROR] SEVERE: Device qc-task-ITS-ITSTrackTask (10371) had at least one message above severity 7: Unhandled o2::framework::runtime_error reached the top of main of o2-qc, device shutting down. Reason: Inconsistent type and payload size at Vertices(DS/tracks1/0): type size 48  payload size 312


## sw/BUILD/o2checkcode-latest/log
--
========== List of errors found ==========
++ GRERR=0
++ grep -v clang-diagnostic-error error-log.txt
++ grep ' error:'
grep: error-log.txt: binary file matches
++ GRERR=1
++ [[ 1 == 0 ]]
++ mkdir -p /sw/INSTALLROOT/318a05c11fbfd7f26eacda467f1ad755bec6e8ab/slc9_x86-64/o2checkcode/1.0-local28/etc/modulefiles
++ cat
--

Full log here.
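The QC failure in the two full-system-test logs comes down to simple arithmetic: a buffer can only be viewed as an array of 48-byte objects if its size is a multiple of 48. The sketch below reproduces that kind of check; `elementCount` is a hypothetical helper written for illustration and is not the framework's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Returns how many whole objects of size typeSize fit in the payload,
// or throws if the payload size is not an exact multiple of typeSize.
inline std::size_t elementCount(std::size_t payloadSize, std::size_t typeSize)
{
  assert(typeSize > 0 && "type size must be non-zero");
  if (payloadSize % typeSize != 0) {
    throw std::runtime_error("Inconsistent type and payload size: type size " +
                             std::to_string(typeSize) + " payload size " +
                             std::to_string(payloadSize));
  }
  return payloadSize / typeSize;
}
```

With the sizes reported above, 364 = 7 × 48 + 28 and 312 = 6 × 48 + 24, so both payloads fail the check. That is consistent with the producer and the QC task disagreeing about how the vertices are laid out or serialized, although the logs alone do not establish the cause.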


4 participants