Merge main into docker #20

Merged
andrewscouten merged 99 commits into docker from main
Mar 15, 2026

@andrewscouten
Collaborator

No description provided.

AMD and AMD + WSL
PDC CLI is for CPTAC data
So the user can pick one based on required profile
- Combines common docker configs
- Faster building on WSL
- Corrects dev-nvidia in the devcontainer
…ownloader

Add cpu to dev containers & add toml extension
…loader

[Feature] Data downloading scripts
- Added YAML error handling in XenaCohortBuilder to raise ValueError for invalid configurations.
- Filtered empty cohort names in download script to prevent processing errors.
- Initialized _full_dataset in ImageDataModule and ClinicalDataModule to improve data handling.
- Updated PillowLoader to provide more informative error messages for image loading failures.
- Improved dataset validation in MultimodalDataModule to ensure only valid labels are processed.
- Enhanced encoder classes to conditionally freeze models based on configuration settings.
- Enhanced OncoTrainer to support hyperparameter optimization (HPO) using Optuna, including a new method to run HPO and apply best parameters to the training configuration.
- Updated L1 regularization calculation in BaseOncoClassifier to only include parameters that require gradients.
- Changed the registration string for GatedLateFusionConfig to include the full module path.
- Adjusted gradient clipping value handling in OncoTrainer to ensure it is only applied when greater than zero.
- Updated dependency management in `uv.lock` to include new packages: alembic, colorlog, greenlet, mako, optuna, and sqlalchemy, along with their respective versions and dependencies.
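Two of the smaller guards in the list above (dropping empty cohort names, and applying gradient clipping only for values greater than zero) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's code: `filter_cohort_names` and `resolve_gradient_clip` are hypothetical names, and the convention that `None` means "do not clip" is assumed rather than taken from `OncoTrainer`.

```python
from typing import Iterable, List, Optional


def filter_cohort_names(names: Iterable[str]) -> List[str]:
    """Drop empty or whitespace-only cohort names (hypothetical helper).

    Surviving names are returned with surrounding whitespace stripped.
    """
    return [name.strip() for name in names if name and name.strip()]


def resolve_gradient_clip(value: Optional[float]) -> Optional[float]:
    """Return the clip value only when it is greater than zero, else None.

    Assumption: the caller treats None as "skip gradient clipping".
    """
    if value is not None and value > 0:
        return value
    return None
```

Returning `None` for zero or negative values keeps the "should we clip?" decision in one place, so the training loop only has to check whether a value is present.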
…ic assay data; enhance API client with retry logic
…zer and loss parameter handling in YAML and code
- Introduced unit tests for the pipeline executor in `test_pipeline_executor.py`, covering various scenarios including loading data, joining datasets, and handling errors.
- Added unit tests for pipeline nodes in `test_pipeline_nodes.py`, validating default behaviors and configurations for `DataSource`, `Load`, `Join`, `Sequence`, and modality classes.
- Refactored image and multimodal data modules to improve structure and consistency in `test_image_e2e.py`, `test_multimodal_e2e.py`, and `test_tabular_e2e.py`.
- Updated configuration tests in `test_config.py` to reflect changes in the pipeline-based schema and removed deprecated modality tests.
- Consolidated data module tests in `test_datamodules.py` to focus on the new `ImageDataModule` and removed legacy tests for `GeneDataModule` and `ClinicalDataModule`.
- Enhanced the dataset registry tests in `test_registry.py` to cover dataset registration and retrieval.
…e labels; refactor data modules and add Log2Normalization support
- Created train and test split files for fold 0 to fold 4 in the PAM50 and stage datasets.
- Implemented logging functionality to capture the KFold generation process, including patient counts and splits.
- Updated Docker Compose configuration to mount the configs directory for easier access within containers.
- Enhanced the kfold.py script to log output to a file while also displaying it in the console.
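Logging to a file while still printing to the console, as described for `kfold.py` above, is commonly done by attaching two handlers to one logger. The sketch below uses only the standard `logging` module; `build_tee_logger`, the logger name, and the message format are assumptions for illustration and may differ from what the script actually does.

```python
import logging
import sys
from pathlib import Path


def build_tee_logger(log_path: Path, name: str = "kfold") -> logging.Logger:
    """Return a logger that writes each record to stdout and to log_path.

    Hypothetical helper: the name and format are illustrative only.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger

    # Remove handlers from earlier calls so output is not repeated.
    for old_handler in list(logger.handlers):
        logger.removeHandler(old_handler)
        old_handler.close()

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    handlers = (
        logging.StreamHandler(sys.stdout),                # console output
        logging.FileHandler(log_path, encoding="utf-8"),  # persistent log file
    )
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

A call such as `build_tee_logger(Path("kfold.log")).info("fold %d done", 0)` would then emit the same line in both places.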
…iner

Add linting arguments to dev containers
@andrewscouten merged commit 1f1067b into docker on Mar 15, 2026
1 check failed

3 participants