DREXPA (DRug EXperimental PAnel) transforms experimental drug screening datasets into in silico drug panels and perturbations that can be tested in Boolean models using the DrugLogics & Trafikk pipelines (drabme/Bless modules). It automates drug name resolution, target retrieval, and node mapping, and creates drug panels and perturbation files compatible with in silico validation workflows.
For first-time users: Start with QUICKSTART.md for minimal runnable examples.
For project structure & internals: See PROJECT_STRUCTURE.md.
For maintenance & future work: See TODO.md.
DREXPA orchestrates a multi-step drug screening pipeline:
Drug Names (TXT)
↓ [chembl_ids]
ChEMBL IDs
↓ [targets]
Drug Targets (from internal DB)
↓ [node_targets] + Node Dict (CSV)
Logical Model Nodes
↓ [profiles]
Drug Profiles (with Pipeline IDs)
├→ [panel] → Drug Panel (DrugLogics format)
└→ [combinations] + Synergy Data (CSV, optional)
↓ [perturbations]
Perturbation Files (per tissue/cell line)
↓ [synergies]
Synergy Summaries
DREXPA automatically adapts to three execution modes:
- No Synergy Data: Generates drug profiles and panel; skips doses/combinations/synergies.
- With Concentration Data: Full pipeline including dose-based target extraction and combinations.
- Without Concentration Data: Profiles and panel from single-dose targets; combinations from profile mapping.
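The three modes above can be pictured as a check on which columns the synergy data provides. The following is a minimal sketch, not DREXPA's actual implementation: the function `detect_mode` and the mode labels are hypothetical, and the concentration column names `conc_A` and `conc_B` are assumed to match the default `columns` config.

```python
from typing import Iterable, Optional


def detect_mode(synergy_columns: Optional[Iterable[str]],
                conc_columns: tuple = ("conc_A", "conc_B")) -> str:
    """Return a label for the execution mode (hypothetical helper)."""
    if synergy_columns is None:
        # No synergy file: only profiles and the panel can be built.
        return "no_synergy_data"
    present = set(synergy_columns)
    if all(col in present for col in conc_columns):
        # Both concentration columns exist: dose-based processing is possible.
        return "with_concentrations"
    return "without_concentrations"


print(detect_mode(None))
print(detect_mode(["drug_name_A", "drug_name_B", "conc_A", "conc_B", "cell_line", "synergy"]))
print(detect_mode(["drug_name_A", "drug_name_B", "cell_line", "synergy"]))
```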
# Install the released package
pip install drexpa

# Or install from source for development
git clone https://github.com/druglogics/drexpa.git
cd drexpa
pip install -e ".[dev]"
pytest # Validate installation

1. Prepare input files:
data/
├── drug_names.txt # One drug per line
├── node_dict.csv # Gene → Node mapping
└── synergy_data.csv # Optional: drug combinations + effects
2. Create config file (my_config.json):
{
"global": {"output_dir": "results", "base_data_dir": "data"},
"paths": {
"drug_names_file": "drug_names.txt",
"node_dict_file": "node_dict.csv",
"synergy_data_file": "synergy_data.csv"
},
"options": {"double_drug_screen": true}
}

3. Run pipeline:
# Full pipeline
drexpa --config my_config.json
# Generate profiles only
drexpa --config my_config.json --until profiles
# Profiles + panel only
drexpa --config my_config.json --steps profiles,panel
# Verbose mode (structured logging + timing)
drexpa --config my_config.json --verbose

For a detailed walkthrough, see QUICKSTART.md.
drexpa --help

| Option | Description | Example |
|---|---|---|
| --config PATH | Custom config JSON file | --config my_config.json |
| --synergy-data PATH | Override synergy data file | --synergy-data screen.csv |
| --output-dir DIR | Override output directory | --output-dir ./results |
| --until STEP | Run until step (inclusive) | --until panel |
| --steps STEPS | Specific steps (comma-separated) | --steps profiles,panel |
| --verbose | Enable structured logging + timing | --verbose |
| --version | Show version | |
| --help | Show full help | |
1. load_data – Load synergy data
2. chembl_ids – Resolve ChEMBL IDs (requires ChEMBL network access)
3. doses – Extract & process drug doses (requires concentration columns)
4. targets – Query internal drug-target database
5. node_targets – Map targets to logical model nodes
6. profiles – Generate unique drug profiles with Pipeline IDs
7. combinations – Prepare drug combinations (requires synergy data)
8. panel – Create drug panel (DrugLogics format)
9. perturbations – Generate perturbation files per condition
10. synergies – Process observed synergies
Dependencies are resolved automatically: --until panel runs steps 1–8 with all prerequisites.
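As an illustration of what `--until` implies, the sketch below selects every step up to and including the requested one. It is a simplification under stated assumptions: the step order is copied from this README, `steps_until` is a hypothetical helper, and the real resolver may follow a richer dependency graph or skip steps whose inputs are missing (for example `doses` without concentration columns).

```python
# Step order as listed in this README; the real dependency graph may be richer.
STEP_ORDER = [
    "load_data", "chembl_ids", "doses", "targets", "node_targets",
    "profiles", "combinations", "panel", "perturbations", "synergies",
]


def steps_until(step: str) -> list:
    """Return every step up to and including `step` (hypothetical helper)."""
    if step not in STEP_ORDER:
        raise ValueError(f"Unknown step: {step!r}")
    return STEP_ORDER[: STEP_ORDER.index(step) + 1]


selected = steps_until("panel")
print(len(selected), selected[-1])
```

With this ordering, `panel` is the eighth step, which matches the "steps 1–8" statement above.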
For step execution flow & dependency resolution: See PROJECT_STRUCTURE.md § Step Definitions.
Use DREXPA programmatically:
from drexpa import run_pipeline
config = {
"global": {"output_dir": "results"},
"paths": {
"drug_names_file": "data/drugs.txt",
"node_dict_file": "data/nodes.csv",
}
}
# Full pipeline
run_pipeline(config_dict=config)
# Specific steps
run_pipeline(
config_dict=config,
synergy_data_file="data/synergy.csv",
steps_to_run=["profiles", "panel"]
)

For orchestration details & entry points: See PROJECT_STRUCTURE.md § Entry Points.
Default config structure (see drexpa.config.get_default_config()):
{
"global": {
"output_dir": "output",
"verbose": false,
"save": true,
"base_data_dir": "data"
},
"paths": {
"drug_names_file": "drug_names.txt",
"synergy_data_file": "synergy_data.csv",
"node_dict_file": "node_dict.csv",
"tissue_cline_file": "tissue_cline.csv",
"db_file": null,
"manual_chembl_csv": "manual_chembl.csv"
},
"columns": {
"drug_name": "drug_name",
"drug_name_A": "drug_name_A",
"drug_name_B": "drug_name_B",
"conc_A": "conc_A",
"conc_B": "conc_B",
"cell_line": "cell_line",
"synergy": "synergy"
},
"options": {
"synergy_threshold": 0.0,
"double_drug_screen": true,
"original_target_merge": "fill_missing"
}
}

Key points:
- db_file: null → Uses the internal package database (managed by the project, not the user).
- base_data_dir – Base path; all relative paths are resolved against it.
- columns – Customize column names if your data uses different headers.
- Deep-merge override: Custom config merges recursively with defaults; only the sections you specify are overridden.
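To make the `base_data_dir` rule concrete, here is a minimal sketch of how a relative entry from `paths` could be resolved. The helper `resolve_data_path` is hypothetical, and leaving absolute paths unchanged is an assumption rather than documented DREXPA behaviour.

```python
from pathlib import Path


def resolve_data_path(base_data_dir: str, file_path: str) -> Path:
    """Join a relative path onto base_data_dir (hypothetical helper)."""
    path = Path(file_path)
    # Assumption: absolute paths are used unchanged.
    return path if path.is_absolute() else Path(base_data_dir) / path


print(resolve_data_path("data", "drug_names.txt"))
```

Under this rule, a `paths` entry should be written relative to `base_data_dir` rather than repeating the directory name.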
For configuration flow & per-step config builders: See PROJECT_STRUCTURE.md § Configuration Flow.
By default, outputs are written to output_dir:
| File | Step | Content |
|---|---|---|
| drug_ChEMBL_IDs.csv | chembl_ids | Drug name → ChEMBL ID mapping |
| drug_ChEMBL_doses.csv | doses | Drugs with IC50 concentrations |
| drug_ChEMBL_targets.csv | targets | Drug → Targets from database |
| drug_node_targets.csv | node_targets | Drug → Logical model nodes |
| drug_profiles.csv | profiles | Drug profiles with Pipeline IDs |
| drug_panel_df.csv | panel | Drug panel (DrugLogics format) |
| drugpanel | panel | Formatted drug panel file |
| <TISSUE>/ | perturbations | Per-tissue perturbation files |
Error: FileNotFoundError: Preflight validation failed. Missing required files
Fix: Check file paths in config. Ensure base_data_dir is correct.
Error: ValueError: Missing required columns in synergy data
Fix: Verify columns section in config matches your data headers. Use --verbose to see exact missing columns.
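A quick way to diagnose this error before running the pipeline is to compare your file's header with the names in the `columns` section. The sketch below is illustrative only: `find_missing_columns` is a hypothetical helper, and which keys DREXPA actually treats as required is an assumption.

```python
def find_missing_columns(header, columns_config, required_keys):
    """List configured column names absent from the data header (hypothetical helper)."""
    present = set(header)
    return [columns_config[key] for key in required_keys
            if columns_config[key] not in present]


columns_config = {
    "drug_name_A": "drug_name_A",
    "drug_name_B": "drug_name_B",
    "cell_line": "cell_line",
    "synergy": "synergy",
}
# Example header where the synergy values live in a column called "score".
header = ["drug_name_A", "drug_name_B", "cell_line", "score"]
missing = find_missing_columns(header, columns_config, list(columns_config))
print(missing)
```

In this example the mismatch would be resolved by setting "synergy": "score" in the `columns` section of the config.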
Error: Network timeout or no results for drug name
Fix:
- Check drug name spelling (must match ChEMBL exactly or be unambiguous).
- Provide manual ChEMBL mapping in manual_chembl.csv to skip network queries.
- Run with --verbose to see which drugs failed.
Error: FileNotFoundError: Internal drug-target interaction database not found
Fix: This indicates a broken DREXPA installation. Reinstall: pip install --force-reinstall drexpa (a plain --upgrade does nothing when the latest version is already installed).
Provide a CSV to override ChEMBL resolution (skip network queries):
drug_name,ChEMBL_ID
Aspirin,CHEMBL25
Ibuprofen,CHEMBL521

Configure in paths.manual_chembl_csv.
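For reference, a file in this format can be read into a name-to-ID lookup with the standard library. This is a sketch of the idea, not DREXPA's loader: `load_manual_mapping` is a hypothetical helper, and the in-memory sample stands in for a real file.

```python
import csv
import io

# Sample content matching the example above; a real run would open the file instead.
manual_csv = io.StringIO("drug_name,ChEMBL_ID\nAspirin,CHEMBL25\nIbuprofen,CHEMBL521\n")


def load_manual_mapping(handle) -> dict:
    """Read drug_name -> ChEMBL_ID pairs (hypothetical helper)."""
    reader = csv.DictReader(handle)
    return {row["drug_name"].strip(): row["ChEMBL_ID"].strip() for row in reader}


manual_mapping = load_manual_mapping(manual_csv)
print(manual_mapping.get("Aspirin"))
```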
Configs deep-merge with defaults, so you only specify changes:
{
"global": {"output_dir": "custom_results"},
"paths": {"drug_names_file": "my_drugs.txt"}
}

Unspecified keys inherit defaults; all section keys must remain intact.
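The deep-merge behaviour described above can be sketched as a recursive dictionary merge. This is an illustration only: `deep_merge` is a hypothetical name rather than DREXPA's own function, and the defaults shown are a shortened subset of the default config.

```python
import copy


def deep_merge(defaults: dict, overrides: dict) -> dict:
    """Recursively overlay `overrides` on a copy of `defaults` (hypothetical helper)."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are sections: merge key by key instead of replacing.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


defaults = {
    "global": {"output_dir": "output", "verbose": False},
    "paths": {"drug_names_file": "drug_names.txt", "node_dict_file": "node_dict.csv"},
}
custom = {
    "global": {"output_dir": "custom_results"},
    "paths": {"drug_names_file": "my_drugs.txt"},
}
merged = deep_merge(defaults, custom)
print(merged["global"], merged["paths"])
```

Keys that the custom config does not mention, such as `node_dict_file` here, keep their default values, and the defaults dictionary itself is left unmodified.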
Enable structured logs + per-step timing:
drexpa --config my_config.json --verbose

Output includes:
- Step start/end timestamps
- Per-step duration (seconds)
- Preflight warnings & validation details
- Pipeline summary
Plain text, one drug per line:
Aspirin
Ibuprofen
Paracetamol
Gene/protein symbols mapped to logical model node names:
gene,node
EGFR,EGFR_node
TP53,p53
BRAF,BRAF_node

With concentration data (dual-drug screening):
drug_name_A,drug_name_B,conc_A,conc_B,tissue,cell_line,synergy
Aspirin,Ibuprofen,1.0,2.0,Breast,MCF7,0.15

Without concentration data (single-dose combinations):
drug_name_A,drug_name_B,tissue,cell_line,synergy
Aspirin,Ibuprofen,Breast,MCF7,0.15

Tissue and cell line mapping (tissue_cline.csv):
tissue,cell_line
Breast,MCF7
Breast,T47D
Colorectal,HCT116

The drug-target interaction database is shipped with the package. To update:
- Replace drexpa/resources/DrugTargetInteractionDB.db with the new database file.
- Update the version in pyproject.toml.
- Add a changelog entry.
- Rebuild and release: pip install build twine; python -m build; twine upload dist/*
pytest tests/ -q # Quick run
pytest tests/ --cov # Coverage report

See PROJECT_STRUCTURE.md for module architecture and TODO.md for planned improvements.
MIT License. See LICENSE file for details.
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Run pytest to validate
- Submit a pull request
If DREXPA is helpful in your research, please cite:
@software{drexpa2024,
title = {DREXPA: Drug Experimental Panel Generator},
author = {Bermudez Paiva, Viviam},
year = {2024},
url = {https://github.com/druglogics/drexpa}
}

For issues, feature requests, or questions: GitHub Issues
- QUICKSTART.md – Two minimal runnable examples (with & without concentrations)
- PROJECT_STRUCTURE.md – Module architecture, entry points, legacy code
- TODO.md – Roadmap (P2 architecture improvements, caching, extensibility)