Merged
27 commits
9161d6c
Add all_atb_files.tsv
samnooij Dec 5, 2025
07ec3eb
Update roadmap
samnooij Dec 5, 2025
25e19b9
Remove outdated CCTyper parsing
samnooij Dec 12, 2025
6c8156c
Reformat rule all
samnooij Dec 12, 2025
ef9e81c
Apply personal Bash style
samnooij Dec 12, 2025
5e23fbe
Adapt Snakemake recommended data structure
samnooij Dec 12, 2025
6282375
Apply standardised formatting (snakefmt/styler)
samnooij Dec 12, 2025
bcd6765
Move long commands to scripts
samnooij Dec 12, 2025
b237420
Several small code corrections
samnooij Dec 12, 2025
5156adb
Apply strict mode
samnooij Dec 12, 2025
aa108f4
Use file paths as recommended
samnooij Dec 16, 2025
db6bfd3
Remove spaces at end of lines
samnooij Dec 16, 2025
0a15dc8
Fix typo
samnooij Dec 17, 2025
e0664af
Correct subpath syntax
samnooij Dec 17, 2025
c6895ee
Fix typo
samnooij Dec 17, 2025
d27af01
Bring back rule that is still required
samnooij Dec 17, 2025
8c92864
Move downloading SpacePHARER databases to workflow
samnooij Dec 17, 2025
5f52529
Move directory variable to params
samnooij Dec 17, 2025
e06070c
Add missing conda YAML file
samnooij Dec 17, 2025
b4888fd
Add conda env in all rules
samnooij Dec 17, 2025
10cf36e
Move ATB to resources directory
samnooij Dec 17, 2025
1668c60
Fix typo for subdirectories CCTyper output
samnooij Dec 17, 2025
f61f6e0
Add comments
samnooij Jan 7, 2026
f804786
Set log and benchmark files for merge_cctyper_identify
samnooij Jan 7, 2026
901e83e
Fix #24: update cluster table parsing script
samnooij Jan 8, 2026
5f1d732
Adopt Snakemake's recommended structure
samnooij Jan 13, 2026
12c64cf
Revise automated database setup for SpacePHARER
samnooij Jan 20, 2026
80 changes: 35 additions & 45 deletions README.md
@@ -26,51 +26,41 @@ which includes a quick start guide as well as a detailed step-by-step descriptio

## Release roadmap

- Version 0.1 updates:

  - Updates in documentation of implemented functions and output files
    (e.g., expanding documentation on the use of CRISPRidentify)

  - Functions to concatenate and combine existing outputs

  - Code clean-up (moving long commands to separate scripts,
    applying standardised formatting, remove unnecessary code)

- Versions 0.2 and further:

  - New functionality (e.g., all-vs-all genome comparisons)

## To do

- [x] Add CRISPRidentify to workflow

- [x] And make sure it works on a clean install

- [x] Combine results from CCTyper with CRISPRidentify

- [ ] Make and/or correct scripts for combining results into 'Output files' (write to `data/processed/`)

- [x] Concatenate MLST results

- [x] Enable spacer table creation script in Snakefile (add to `rule all`)

- [x] Collect and combine results from geNomad and Jaeger

- [x] Map spacers to genomes and phage/plasmid databases

- [x] Add PADLOC for identifying other anti-phage systems

- [ ] Write documentation for output files

- [x] Rewrite 'Problems encountered' into a rationale for our tool selection (as separate document)

- [x] Write detailed and technical step-by-step description of the workflow

- [ ] While reviewing the workflow, remove unnecessary pieces and clean-up where possible

- [x] Setup MkDocs-powered documentation (at least locally, integrate with GitHub pages later)

(_Note to self: Remove this list when finished and use issues or roadmap instead!_)
- Version 0.2: tidy code
  - remove outdated steps
  - move long commands in Snakefile to separate script
  - (re)apply linting (Black, Styler)
  - apply 'bash strict mode' and suppress R messages
  - move parts of Snakefile to separate scripts?

- Version 0.3: solid foundation
  - validate proper functioning of CCTyper + CRISPRidentify
  - adjust helper scripts where necessary
  - correct scripts for making tables, integrate with Snakemake and test!
    - CRISPR spacer table (#24)
    - CRISPR-Cas locus table
  - (make sure every analysis part produces an output: include in 'rule all')

- Version 0.4: clear documentation
  - review and update README and docs

- Future additions:
  - genome deduplication (dRep)
  - CRISPR spacer target prediction
    - map to
      - masked ATB genomes
      - PLSDB
      - PhageScope
      - VIRE
      - MEGAISurv metagenomes
    - mini-benchmark different mapping algorithms
      - Sassy
      - KMA
      - SpacePHARER
    - (where feasible) connect spacer hits with functional annotations!
  - Integrate downstream analyses with Snakemake?
    - run RMarkdown/Quarto notebooks automatically
  - build a database like [this spacerdb](https://spacers.jgi.doe.gov/database/overview/)?

## Workflow description
