Update parameters to build the suffix-array so we can also use the mmap option #2
Open
Pull request overview
Updates the UniProt update automation to align with upcoming suffix-array builder changes (multiple output binaries for mmap support) and makes a few small reliability/housekeeping adjustments.
Changes:
- Update `sa-builder` invocation to emit `sa.bin`, `proteins.bin`, and `mapping.bin`.
- Fix script path invocations by removing an unintended leading `/` before `${SCRATCH_DIR}`.
- Adjust dependency checks and ignore common local output directories in git.
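
The path fix in the second item can be illustrated with a short sketch. The `SCRATCH_DIR` value and the script name `build.sh` are hypothetical and not taken from the actual script; only the `${SCRATCH_DIR:?}` form appears in the diff under review.

```shell
#!/usr/bin/env bash
# Hypothetical values for illustration only.
SCRATCH_DIR="/scratch/uniprot"

# Before: the extra leading slash doubles the root separator when
# SCRATCH_DIR is already an absolute path.
broken_path="/${SCRATCH_DIR}/build.sh"

# After: ${VAR:?} aborts with an error if SCRATCH_DIR is unset or empty,
# so the path can never silently collapse to "/build.sh".
fixed_path="${SCRATCH_DIR:?}/build.sh"

echo "before: ${broken_path}"
echo "after:  ${fixed_path}"
```

With the values above, the first line shows a path starting with `//` and the second a normal absolute path.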
Reviewed changes
Copilot reviewed 1 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `scripts/update-uniprot/update_uniprot.sh` | Updates suffix-array build command/outputs and tweaks repo cloning + dependency checks. |
| `.gitignore` | Ignores root-level data and home paths (likely local/generated). |
SimonVandeVyver left a comment
The script works with the mmap feature, but don't forget to change the branches in the dependencies.
    # Download new version of the index repo
    - git clone --quiet "https://github.com/unipept/unipept-index.git" "${SCRATCH_DIR:?}/unipept-index"
    + git clone -b feature/speedup-loading-index --quiet "https://github.com/unipept/unipept-index.git" "${SCRATCH_DIR:?}/unipept-index"
The script is still using the feature branch
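
One possible way to avoid editing the clone command before merging is to make the branch optional. The following is only a sketch: `INDEX_BRANCH` and `index_clone_args` are hypothetical names that do not exist in the script, and the function prints the arguments instead of running `git`.

```shell
#!/usr/bin/env bash
# INDEX_BRANCH is a hypothetical variable, not part of the actual script.
# Leave it empty to clone the repository's default branch.
INDEX_BRANCH="${INDEX_BRANCH:-}"

# Print the git arguments, one per line, for cloning into the given directory.
# The body runs in a subshell so the positional parameters stay local.
index_clone_args() (
    target_dir="$1"
    set -- clone --quiet
    if [ -n "${INDEX_BRANCH}" ]; then
        # Only add -b when a branch was explicitly requested.
        set -- "$@" -b "${INDEX_BRANCH}"
    fi
    set -- "$@" "https://github.com/unipept/unipept-index.git" "${target_dir}"
    printf '%s\n' "$@"
)

index_clone_args "/tmp/unipept-index"
```

During testing the feature branch could then be selected with `INDEX_BRANCH=feature/speedup-loading-index`, while an unset variable falls back to the default branch.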
Important
This PR can only be merged once the related PRs in the other repositories are merged:
- unipept/unipept-index#34
- unipept/unipept-api#89

Before merging, the `-b feature/speedup-loading-index` option has to be removed. Right now we clone a separate branch for testing purposes.

This pull request updates the `update_uniprot.sh` script to improve reliability and compatibility with new features. The changes include updating the command-line arguments for suffix array generation and making minor corrections to script execution.

Suffix array and index improvements:
- The suffix array build now uses `--output-sa`, `--output-proteins`, and `--output-mapping`, generating multiple binary files instead of just one.

Dependency management:
- The dependency check for the `build-essential` package has been commented out, possibly to avoid unnecessary installation steps or because it is no longer required.
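
As a rough sketch of how the three output arguments might be assembled: the flag names and file names come from this pull request, but pairing each flag with the like-named file is an assumption, `sa_output_args` is a hypothetical helper, and any other `sa-builder` arguments are deliberately left out.

```shell
#!/usr/bin/env bash
# Sketch only: the flag-to-file pairing below is an assumption based on
# the names in this pull request; other sa-builder arguments are omitted.
sa_output_args() {
    # Abort if no output directory is given.
    out_dir="${1:?output directory required}"
    printf '%s\n' \
        --output-sa "${out_dir}/sa.bin" \
        --output-proteins "${out_dir}/proteins.bin" \
        --output-mapping "${out_dir}/mapping.bin"
}

# Prints one argument per line for a hypothetical output directory.
sa_output_args "/tmp/index"
```

Keeping the three paths in one place makes it easier to see that the build now writes separate binaries rather than a single file.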