Add validation for include filters in CodeList#573
Add validation for include filters in CodeList#573dc-almeida wants to merge 20 commits intoIAMconsortium:mainfrom
Conversation
|
Added validation for mappings, as a necessary step when reading nomenclature.yaml (if mappings files include regions not present in external repos, raises). |
|
As an aside to this PR, my two cents on mocking external resources:
I'm always torn when it comes to mocking external resources/services/etc... On the one hand I agree that in order to speed up the tests, it's a good idea to mock things. On the other hand sometimes what you want is precisely an integration type test where you actually test the interaction of your piece of code with external resources. |
There was a problem hiding this comment.
Pull request overview
Adds validation to catch typos/mismatches in nomenclature.yaml include/exclude filters by ensuring include filters/patterns actually match existing codes/mappings, with accompanying regression tests.
Changes:
- Add validation that repository include filters in
CodeListmatch at least one code, raising anExceptionGroupotherwise. - Add validation that mapping include patterns match at least one model in the mappings repository.
- Add tests and new fixture config files to cover wrong nesting level, missing codes, missing hierarchies, and missing mappings.
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
nomenclature/codelist.py |
Validates include filters against loaded codes and updates filtering types/signature. |
nomenclature/config.py |
Adds config-level validation for mapping include patterns after fetching repos. |
nomenclature/processor/region.py |
Collects mapping parse errors from external repos instead of failing fast. |
tests/test_codelist.py |
Adds tests asserting missing-code/missing-hierarchy includes raise grouped errors. |
tests/test_config.py |
Adds tests for include/exclude at wrong YAML nesting level and missing mapping includes. |
tests/data/config/*.yaml |
New config fixtures for the new validation test cases. |
pyproject.toml / poetry.lock |
Updates ruff dev dependency and lockfile. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Agreed, the suggestion here would be to apply this for those tests that precisely are only importing external repos to access their data but where what matters is the data and not the connection. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 11 out of 12 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
danielhuppmann
left a comment
There was a problem hiding this comment.
I used this feature this morning when implementing iiasa/prisma-workflow#18, very useful! But the error messages could be more precise, see some suggestion inline. And it's not clear to me how models are validated.
| for exp in expected: | ||
| assert excinfo.group_contains( | ||
| ValueError, | ||
| match=r"No codes found for include filter: " + exp, |
There was a problem hiding this comment.
| match=r"No codes found for include filter: " + exp, | |
| match=r"No regions found for include-filter: " + exp, |
| ) | ||
| try: | ||
| with pytest.RaisesGroup( | ||
| ValueError, ValueError, match="Include filter validation failed" |
There was a problem hiding this comment.
| ValueError, ValueError, match="Include filter validation failed" | |
| ValueError, ValueError, match="Importing regions from external repository failed" |
There was a problem hiding this comment.
I think it will have to be a more generic "Importing definitions..." because other dimensions can also be used in the include filter and the method is applicable to any type of codelist. It still distinguishes from definitions vs. mappings and the name of a variable vs. a region will be self-evident.
|
|
||
| try: | ||
| with pytest.RaisesGroup( | ||
| ValueError, match="Include filter validation failed" |
There was a problem hiding this comment.
| ValueError, match="Include filter validation failed" | |
| ValueError, match="Importing mappings from external repository failed" |
| for exp in expected: | ||
| assert excinfo.group_contains( | ||
| ValueError, | ||
| match=r"No codes found for include filter: " + exp, |
There was a problem hiding this comment.
| match=r"No codes found for include filter: " + exp, | |
| match=r"No mapping found for include-filter: " + exp, |
| repo.fetch_repo(target_folder / repo_name) | ||
|
|
||
| def validate_mapping_includes(self) -> None: | ||
| """Validate that all mapping include patterns match at least one model.""" |
There was a problem hiding this comment.
This is not clear to me, where is the list of models coming from?
There was a problem hiding this comment.
The all_models variable, the list of models is compiled from reading all external repo mappings files. So it checks all the models listed on all imported mappings files to see if there are filters that don't apply to any, and if so, raises.
It's not the most effective way (when a RegionProcessor is instantiated, the mapping files are read again), but without some big refactoring it was the only way I saw of raising the error as soon as possible (when nomenclature.yaml is read) and where all the necessary info was gathered (the include filter and the repo mappings to match against).
Co-authored-by: Daniel Huppmann <dh@dergelbesalon.at>
Closes #551. Introduce validation for
includefilters used in nomenclature.yaml for codelists and mappings to ensure that all specified filters correspond to existing codes.For codelists,
includefilters by code attributes, for mappings it filters by model name.