Add NutsProcessor for NUTS region aggregation #569

Open
dc-almeida wants to merge 21 commits into IAMconsortium:main from dc-almeida:feature/nuts-processor
@dc-almeida
Contributor

@dc-almeida dc-almeida commented Mar 10, 2026

Closes #563. NutsProcessor aggregates the NUTS regions that are both present in the dataframe and defined in the project configuration.
If the data contains regions that were not previously defined in nomenclature.yaml, it raises an error (e.g., NUTS regions for Belgium are in the data, but BE is not listed in the config file).
If some regions within a NUTS hierarchy are missing (e.g., under AT11, AT111 is present but AT112 is missing), aggregation runs with the regions that are present.
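
A minimal sketch of the two rules above, using plain dicts instead of a pyam dataframe. This is not the NutsProcessor implementation: the function name `aggregate_nuts_level`, its arguments, and the error type are hypothetical, and it assumes the parent of a NUTS code is the code minus its last character.

```python
from collections import defaultdict


def aggregate_nuts_level(values, configured_countries):
    """Sum NUTS regions into their parents (hypothetical helper).

    `values` maps NUTS codes to numbers; `configured_countries` stands in
    for the country codes listed in nomenclature.yaml.
    """
    # Rule 1: every region must belong to a configured country.
    unknown = sorted({code[:2] for code in values} - set(configured_countries))
    if unknown:
        raise ValueError(f"NUTS regions for undefined countries: {unknown}")

    # Rule 2: aggregate whatever children are present; gaps are tolerated.
    parents = defaultdict(float)
    for code, value in values.items():
        parents[code[:-1]] += value
    return dict(parents)


# AT112 is missing, so AT11 is built from AT111 and AT113 only.
data = {"AT111": 1.0, "AT113": 2.0}
result = aggregate_nuts_level(data, ["AT"])
```

Passing `{"BE100": 1.0}` with only `"AT"` configured raises in this sketch, mirroring the Belgium example.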

  • Extracted some of the aggregation logic in RegionProcessor into helper functions that are shared with NutsProcessor
  • Added API documentation and a brief guide

@dc-almeida dc-almeida added the enhancement New feature or request label Mar 10, 2026
@dc-almeida dc-almeida self-assigned this Mar 10, 2026
@dc-almeida
Contributor Author

  • Implemented EU27(+UK) aggregation.
    • Aggregation requires a minimum of 23 member states; otherwise it is skipped.
  • NUTS aggregation is recursive and starts from the lowest NUTS level detected (e.g., on a dataframe with only ISO3 countries, it skips NUTS aggregation and goes directly to EU27+UK).
  • NUTS aggregation keeps all intermediate aggregations in the final dataframe.
  • Attempting to aggregate a dataframe that contains both a NUTS region and its children raises an error: the children aggregate to the parent and create a duplicate row, which pyam flags.
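
The points above can be sketched roughly as follows. This is an illustration on plain dicts, not the code in this PR: `aggregate_recursively`, `aggregate_eu`, and `MIN_MEMBER_STATES` are hypothetical names, countries are represented by two-letter NUTS prefixes for simplicity, and the duplicate check is done by hand here rather than by pyam.

```python
MIN_MEMBER_STATES = 23  # threshold quoted in the comment above


def aggregate_recursively(values):
    """Aggregate level by level, keeping every intermediate result."""
    result = dict(values)
    # Start from the lowest (longest-code) NUTS level present in the data.
    level = max((len(code) for code in values), default=0)
    while level > 2:
        parents = {}
        for code, value in result.items():
            if len(code) == level:
                parent = code[:-1]
                parents[parent] = parents.get(parent, 0.0) + value
        # A parent that already exists in the data would become a duplicate row.
        duplicates = sorted(set(parents) & set(result))
        if duplicates:
            raise ValueError(f"Parent regions already in data: {duplicates}")
        result.update(parents)
        level -= 1
    return result


def aggregate_eu(country_values, member_states):
    """Sum member states, or return None when too few are present."""
    present = [c for c in member_states if c in country_values]
    if len(present) < MIN_MEMBER_STATES:
        return None  # aggregation skipped
    return sum(country_values[c] for c in present)


data = {"AT111": 1.0, "AT112": 2.0, "AT121": 4.0}
aggregated = aggregate_recursively(data)
```

With country-level input only, the `while` loop never runs, which corresponds to skipping NUTS aggregation and going straight to the EU step.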

@dc-almeida
Contributor Author

  • Updated process() to allow instantiating processors according to nomenclature.yaml
    • Explicitly passed processors take precedence over processors in the config file
    • As currently implemented, two processors of the same type cannot be used (e.g., if a RegionProcessor is passed and one is also defined in nomenclature.yaml, only the passed one is used, and a log message reports the skip)
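
A rough sketch of the precedence rule described above. The helper `resolve_processors` and the two empty classes are stand-ins for illustration only, not the real nomenclature classes or the actual logic inside process(); treating a repeated type within the config the same way is an assumption.

```python
import logging

logger = logging.getLogger(__name__)


class RegionProcessor:  # stand-in, not the nomenclature class
    pass


class NutsProcessor:  # stand-in, not the nomenclature class
    pass


def resolve_processors(explicit, from_config):
    """Explicit processors win; config processors of a used type are skipped."""
    resolved = list(explicit)
    used_types = {type(p) for p in explicit}
    for processor in from_config:
        if type(processor) in used_types:
            logger.info(
                "Skipping %s from config: a processor of this type is already used",
                type(processor).__name__,
            )
            continue
        resolved.append(processor)
        used_types.add(type(processor))  # assumption: one processor per type
    return resolved


passed = RegionProcessor()
resolved = resolve_processors([passed], [RegionProcessor(), NutsProcessor()])
```

Here `resolved` holds the explicitly passed RegionProcessor followed by the NutsProcessor from the config; the config's RegionProcessor is dropped with a log message.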

@dc-almeida dc-almeida marked this pull request as ready for review March 17, 2026 14:16
Comment thread nomenclature/processor/region.py Outdated
Contributor

@phackstock phackstock left a comment

Thanks for adding a lot of functionality @dc-almeida. The PR looks good to me in principle. There are a number of comments below, but most of them are minor changes to make the code easier to read and maintain.

Comment thread nomenclature/config.py Outdated
Comment thread nomenclature/config.py Outdated
Comment thread nomenclature/config.py Outdated
Comment thread nomenclature/config.py Outdated
Comment thread nomenclature/config.py Outdated
Comment thread nomenclature/processor/nuts.py Outdated
Comment thread nomenclature/processor/nuts.py Outdated
Comment thread docs/api/nuts.rst Outdated
Comment thread nomenclature/processor/nuts.py Outdated
target_regions: list[str],
variable_codelist: VariableCodeList,
rtol_difference: float = 0.01,
return_aggregation_difference: bool = False,
Contributor

Not part of this PR, and my own fault according to git blame, but this variable does not do what it claims. The aggregation difference is always returned; the only effect of the variable is that, when it is set to False, a log message is additionally issued detailing how to get the difference. I'm opening an issue.
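
A hypothetical reduction of the behaviour described above, not the actual nomenclature code: `check_aggregate` and its dict-based arguments are invented for illustration, and only the two keyword parameters are taken from the signature shown in this thread.

```python
import logging
import math

logger = logging.getLogger(__name__)


def check_aggregate(
    original,
    aggregated,
    rtol_difference=0.01,
    return_aggregation_difference=False,
):
    """Compare original and aggregated values (illustrative only)."""
    difference = {
        key: aggregated[key] - original[key]
        for key in original
        if not math.isclose(aggregated[key], original[key], rel_tol=rtol_difference)
    }
    if difference and not return_aggregation_difference:
        # The flag only controls this hint ...
        logger.info(
            "Differences found; set return_aggregation_difference=True to get them"
        )
    # ... while the difference is returned either way.
    return aggregated, difference


original = {"World": 1.0, "Europe": 2.0}
aggregated = {"World": 1.5, "Europe": 2.0}
_, diff_default = check_aggregate(original, aggregated)
_, diff_flagged = check_aggregate(
    original, aggregated, return_aggregation_difference=True
)
```

Both calls return the same difference, which is the mismatch between the parameter name and its effect.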

Labels

enhancement New feature or request

Development

Successfully merging this pull request may close these issues.

RegionProcessor for NUTS regions

3 participants