optimize complexity of filter out unwanted recognizers from O(n*m) to O(n) #1523
rgupta2508 wants to merge 18 commits into microsoft:main from
Conversation
|
@microsoft-github-policy-service agree |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
presidio-analyzer/presidio_analyzer/recognizer_registry/recognizer_registry.py
|
This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation. |
|
@rgupta2508 after some thinking, we've decided to close this PR. We appreciate the change, but in the tradeoff between computational efficiency and code complexity, we've decided that it's a bit risky. The number of combinations here (entities, languages, recognizers) is not high, so we don't expect a significant performance boost. We would be happy to be corrected if you have done any analysis on this. |
|
Hi @omri374, |
|
Hi @rgupta2508 Thanks for this input! We will continue reviewing this and get back to you. |
|
Hi @omri374 Any updates regarding this PR? |
|
hi @rgupta2508, apologies for the delay |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
presidio-analyzer/presidio_analyzer/recognizer_registry/recognizer_registry.py
|
Is this PR still active? @omri374 @rgupta2508 |
Reopening PR #1508.
Creates a map from each supported entity to its recognizers and filters by entity name using the key.
Code refactor.
Improves the complexity of filtering out unwanted recognizers from O(n*m) to O(n).
In the current code, a for loop runs inside another for loop, which makes the complexity O(n*m). After this change there is only one loop, at the cost of one extra variable,
where
m = total number of all_possible_recognizers
n = total number of supported entities that we want to include.
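The idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR or from Presidio: the `Recognizer` class, the function names, and the sample recognizer names are all hypothetical. Note that building the map still takes one pass over all recognizers, so the saving only shows up if the map is built once and reused across lookups.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Recognizer:
    """Hypothetical stand-in for a recognizer (not Presidio's EntityRecognizer)."""
    name: str
    supported_entities: Tuple[str, ...]


def filter_nested(recognizers: List[Recognizer], entities: List[str]) -> List[Recognizer]:
    """Nested-loop filter: every requested entity scans every recognizer, O(n*m)."""
    selected: List[Recognizer] = []
    for entity in entities:            # n requested entities
        for rec in recognizers:        # m recognizers per entity
            if entity in rec.supported_entities and rec not in selected:
                selected.append(rec)
    return selected


def build_entity_map(recognizers: List[Recognizer]) -> Dict[str, List[Recognizer]]:
    """Single pass over the recognizers: entity name -> recognizers supporting it."""
    entity_map: Dict[str, List[Recognizer]] = defaultdict(list)
    for rec in recognizers:
        for entity in rec.supported_entities:
            entity_map[entity].append(rec)
    return entity_map


def filter_with_map(entity_map: Dict[str, List[Recognizer]], entities: List[str]) -> List[Recognizer]:
    """Map-based filter: one loop over the requested entities, lookup by key."""
    selected: List[Recognizer] = []
    seen = set()  # recognizer names already selected, to avoid duplicates
    for entity in entities:
        for rec in entity_map.get(entity, []):
            if rec.name not in seen:
                seen.add(rec.name)
                selected.append(rec)
    return selected


# Hypothetical sample data for illustration only.
recognizers = [
    Recognizer("card", ("CREDIT_CARD",)),
    Recognizer("email", ("EMAIL_ADDRESS",)),
    Recognizer("ner", ("PERSON", "LOCATION")),
]
wanted = ["PERSON", "EMAIL_ADDRESS"]

entity_map = build_entity_map(recognizers)
nested_result = filter_nested(recognizers, wanted)
mapped_result = filter_with_map(entity_map, wanted)
```

Both functions select the same recognizers for this input; the map version trades one extra dictionary for the inner scan, which is the tradeoff between efficiency and code complexity discussed in the review comments.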