fix(analyzer): remove erroneous anchor in Italian driver license regex #1899

Merged
SharonHart merged 1 commit into microsoft:main from Br1an67:fix/issue-1555-italian-license-regex on Mar 9, 2026

@Br1an67 Br1an67 commented Mar 9, 2026

Fixes #1555

Change Description

Fixed the regex pattern in ItDriverLicenseRecognizer by removing an erroneous ^ anchor that prevented matching Italian driver license numbers (U1 format) when they appeared anywhere other than the start of a string.

The regex for U1-format licenses incorrectly contained ^[U]1, which matched only at the beginning of a line. It has been corrected to U1, so licenses such as U1K711J11M are now recognized regardless of their position in the text.
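
The effect of the anchor can be reproduced with Python's `re` module alone. The sketch below is illustrative only: `ANCHORED`, `UNANCHORED`, and `find_all` are hypothetical names, and the patterns are simplified stand-ins for the U1 alternative, not the recognizer's actual regex.

```python
import re

# Simplified stand-ins for the U1 alternative, for illustration only;
# the recognizer's actual pattern may differ.
ANCHORED = r"^U1[A-Z0-9]{7}[A-Z]\b"
UNANCHORED = r"\bU1[A-Z0-9]{7}[A-Z]\b"


def find_all(pattern, text, flags=0):
    """Return every substring of `text` matched by `pattern`."""
    return [m.group(0) for m in re.finditer(pattern, text, flags)]


at_start = "U1K711J11M is my license"
mid_text = "My license is U1K711J11M"
new_line = "My license is\nU1K711J11M"

print(find_all(ANCHORED, at_start))                # ['U1K711J11M']
print(find_all(ANCHORED, mid_text))                # []
print(find_all(ANCHORED, new_line, re.MULTILINE))  # ['U1K711J11M']
print(find_all(UNANCHORED, mid_text))              # ['U1K711J11M']
```

Without `re.MULTILINE`, `^` matches only at the very start of the string; with it, `^` also matches right after a newline. In both cases a license in the middle of a sentence is missed until the anchor is removed.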

Issue reference

Fixes #1555

Checklist

  • I have reviewed the contribution guidelines
  • I have signed the CLA (if required)
  • My code includes unit tests
  • All unit tests and lint checks pass locally
  • My PR contains documentation updates / additions if required

Changes

  • presidio-analyzer/presidio_analyzer/predefined_recognizers/country_specific/italy/it_driver_license_recognizer.py: Removed erroneous ^ anchor from the U1-format regex pattern
  • presidio-analyzer/tests/test_it_driver_license_recognizer.py: Added test cases for license numbers containing J/K letters and for licenses not at the start of text
 presidio-analyzer/presidio_analyzer/predefined_recognizers/country_specific/italy/it_driver_license_recognizer.py | 2 +-
 presidio-analyzer/tests/test_it_driver_license_recognizer.py                     | 4 ++++
 2 files changed, 5 insertions(+), 1 deletion(-)
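
The kind of regression coverage described above (J/K letters, licenses away from the start of the text) might be sketched as a table-driven check. This is not the code in test_it_driver_license_recognizer.py: `U1_PATTERN`, `CASES`, `count_matches`, and `run_cases` are hypothetical, the pattern is a simplified assumption, and the inputs reuse the U1K711J11M example from this PR.

```python
import re

# Hypothetical simplified pattern for illustration; the real recognizer's
# regex may differ.
U1_PATTERN = re.compile(r"\bU1[A-Z0-9]{7}[A-Z]\b")

# (text, expected number of matches)
CASES = [
    ("U1K711J11M", 1),                      # contains J and K, at the start
    ("My license is U1K711J11M", 1),        # license at the end of the text
    ("License U1K711J11M was renewed", 1),  # license in the middle
    ("U1K711J11", 0),                       # too short: no trailing letter
]


def count_matches(text):
    """Count non-overlapping U1-format matches in `text`."""
    return len(U1_PATTERN.findall(text))


def run_cases():
    """Fail with the offending text if any case's match count is off."""
    for text, expected in CASES:
        assert count_matches(text) == expected, text


run_cases()
```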

The regex for U1-format Italian driver licenses had an erroneous ^ anchor
that prevented matching when the license number was not at the start of
the string. This fix removes the anchor so licenses like U1K711J11M are
correctly recognized regardless of position in text.

Copilot AI left a comment

Pull request overview

This PR fixes the Italian driver license recognizer so U1-format licenses are detected even when they appear mid-string (issue #1555).

Changes:

  • Removed an unintended ^ start-of-string anchor from the U1 driver license regex.
  • Added unit tests to cover U1 licenses containing J/K and licenses appearing later in text.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File | Description
presidio-analyzer/presidio_analyzer/predefined_recognizers/country_specific/italy/it_driver_license_recognizer.py | Updates the U1 regex alternative to no longer require start-of-string matching.
presidio-analyzer/tests/test_it_driver_license_recognizer.py | Adds regression tests for the fixed matching behavior and for J/K letters in U1 licenses.

@SharonHart

@samirabu

@SharonHart SharonHart merged commit 2e55ed1 into microsoft:main Mar 9, 2026
39 checks passed
Br1an67 added a commit to Br1an67/presidio that referenced this pull request Mar 18, 2026
microsoft#1899)

The regex for U1-format Italian driver licenses had an erroneous ^ anchor
that prevented matching when the license number was not at the start of
the string. This fix removes the anchor so licenses like U1K711J11M are
correctly recognized regardless of position in text.

Co-authored-by: root <root@C20251020184286.local>
Development

Successfully merging this pull request may close these issues.

Italian driver license recognizer is faulty

3 participants