Commit 2e55ed1

Br1an67 and root authored
fix(analyzer): remove erroneous anchor in Italian driver license regex (#1899)
The regex for U1-format Italian driver licenses had an erroneous ^ anchor that prevented matching when the license number was not at the start of the string. This fix removes the anchor so licenses like U1K711J11M are correctly recognized regardless of position in text.

Co-authored-by: root <root@C20251020184286.local>
1 parent 91b903a commit 2e55ed1

2 files changed: +5 -1 lines changed

presidio-analyzer/presidio_analyzer/predefined_recognizers/country_specific/italy/it_driver_license_recognizer.py

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ class ItDriverLicenseRecognizer(PatternRecognizer):
             "Driver License",
             (
                 r"\b(?i)(([A-Z]{2}\d{7}[A-Z])"
-                r"|(^[U]1[BCDEFGHLJKMNPRSTUWYXZ0-9]{7}[A-Z]))\b"
+                r"|(U1[BCDEFGHLJKMNPRSTUWYXZ0-9]{7}[A-Z]))\b"
             ),
             0.2,
         ),
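
A minimal sketch of why the `^` anchor mattered, using only the standard-library `re` module rather than Presidio itself. The names `OLD_U1`, `NEW_U1`, and `find_spans` are hypothetical and exist only for this illustration. The inline `(?i)` from the recognizer is swapped for `re.IGNORECASE`, since recent versions of the standard-library `re` module reject a global inline flag that is not at the start of the pattern; only the U1 alternative is reproduced.

```python
import re

# Character class copied from the diff; everything else is illustrative.
U1_BODY = r"U1[BCDEFGHLJKMNPRSTUWYXZ0-9]{7}[A-Z]"

# Simplified stand-ins for the U1 alternative before and after the change.
OLD_U1 = re.compile(r"\b(^" + U1_BODY + r")\b", re.IGNORECASE)
NEW_U1 = re.compile(r"\b(" + U1_BODY + r")\b", re.IGNORECASE)


def find_spans(pattern, text):
    """Return the (start, end) span of every match of pattern in text."""
    return [m.span() for m in pattern.finditer(text)]


at_start = "U1K711J11M"
mid_text = "license U1K711J11M here"

# Without re.MULTILINE, ^ only matches at position 0, so the old
# alternative can only succeed when the license begins the string.
print(find_spans(OLD_U1, at_start))
print(find_spans(OLD_U1, mid_text))

# With the anchor removed, \b alone delimits the license number.
print(find_spans(NEW_U1, at_start))
print(find_spans(NEW_U1, mid_text))
```

Under these assumptions the old pattern finds the license only in `at_start`, while the new pattern finds it in both strings. The surrounding `\b` word boundaries still prevent matches inside longer alphanumeric runs.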

presidio-analyzer/tests/test_it_driver_license_recognizer.py

Lines changed: 4 additions & 0 deletions
@@ -31,6 +31,10 @@ def entities() -> list[str]:
         ("U1H00A000B", 0, (), (),),
         # Test with invalid Driver License
         ("990123456B", 0, (), (),),
+        # Test with JK letters in license (issue #1555)
+        ("U1K711J11M", 1, ((0, 10),), ((0.1, 0.4),),),
+        # Test with JK letters not at start of string
+        ("license U1K711J11M here", 1, ((8, 18),), ((0.1, 0.4),),),
         # fmt: on
     ],
 )
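
The test rows above can be checked against the post-fix pattern without Presidio, as a rough sketch. It assumes the first three tuple fields mean input text, expected number of matches, and expected `(start, end)` spans; the parameter names are not visible in this diff, so that reading is an assumption, and the score ranges are left out because scoring is done by the recognizer, not the regex. `PATTERN`, `CASES`, and `check` are hypothetical names, and `re.IGNORECASE` stands in for the inline `(?i)`.

```python
import re

# Both alternatives of the post-fix pattern, copied from the diff.
PATTERN = re.compile(
    r"\b(([A-Z]{2}\d{7}[A-Z])"
    r"|(U1[BCDEFGHLJKMNPRSTUWYXZ0-9]{7}[A-Z]))\b",
    re.IGNORECASE,
)

# (text, expected match count, expected spans), taken from the test rows.
CASES = [
    ("U1H00A000B", 0, ()),
    ("990123456B", 0, ()),
    ("U1K711J11M", 1, ((0, 10),)),
    ("license U1K711J11M here", 1, ((8, 18),)),
]


def check(cases):
    """Return one boolean per case: True when count and spans both agree."""
    results = []
    for text, count, spans in cases:
        found = tuple(m.span() for m in PATTERN.finditer(text))
        results.append(len(found) == count and found == spans)
    return results


print(check(CASES))
```

`U1H00A000B` is rejected because `A` is not in the U1 character class, and `990123456B` matches neither alternative because it starts with digits.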