[Python] A "personal data" boolean in field metadata

### Describe the enhancement requested

Hello,

As data increasingly moves across organizational and regulatory boundaries, data sensitivity is becoming just as important as data type. Today, teams often need to answer questions like:
- Does this dataset contain personal data?
- Which specific fields are subject to GDPR/CCPA or internal governance rules?
- Can this column be safely logged, cached, or shared downstream?

In practice, this information is either:
- Stored out-of-band (data catalogs, documentation), or
- Embedded in ad-hoc metadata conventions that vary by organization and tool.

A simple, standardized personal_data boolean at the field level would provide a lightweight, interoperable signal that many tools could immediately benefit from.

Field-level granularity is essential: most real datasets mix personal and non-personal columns.
A boolean keeps the signal intentionally minimal.
This would enable:
- Automatic detection and propagation of personal data flags across Arrow-compatible systems
- Safer defaults in query engines, serializers, and exporters
- Easier integration with data catalogs, lineage tools, and privacy audits
- Consistent behavior across Arrow consumers

Crucially, this does not enforce semantics or compliance — it simply provides a common language.

This would be entirely optional and backward-compatible.
It does not preclude richer classifications in external systems.
It aligns with Arrow’s existing use of key/value metadata without introducing new core types.


Edit: the same boolean could also be set in the table-level metadata whenever the table contains at least one personal data field.

Best regards,

Simon

### Component(s)

Python

[Python] A "personal data" boolean in field metadata #48959
