Chore/undefined aware base model by kelvin-aipolabs · Pull Request #159 · aipotheosis-labs/gate22

kelvin-aipolabs · 2025-10-08T17:32:36Z

📝 Description

Adds a base class that helps differentiate "update to null" from "no change" in the inputs for partial DB updates.

Summary by cubic

Adds an UndefinedAwareBaseModel to clearly separate “set to null” from “no change” in partial DB updates. It also safeguards model_dump usage to avoid accidentally including unset fields.

New Features
- New Pydantic base model with non-nullable field validation via a custom model_validator.
- Uses model_fields_set to detect which fields were explicitly provided.
- Overrides model_dump to require/encourage exclude_unset; configurable WARN or ERROR behavior.
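
The distinction these bullets rely on can be sketched in a few lines. This is a minimal illustration, not code from the PR: `UserUpdate`, `apply_partial_update`, and the sample row are hypothetical names, and the only Pydantic v2 features used are `model_fields_set` and `model_dump(exclude_unset=True)`.

```python
from typing import Optional

from pydantic import BaseModel


class UserUpdate(BaseModel):
    # Hypothetical update payload: every field is nullable and defaults to None,
    # so "not provided" and "explicitly None" look the same on the instance.
    name: Optional[str] = None
    bio: Optional[str] = None


def apply_partial_update(row: dict, update: UserUpdate) -> dict:
    """Return a copy of `row` with only the explicitly provided fields changed."""
    # exclude_unset=True keeps only the keys listed in update.model_fields_set.
    changes = update.model_dump(exclude_unset=True)
    return {**row, **changes}


row = {"name": "Ada", "bio": "Engineer"}

# "Set to null": bio was provided as None, so it overwrites the stored value.
cleared = apply_partial_update(row, UserUpdate(bio=None))

# "No change": nothing was provided, so the row is left as it was.
untouched = apply_partial_update(row, UserUpdate())
```

Calling `model_dump()` without `exclude_unset=True` in `apply_partial_update` would write `None` over both columns, which is the accidental data loss the override is meant to flag.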

Summary by CodeRabbit

New Features
- Smarter handling of undefined vs. null fields in data models for more predictable validation.
- Configurable behavior when exporting data without specifying unset handling, with optional warnings or errors to prevent accidental data loss.
Documentation
- Expanded inline documentation explaining new model behavior and configuration.
Chores
- Added logging to surface misconfigurations during data export.
- Introduced internal configuration points to control serialization and validation behavior.

vercel · 2025-10-08T17:32:45Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Preview	Comments	Updated (UTC)
gate22	Ready	Preview	Comment	Oct 8, 2025 5:39pm

coderabbitai · 2025-10-08T17:33:00Z

Walkthrough

Introduces a new Pydantic base model that distinguishes undefined vs. null, enforces non-nullable checks only for fields explicitly provided, and customizes model_dump to warn or raise when exclude_unset is omitted via a configurable enum and logging.

Changes

Cohort / File(s)	Summary
Pydantic undefined-aware base model `backend/aci/common/schemas/undefined_aware_base_model.py`	Added `UndefinedAwareBaseModel` implementing `_non_nullable_fields` and `_dump_without_exclude_unset_behavior`, a `BehaviorOnDumpWithoutExcludeUnset` enum, post-validation `validate_non_nullable_fields` (runs only for fields in `model_fields_set`), and an overridden `model_dump` that warns or raises if `exclude_unset` is not provided. Includes logger and inline docs.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Caller
  participant Model as UndefinedAwareBaseModel
  participant Pydantic as Pydantic Core

  Caller->>Model: instantiate(**data)
  Model->>Pydantic: validate input
  Pydantic-->>Model: validated instance
  Note over Model: Post-validation hook
  Model->>Model: validate_non_nullable_fields()\n(only checks fields in model_fields_set)
  Model-->>Caller: instance

sequenceDiagram
  autonumber
  actor Caller
  participant Model as UndefinedAwareBaseModel
  participant Logger as Logger

  Caller->>Model: model_dump(exclude_unset=?)
  alt exclude_unset provided
    Model-->>Caller: dict per args
  else exclude_unset not provided
    Model->>Model: check _dump_without_exclude_unset_behavior
    alt Behavior = WARN
      Model->>Logger: warn about missing exclude_unset
      Model-->>Caller: dict via super().model_dump(**kwargs)
    else Behavior = RAISE
      Model-->>Caller: raise RuntimeError / ValueError
    end
  end
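
The branching in the second diagram could look roughly like the sketch below. It is an assumption-laden illustration rather than the PR's implementation: `DumpBehavior`, `GuardedModel`, `StrictPatch`, and `LenientPatch` are hypothetical names, the enum members follow the diagram's WARN and RAISE labels, and `ValueError` is just one of the exception types the diagram mentions.

```python
import logging
from enum import Enum
from typing import Any, Optional

from pydantic import BaseModel

logger = logging.getLogger(__name__)


class DumpBehavior(Enum):
    # Hypothetical stand-in for BehaviorOnDumpWithoutExcludeUnset.
    WARN = "warn"
    RAISE = "raise"


class GuardedModel(BaseModel):
    # Underscore-prefixed, so Pydantic v2 keeps it out of the model's fields.
    _dump_behavior: DumpBehavior = DumpBehavior.WARN

    def model_dump(self, **kwargs: Any) -> dict[str, Any]:
        # Only complain when the caller did not state a choice either way.
        if "exclude_unset" not in kwargs:
            message = "model_dump called without `exclude_unset`; unset fields may be included."
            if self._dump_behavior is DumpBehavior.RAISE:
                raise ValueError(message)
            logger.warning(message)
        return super().model_dump(**kwargs)


class LenientPatch(GuardedModel):
    name: Optional[str] = None


class StrictPatch(GuardedModel):
    name: Optional[str] = None
    _dump_behavior: DumpBehavior = DumpBehavior.RAISE


lenient_dump = LenientPatch().model_dump()  # logs a warning, still returns a dict
strict_dump = StrictPatch(name="x").model_dump(exclude_unset=True)

try:
    StrictPatch(name="x").model_dump()
    strict_raised = False
except ValueError:
    strict_raised = True
```

Passing `exclude_unset=False` explicitly also silences the guard in this sketch, since the check is about whether the caller made a decision, not which decision it was.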

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

I hop through fields of None and set,
I know which keys you gave, not yet—
I nudge with warnings, or loudly cry,
When dumps forget exclude_unset’s eye.
A rabbit validator, tidy and spry. 🐇

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title references the new undefined-aware base model, which is the core change introduced, but the “Chore/” prefix and its brevity make it somewhat generic and omit the model’s purpose around explicit null and unset field handling.


coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

backend/aci/common/schemas/undefined_aware_base_model.py (2)
27-45: Consider using Pydantic 2.12's MISSING sentinel instead.

While the custom validator approach works, Pydantic 2.12 introduced a MISSING sentinel specifically to distinguish absent values from None. This provides a cleaner, more idiomatic solution without requiring custom validation logic.

Based on learnings

Example using MISSING (experimental in Pydantic 2.12; imported from pydantic.experimental.missing_sentinel rather than the top-level package):
from pydantic import BaseModel
from pydantic.experimental.missing_sentinel import MISSING

class MyModel(BaseModel):
    # May be omitted (MISSING), explicitly None, or a str
    field1: str | None | MISSING = MISSING

    # Fields that are optional with a default
    field2: int = 0
With this approach:

- MISSING clearly indicates "no value provided"
- None is an explicit value
- No custom validator needed
- model_fields_set still works as expected
- Validation behavior is built-in and well-tested

This would simplify the base model significantly and align with Pydantic's intended usage patterns.

63-64: Address or track the TODO comment.

The question about intercepting __getattribute__() is valid. Direct field access via model.field bypasses the model_dump safeguards. Consider:

- Whether this is a concern for the use cases
- If needed, implementing __getattribute__ to warn/error on unset field access
- Documenting the expected usage pattern to avoid this scenario

Would you like me to open a new issue to track this decision, or would you prefer to address it in this PR?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7ceeb03 and e25f4cd.

📒 Files selected for processing (1)

backend/aci/common/schemas/undefined_aware_base_model.py (1 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

backend/aci/common/schemas/undefined_aware_base_model.py (1)

backend/aci/common/logging_setup.py (1)

get_logger (44-45)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: cubic · AI code reviewer

🔇 Additional comments (3)

backend/aci/common/schemas/undefined_aware_base_model.py (3)

1-9: LGTM!

Imports and logger setup are appropriate for the module's functionality.

11-14: LGTM!

The enum clearly defines the two supported behaviors for handling model_dump calls without exclude_unset.

47-61: Add or confirm Python ≥3.10 requirement
The match case syntax in model_dump (backend/aci/common/schemas/undefined_aware_base_model.py:47-61) requires Python 3.10+. No requires-python was found in pyproject.toml, setup.py, or setup.cfg—please update or verify the project’s interpreter requirement.

coderabbitai · 2025-10-08T17:38:27Z

+    _non_nullable_fields: list[str] = []
+    _dump_without_exclude_unset_behavior: BehaviorOnDumpWithoutExcludeUnset = (
+        BehaviorOnDumpWithoutExcludeUnset.WARN
+    )


⚠️ Potential issue | 🔴 Critical

Use PrivateAttr() for internal configuration fields.

The fields _non_nullable_fields and _dump_without_exclude_unset_behavior are currently treated as regular Pydantic model fields, which means they:

- Will be included in model_dump() output
- Will be validated during instantiation
- Will appear in the JSON schema

Since these are internal configuration attributes, they should use PrivateAttr() from Pydantic v2.

Additionally, the mutable default [] for _non_nullable_fields is a Python anti-pattern that could cause modifications in one instance to affect others.

Apply this diff to fix the issue:

-from pydantic import BaseModel, model_validator
+from pydantic import BaseModel, model_validator, PrivateAttr

 class UndefinedAwareBaseModel(BaseModel):
     """
     A base model that allows all fields to be nullable and use a custom validator to check for
     non-nullable fields.
     """
-    _non_nullable_fields: list[str] = []
+    _non_nullable_fields: list[str] = PrivateAttr(default_factory=list)
-    _dump_without_exclude_unset_behavior: BehaviorOnDumpWithoutExcludeUnset = (
-        BehaviorOnDumpWithoutExcludeUnset.WARN
-    )
+    _dump_without_exclude_unset_behavior: BehaviorOnDumpWithoutExcludeUnset = PrivateAttr(
+        default=BehaviorOnDumpWithoutExcludeUnset.WARN
+    )

🤖 Prompt for AI Agents

In backend/aci/common/schemas/undefined_aware_base_model.py around lines 22-25, the internal config attributes _non_nullable_fields and _dump_without_exclude_unset_behavior are currently regular Pydantic fields; change them to Pydantic PrivateAttr()s so they are not validated, serialized, or exposed in JSON schema. Specifically, replace the list default with PrivateAttr(default_factory=list) to avoid a shared mutable default, and set the enum config with PrivateAttr(default=BehaviorOnDumpWithoutExcludeUnset.WARN) so it remains an internal attribute.
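
Whether the original declarations really behave as regular fields depends on how Pydantic v2 treats underscore-prefixed names. As far as I know, it collects annotated attributes with a leading underscore as private attributes automatically and copies their defaults per instance, so it may be worth checking the installed version before applying the suggestion. A minimal probe, with `Probe` as a hypothetical model that mirrors the pattern under review:

```python
from typing import Optional

from pydantic import BaseModel


class Probe(BaseModel):
    # Hypothetical model: one ordinary field plus an underscore-prefixed
    # attribute with a mutable default, declared without PrivateAttr().
    name: Optional[str] = None
    _non_nullable_fields: list[str] = []


first = Probe()
second = Probe()
first._non_nullable_fields.append("name")  # mutate one instance only

# Is the attribute a regular model field, or was it collected as private?
is_regular_field = "_non_nullable_fields" in Probe.model_fields
is_private_attr = "_non_nullable_fields" in Probe.__private_attributes__

# Does it leak into serialization or the JSON schema?
dumped = first.model_dump()
schema_properties = list(Probe.model_json_schema()["properties"])

# Did the mutation on `first` show up on `second`?
shared_between_instances = second._non_nullable_fields == ["name"]
```

If the probe reports the attribute as private, absent from the dump and schema, and not shared, then switching to an explicit `PrivateAttr()` is a readability choice rather than a behavioral fix.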

coderabbitai · 2025-10-08T17:38:27Z

+        As there is no easy way to differentiate between "None" and "Undefined" with Pydantic.
+        We don't know whether caller do not provide a value for a field or want to explicitly set
+        it to None. We use a workaround as follow:
+        - We allow all fields to be nullable and default to None ewhen defining the pydantic model.


⚠️ Potential issue | 🟡 Minor

Fix typo in docstring.

"ewhen" should be "when".

-        - We allow all fields to be nullable and default to None ewhen defining the pydantic model.
+        - We allow all fields to be nullable and default to None when defining the pydantic model.

🤖 Prompt for AI Agents

In backend/aci/common/schemas/undefined_aware_base_model.py around line 33, the docstring contains a typo: "ewhen" should be "when"; update the docstring to replace "ewhen" with "when" so the sentence reads correctly.

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (2)

backend/aci/common/schemas/undefined_aware_base_model.py (2)

22-25: Past review comment remains unaddressed: Use PrivateAttr() for internal configuration.

The internal fields _non_nullable_fields and _dump_without_exclude_unset_behavior are still defined as regular Pydantic fields instead of private attributes. This means they will be serialized, validated, and exposed in the JSON schema. Additionally, the mutable default [] can cause shared state bugs.

33-33: Past review comment remains unaddressed: Fix typo in docstring.

The typo "ewhen" should be "when" in the docstring.

🧹 Nitpick comments (2)

backend/aci/common/schemas/undefined_aware_base_model.py (2)
63-64: Resolve or track the incomplete TODO.

The TODO suggests that direct field access (e.g., model.field) may bypass the undefined-awareness protections. This could undermine the model's purpose if users access fields directly rather than through model_dump(exclude_unset=True).

Would you like me to:

Generate code to intercept __getattribute__ with proper undefined handling?

Open a new issue to track this as a follow-up enhancement?

Document in the class docstring that users must use model_dump(exclude_unset=True) or check model_fields_set rather than direct access?

16-61: Consider Pydantic 2.12's MISSING sentinel for cleaner undefined handling.

Pydantic 2.12 introduced a native MISSING sentinel specifically to distinguish "no value" from None, which aligns with your use case. This could eliminate the need for the workaround approach (making all fields nullable with custom validation).

Based on learnings.

Example pattern with MISSING (experimental; imported from pydantic.experimental.missing_sentinel rather than the top-level package):
from pydantic import BaseModel
from pydantic.experimental.missing_sentinel import MISSING

class MyModel(BaseModel):
    field: int | None | MISSING = MISSING  # absent by default

# field not provided -> not in instance.model_fields_set, value is MISSING
# field explicitly set to None -> value is None
# field set to 5 -> value is 5
This approach would:

- Eliminate the need for _non_nullable_fields tracking
- Remove the custom validator complexity
- Provide clearer semantics for API consumers
- Work naturally with model_dump(exclude_unset=True)

If adopting this, you could still keep the model_dump override to enforce exclude_unset usage.

📥 Commits

Reviewing files that changed from the base of the PR and between e25f4cd and 8be703e.

📒 Files selected for processing (1)

backend/aci/common/schemas/undefined_aware_base_model.py (1 hunks)

coderabbitai · 2025-10-08T17:42:07Z

+    _non_nullable_fields: list[str] = []
+    _dump_without_exclude_unset_behavior: BehaviorOnDumpWithoutExcludeUnset = (
+        BehaviorOnDumpWithoutExcludeUnset.WARN
+    )
+
+    @model_validator(mode="after")
+    def validate_non_nullable_fields(self) -> "UndefinedAwareBaseModel":
+        """
+        As there is no easy way to differentiate between "None" and "Undefined" with Pydantic.
+        We don't know whether caller do not provide a value for a field or want to explicitly set
+        it to None. We use a workaround as follow:
+        - We allow all fields to be nullable and default to None ewhen defining the pydantic model.
+        - We use a custom validator to check for non-nullable fields.
+        - Only when caller provided a value, field name will be in `model.model_fields_set`.
+        - When updating to database, we either check `model.model_fields_set` or use
+          `model_dump(exclude_unset=True)` to exclude unset fields.
+        - If model_dump is called without exclude_unset, we console warn (default) or raise error.
+        """
+
+        non_nullable_fields = self._non_nullable_fields
+        for field in self.model_fields_set:
+            if field in non_nullable_fields and getattr(self, field) is None:
+                raise ValueError(f"{field} cannot be None if it is provided.")
+        return self


⚠️ Potential issue | 🟠 Major

Unclear how subclasses populate _non_nullable_fields.

The validation logic relies on _non_nullable_fields, but there's no documented or type-safe mechanism for subclasses to declare which fields should be non-nullable. Users may not understand how to properly configure this base model.

Consider one of these approaches:

- Class-level configuration using ConfigDict (if feasible with your design):

  class ConfigDict:
      non_nullable_fields: list[str] = []

- Document the expected usage pattern in the docstring:

  class MyModel(UndefinedAwareBaseModel):
      field1: str | None = None
      field2: int | None = None
      _non_nullable_fields: list[str] = PrivateAttr(default_factory=lambda: ["field1"])

- Consider leveraging Pydantic 2.12's new MISSING sentinel (based on learnings), which provides a native way to distinguish undefined from None without requiring this workaround.
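
One way a subclass might declare its non-nullable fields is to redeclare the underscore-prefixed attribute with its own default. The sketch below is a guess at the intended usage, not something documented in the PR: `UndefinedAwareSketch` is a simplified stand-in for the real base class (validator only), `ProjectUpdate` and its fields are hypothetical, and it assumes that Pydantic v2 lets a subclass override a private attribute's default and that private attributes are readable inside a `mode="after"` validator, as the quoted code implies.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError, model_validator


class UndefinedAwareSketch(BaseModel):
    # Simplified stand-in for UndefinedAwareBaseModel: validator only.
    _non_nullable_fields: list[str] = []

    @model_validator(mode="after")
    def validate_non_nullable_fields(self) -> "UndefinedAwareSketch":
        # Only fields the caller actually provided are checked.
        for field in self.model_fields_set:
            if field in self._non_nullable_fields and getattr(self, field) is None:
                raise ValueError(f"{field} cannot be None if it is provided.")
        return self


class ProjectUpdate(UndefinedAwareSketch):
    # Hypothetical subclass: `name` may be omitted but must not be set to None.
    name: Optional[str] = None
    description: Optional[str] = None
    _non_nullable_fields: list[str] = ["name"]


omitted = ProjectUpdate()                      # name not provided: no check runs
cleared = ProjectUpdate(description=None)      # description may be nulled

try:
    ProjectUpdate(name=None)                   # provided and None: rejected
    name_none_rejected = False
except ValidationError:
    name_none_rejected = True
```

A `ClassVar` would be another way to hold this per-class configuration; the private-attribute form is shown only because it matches the declarations in the diff.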

coderabbitai · 2025-10-08T17:42:07Z

+                    raise SyntaxError(
+                        "model_dump is called without providing `exclude_unset` args. This may "
+                        "accidentally include unset fields."
+                    )


⚠️ Potential issue | 🔴 Critical

Use ValueError instead of SyntaxError.

SyntaxError is semantically incorrect for runtime validation errors—it's reserved for syntax parsing failures. Use ValueError or define a custom exception.

Apply this diff:

-                    raise SyntaxError(
+                    raise ValueError(
                         "model_dump is called without providing `exclude_unset` args. This may "
                         "accidentally include unset fields."
                     )

🤖 Prompt for AI Agents

In backend/aci/common/schemas/undefined_aware_base_model.py around lines 57 to 60, the code currently raises SyntaxError when model_dump is called without exclude_unset; replace this with ValueError (or a project-specific custom exception) to correctly represent a runtime validation error, updating the raise statement message to use ValueError(...) and keeping the existing message text unchanged.

cubic-dev-ai

No issues found across 1 file

kelvin-aipolabs added 2 commits October 8, 2025 18:29

chore: adding undefined aware base model

8a2fd0d

add behavior choice

e25f4cd

kelvin-aipolabs requested a review from hanyixxx October 8, 2025 17:32

change to raising syntax error instead

8be703e

coderabbitai bot reviewed Oct 8, 2025

vercel bot deployed to Preview October 8, 2025 17:39

coderabbitai bot reviewed Oct 8, 2025

cubic-dev-ai bot reviewed Oct 8, 2025

Chore/undefined aware base model #159
kelvin-aipolabs wants to merge 3 commits into main from chore/undefined-aware-base-model

1 participant
