feat: Add matillion data cloud support#26773

Open
ulixius9 wants to merge 8 commits into main from matillion_dpc

Conversation

@ulixius9
Member

@ulixius9 ulixius9 commented Mar 25, 2026

Describe your changes:

Fixes

I worked on ... because ...

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Summary by Gitar

  • New auth configuration:
    • Added MatillionDPCAuth schema with OAuth2 and PAT support in matillionDPC.json
    • Updated MatillionConnectionClassConverter to handle both MatillionETLAuth and MatillionDPCAuth types
  • New feature:
    • Added lineageLookbackDays parameter (1–365 days, default 30) to MatillionConnection for OpenLineage API queries
  • Test coverage:
    • Added MatillionConnectionClassConverterTest with ETL/DPC conversion and null-safety tests
    • Updated ClassConverterFactoryTest to include MatillionConnection
  • Generated code updates:
    • Updated TypeScript and Java API models across 12+ files with MatillionDPC type and Region enum

This will update automatically on new commits.

Copilot AI review requested due to automatic review settings March 25, 2026 18:05
@ulixius9 ulixius9 self-assigned this Mar 25, 2026
@github-actions github-actions bot added the Ingestion and safe to test labels Mar 25, 2026
Contributor

Copilot AI left a comment

Pull request overview

Adds schema and backend secrets-conversion support for connecting to Matillion Data Productivity Cloud (DPC) alongside the existing Matillion ETL connection type.

Changes:

  • Extend MatillionConnection to allow a new MatillionDPC auth config and introduce lineageLookbackDays.
  • Add a new JSON schema defining the Matillion DPC authentication configuration (OAuth2 client credentials or PAT).
  • Update the backend MatillionConnectionClassConverter to convert/mask/unmask secrets for the new DPC auth type.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File | Description
openmetadata-spec/src/main/resources/json/schema/entity/services/connections/pipeline/matillionConnection.json | Adds MatillionDPC as a supported connection option and introduces lineageLookbackDays for DPC OpenLineage lookback.
openmetadata-spec/src/main/resources/json/schema/entity/services/connections/pipeline/matillion/matillionDPC.json | New schema defining Matillion DPC auth fields (client credentials, PAT, region).
openmetadata-service/src/main/java/org/openmetadata/service/secrets/converter/MatillionConnectionClassConverter.java | Ensures secrets conversion supports both MatillionETLAuth and the new MatillionDPCAuth.

"format": "password"
}
},
"required": [],
Copilot AI Mar 25, 2026

MatillionDPC auth config currently has an empty required list, which means an empty object (or one with only region) can pass schema validation. This will allow invalid connections to be saved and then fail later at runtime. Consider enforcing that either personalAccessToken is provided, or both clientId and clientSecret are provided (and ideally disallow providing both auth methods at once).

Suggested change
- "required": [],
+ "required": [],
+ "anyOf": [
+   {
+     "required": [
+       "personalAccessToken"
+     ]
+   },
+   {
+     "required": [
+       "clientId",
+       "clientSecret"
+     ]
+   }
+ ],
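
To make the difference between the suggested `anyOf` and a stricter exactly-one rule concrete, here is a minimal Python sketch. It is not the project's validator: `branch_matches`, `satisfies_any_of`, and `satisfies_exactly_one` are hypothetical helpers that only imitate how JSON Schema `required` branches combine, using the field names from the comment above.

```python
# Hypothetical helpers; they imitate JSON Schema `required` branches only
# and ignore every other keyword (types, formats, additionalProperties).
PAT_BRANCH = ("personalAccessToken",)
OAUTH_BRANCH = ("clientId", "clientSecret")


def branch_matches(config: dict, branch: tuple) -> bool:
    """A `required` branch matches when every listed key is present."""
    return all(key in config for key in branch)


def satisfies_any_of(config: dict) -> bool:
    """anyOf: at least one branch matches, so supplying both is accepted."""
    return branch_matches(config, PAT_BRANCH) or branch_matches(config, OAUTH_BRANCH)


def satisfies_exactly_one(config: dict) -> bool:
    """oneOf-style rule: exactly one branch matches, so supplying both is rejected."""
    return branch_matches(config, PAT_BRANCH) != branch_matches(config, OAUTH_BRANCH)
```

With these helpers, a config that carries only a region fails both checks, while a config that carries a PAT together with a full client ID and secret passes `satisfies_any_of` but fails `satisfies_exactly_one`. That is the gap the comment points at when it says both auth methods should ideally not be provided at once.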

Copilot uses AI. Check for mistakes.
Comment on lines 28 to +49
@@ -36,10 +39,18 @@
  "$ref": "../../../../type/filterPattern.json#/definitions/filterPattern",
  "title": "Default Pipeline Filter Pattern"
},
"lineageLookbackDays": {
  "title": "Lineage Lookback Days",
  "description": "Number of days to look back when fetching lineage events from Matillion DPC OpenLineage API.",
  "type": "integer",
  "default": 30,
  "minimum": 1,
  "maximum": 365
},
Copilot AI Mar 25, 2026

New schema fields (MatillionDPC auth option and lineageLookbackDays) aren’t covered by the existing configuration parsing tests for Matillion. Adding a unit test that parses a valid DPC config (PAT and/or client credentials) and asserts validation failures for missing credentials / out-of-range lineageLookbackDays would prevent regressions in schema-to-model generation and validation behavior.
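
As a rough illustration of the kind of test this comment asks for, the sketch below covers only the lineageLookbackDays bounds. It does not use the generated OpenMetadata models: `validate_lookback_days` is a hypothetical stand-in that mirrors the schema's `default`, `minimum`, and `maximum`, so the real test would call the project's own config parsing instead.

```python
import unittest

DEFAULT_LOOKBACK_DAYS = 30  # the schema's "default"
MIN_LOOKBACK_DAYS = 1       # the schema's "minimum"
MAX_LOOKBACK_DAYS = 365     # the schema's "maximum"


def validate_lookback_days(value=None) -> int:
    """Hypothetical stand-in for the generated model's validation."""
    if value is None:
        return DEFAULT_LOOKBACK_DAYS
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("lineageLookbackDays must be an integer")
    if not MIN_LOOKBACK_DAYS <= value <= MAX_LOOKBACK_DAYS:
        raise ValueError("lineageLookbackDays must be between 1 and 365")
    return value


class LookbackDaysTest(unittest.TestCase):
    def test_default_applies_when_omitted(self):
        self.assertEqual(validate_lookback_days(), 30)

    def test_bounds_are_inclusive(self):
        self.assertEqual(validate_lookback_days(1), 1)
        self.assertEqual(validate_lookback_days(365), 365)

    def test_out_of_range_values_fail(self):
        for bad in (0, 366, -5):
            with self.subTest(value=bad):
                with self.assertRaises(ValueError):
                    validate_lookback_days(bad)
```

Testing both inclusive bounds and the first value outside each bound is what catches an off-by-one change in the schema or in the generated model.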

@github-actions
Contributor

✅ TypeScript Types Auto-Updated

The generated TypeScript types have been automatically updated based on JSON schema changes in this PR.

@github-actions github-actions bot requested a review from a team as a code owner March 25, 2026 18:11
@github-actions
Contributor

github-actions bot commented Mar 25, 2026

🟡 Playwright Results — all passed (8 flaky)

✅ 2799 passed · ❌ 0 failed · 🟡 8 flaky · ⏭️ 189 skipped

Shard | Passed | Failed | Flaky | Skipped
🟡 Shard 1 | 452 | 0 | 3 | 2
🟡 Shard 2 | 602 | 0 | 2 | 32
🟡 Shard 4 | 601 | 0 | 2 | 47
✅ Shard 5 | 587 | 0 | 0 | 67
🟡 Shard 6 | 557 | 0 | 1 | 41

🟡 8 flaky test(s) (passed on retry)
  • Features/CustomizeDetailPage.spec.ts › Database Schema - customization should work (shard 1, 1 retry)
  • Pages/AuditLogs.spec.ts › should apply both User and EntityType filters simultaneously (shard 1, 1 retry)
  • Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/Glossary/GlossaryWorkflow.spec.ts › should display correct status badge color and icon (shard 2, 1 retry)
  • Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Tag Add, Update and Remove (shard 4, 1 retry)
  • Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

harshsoni2024 previously approved these changes Mar 26, 2026
Copilot AI review requested due to automatic review settings March 26, 2026 06:06
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 15 changed files in this pull request and generated 1 comment.

Comment on lines +35 to +36
matillionConnection.getConnection(),
List.of(MatillionETLAuth.class, MatillionDPCAuth.class))
Copilot AI Mar 26, 2026

The indentation in this multi-line tryToConvertOrFail(...) call doesn’t match the project’s standard Java formatting (and is likely to be rewritten by Spotless / google-java-format). Please reformat this block (e.g., by running mvn spotless:apply) so the continuation indentation is consistent and CI formatting checks don’t fail.

Suggested change
matillionConnection.getConnection(),
List.of(MatillionETLAuth.class, MatillionDPCAuth.class))
matillionConnection.getConnection(),
List.of(MatillionETLAuth.class, MatillionDPCAuth.class))

@github-actions
Contributor

github-actions bot commented Mar 26, 2026

OpenMetadata Service New-Code Coverage

PASS. Required changed-line coverage: 90.00% overall and per touched production file.

  • Overall executable changed lines: 3/3 covered (100.00%)
  • Missed executable changed lines: 0
  • Non-executable changed lines ignored by JaCoCo: 1
  • Changed production files: 1

File | Covered | Missed | Executable | Non-exec | Coverage | Uncovered lines
openmetadata-service/src/main/java/org/openmetadata/service/secrets/converter/MatillionConnectionClassConverter.java | 3 | 0 | 3 | 1 | 100.00% | -

Only changed executable lines under openmetadata-service/src/main/java are counted. Test files, comments, imports, and non-executable lines are excluded.

@github-actions
Contributor

github-actions bot commented Mar 26, 2026

Jest test Coverage

UI tests summary

Lines | Statements | Branches | Functions
Coverage: 64% | 64.83% (58135/89664) | 44.66% (30725/68793) | 47.67% (9200/19299)

Comment on lines +42 to +49
"lineageLookbackDays": {
  "title": "Lineage Lookback Days",
  "description": "Number of days to look back when fetching lineage events from Matillion DPC OpenLineage API.",
  "type": "integer",
  "default": 30,
  "minimum": 1,
  "maximum": 365
},

💡 Quality: lineageLookbackDays is DPC-specific but on shared connection

The lineageLookbackDays property is placed at the top-level matillionConnection.json schema, meaning it will be visible for both ETL and DPC connection types. However, its description explicitly says it's for the "Matillion DPC OpenLineage API." If ETL connections don't use lineage lookback, this could confuse users configuring an ETL connection. Consider either:

  1. Moving it inside the matillionDPC.json schema, or
  2. Updating the description to be generic if it applies to both types.

Suggested fix:

Either move lineageLookbackDays into matillionDPC.json, or update the description to not mention DPC specifically if it applies to both connection types.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 18 changed files in this pull request and generated 1 comment.

Copilot AI review requested due to automatic review settings March 26, 2026 13:23
@gitar-bot
gitar-bot bot commented Mar 26, 2026

Code Review 👍 Approved with suggestions 1 resolved / 2 findings

Adds Matillion Data Cloud support with ingestion integration. Consider moving the DPC-specific lineageLookbackDays property from the shared connection to the service configuration to improve separation of concerns.

💡 Quality: lineageLookbackDays is DPC-specific but on shared connection

📄 openmetadata-spec/src/main/resources/json/schema/entity/services/connections/pipeline/matillionConnection.json:42-49

The lineageLookbackDays property is placed at the top-level matillionConnection.json schema, meaning it will be visible for both ETL and DPC connection types. However, its description explicitly says it's for the "Matillion DPC OpenLineage API." If ETL connections don't use lineage lookback, this could confuse users configuring an ETL connection. Consider either:

  1. Moving it inside the matillionDPC.json schema, or
  2. Updating the description to be generic if it applies to both types.
Suggested fix
Either move lineageLookbackDays into matillionDPC.json, or update the description to not mention DPC specifically if it applies to both connection types.
✅ 1 resolved
Edge Case: MatillionDPC schema has no required fields, allowing empty config

📄 openmetadata-spec/src/main/resources/json/schema/entity/services/connections/pipeline/matillion/matillionDPC.json:44
The matillionDPC.json schema defines "required": [], meaning a user can create a Matillion DPC connection with no credentials at all — no clientId/clientSecret and no personalAccessToken. This contrasts with the ETL schema which requires hostPort, username, and password.

Since DPC supports two auth methods (OAuth2 client credentials vs. personal access token), at minimum one set should be required. Without any validation, a user could save an empty DPC config and only discover the problem at runtime during ingestion.

Consider using a oneOf to enforce that either OAuth2 credentials or PAT are provided, or at minimum require the type field so the oneOf discriminator in the parent schema can reliably distinguish between ETL and DPC connections.
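
A small Python sketch of why the empty `required` list undermines the parent `oneOf`. This is an assumption-laden illustration, not the real validator: `matching_branches` is a hypothetical helper that looks only at `required` keys (it ignores `type` constraints, `additionalProperties`, and everything else), the branch names are labels chosen for this example, and the ETL required fields are the ones named in the finding above.

```python
ETL_REQUIRED = ("hostPort", "username", "password")  # as stated in the finding


def matching_branches(connection: dict, dpc_required: tuple) -> list:
    """Return the labels of the branches whose `required` keys are all present.

    Hypothetical helper: only `required` is modelled, nothing else.
    """
    branches = {"MatillionETL": ETL_REQUIRED, "MatillionDPC": dpc_required}
    return [
        name
        for name, required in branches.items()
        if all(key in connection for key in required)
    ]


# Placeholder values; none of these are real hosts or credentials.
incomplete_etl = {"hostPort": "placeholder-host"}
complete_etl = {"hostPort": "placeholder-host", "username": "u", "password": "p"}
```

Under these assumptions, `matching_branches(incomplete_etl, dpc_required=())` reports only the DPC branch, so an incomplete ETL config would be read as a DPC one, and `matching_branches(complete_etl, dpc_required=())` reports both branches, which a strict `oneOf` treats as ambiguous. Passing `dpc_required=("type",)` removes the DPC match in both cases.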

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 18 changed files in this pull request and generated 1 comment.

Comment on lines 406 to +412
  self.assertIn(
-     "We encountered an error parsing the configuration of your MatillionConnection.\nYou might need to review your config based on the original cause of this failure:\n\t - Missing parameter in ('connection', 'hostPort')\n\t - Missing parameter in ('connection', 'username')\n\t - Missing parameter in ('connection', 'password')",
+     "We encountered an error parsing the configuration of your MatillionConnection.\n"
+     "You might need to review your config based on the original cause of this failure:\n"
+     "\t - Missing parameter in ('connection', 'function-after[parse_name(), MatillionEtlAuthConfig]', 'hostPort')\n"
+     "\t - Missing parameter in ('connection', 'function-after[parse_name(), MatillionEtlAuthConfig]', 'username')\n"
+     "\t - Missing parameter in ('connection', 'function-after[parse_name(), MatillionEtlAuthConfig]', 'password')\n"
+     "\t - Invalid parameter value for ('connection', 'function-after[parse_name(), MatillionDpcAuthConfig]', 'type')",
Copilot AI Mar 26, 2026

This assertion depends on Pydantic's internal error loc formatting (e.g., function-after[parse_name(), ...]) and on the exact ordering of validation errors, which is brittle across Pydantic/model changes. Prefer asserting on smaller, stable substrings (e.g., that the message mentions MatillionConnection and that hostPort/username/password are missing) rather than the full, fully-qualified loc tuples.
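
One possible shape for the looser assertion, as a self-contained sketch. `SAMPLE_ERROR` is a stand-in built from the message quoted in the diff above; the real test would take the message from the raised exception. The class and test names are hypothetical.

```python
import unittest

# Stand-in for the parsed error message; copied from the diff above.
SAMPLE_ERROR = (
    "We encountered an error parsing the configuration of your MatillionConnection.\n"
    "You might need to review your config based on the original cause of this failure:\n"
    "\t - Missing parameter in ('connection', 'function-after[parse_name(), MatillionEtlAuthConfig]', 'hostPort')\n"
    "\t - Missing parameter in ('connection', 'function-after[parse_name(), MatillionEtlAuthConfig]', 'username')\n"
    "\t - Missing parameter in ('connection', 'function-after[parse_name(), MatillionEtlAuthConfig]', 'password')\n"
    "\t - Invalid parameter value for ('connection', 'function-after[parse_name(), MatillionDpcAuthConfig]', 'type')"
)


class StableErrorAssertions(unittest.TestCase):
    def test_mentions_connection_and_missing_fields(self):
        # Assert on the connection name, not on the full message.
        self.assertIn("MatillionConnection", SAMPLE_ERROR)
        for field in ("hostPort", "username", "password"):
            with self.subTest(field=field):
                # "." does not match newlines, so each field is checked on its
                # own line and intermediate loc entries are not pinned down.
                self.assertRegex(
                    SAMPLE_ERROR, rf"Missing parameter in \(.*'{field}'\)"
                )
```

The same pattern also matches the pre-change message, where the loc tuple was just `('connection', 'hostPort')`, so the test would not have needed editing when the union type was introduced.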


5 participants