ISSUE #26706 - Migrate Failed sample rows to OS #26780

Open
TeddyCr wants to merge 19 commits into open-metadata:main from TeddyCr:ISSUE-26706
TeddyCr (Collaborator) commented Mar 25, 2026

Describe your changes:

Fixes #26706

Migrate failed sample rows and the inspection query to OS.

Screen.Recording.2026-03-25.at.3.01.49.PM.mov

Type of change:

  • Bug fix
  • Improvement
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation

Checklist:

  • I have read the CONTRIBUTING document.
  • My PR title is Fixes <issue-number>: <short explanation>
  • I have commented on my code, particularly in hard-to-understand areas.
  • For JSON Schema changes: I updated the migration scripts or explained why it is not needed.

Improvement

  • I have added tests around the new logic.
  • For connector/ingestion changes: I updated the documentation.

TeddyCr requested review from a team as code owners March 25, 2026 22:11
Copilot AI review requested due to automatic review settings March 25, 2026 22:11
github-actions bot added the Ingestion and safe to test labels Mar 25, 2026
github-actions bot (Contributor) commented Mar 25, 2026

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion-base-slim:trivy (debian 12.13)

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (38)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
| --- | --- | --- | --- | --- |
| com.fasterxml.jackson.core:jackson-core | CVE-2025-52999 | 🚨 HIGH | 2.12.7 | 2.15.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.12.7 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-core | CVE-2025-52999 | 🚨 HIGH | 2.13.4 | 2.15.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.13.4 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.15.2 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-databind | CVE-2022-42003 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4.2 |
| com.fasterxml.jackson.core:jackson-databind | CVE-2022-42004 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4 |
| com.google.code.gson:gson | CVE-2022-25647 | 🚨 HIGH | 2.2.4 | 2.8.9 |
| com.google.protobuf:protobuf-java | CVE-2021-22569 | 🚨 HIGH | 3.3.0 | 3.16.1, 3.18.2, 3.19.2 |
| com.google.protobuf:protobuf-java | CVE-2022-3509 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2022-3510 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2024-7254 | 🚨 HIGH | 3.3.0 | 3.25.5, 4.27.5, 4.28.2 |
| com.google.protobuf:protobuf-java | CVE-2021-22569 | 🚨 HIGH | 3.7.1 | 3.16.1, 3.18.2, 3.19.2 |
| com.google.protobuf:protobuf-java | CVE-2022-3509 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2022-3510 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2024-7254 | 🚨 HIGH | 3.7.1 | 3.25.5, 4.27.5, 4.28.2 |
| com.nimbusds:nimbus-jose-jwt | CVE-2023-52428 | 🚨 HIGH | 9.8.1 | 9.37.2 |
| com.squareup.okhttp3:okhttp | CVE-2021-0341 | 🚨 HIGH | 3.12.12 | 4.9.2 |
| commons-beanutils:commons-beanutils | CVE-2025-48734 | 🚨 HIGH | 1.9.4 | 1.11.0 |
| commons-io:commons-io | CVE-2024-47554 | 🚨 HIGH | 2.8.0 | 2.14.0 |
| dnsjava:dnsjava | CVE-2024-25638 | 🚨 HIGH | 2.1.7 | 3.6.0 |
| io.airlift:aircompressor | CVE-2025-67721 | 🚨 HIGH | 0.27 | 2.0.3 |
| io.netty:netty-codec-http2 | CVE-2025-55163 | 🚨 HIGH | 4.1.96.Final | 4.2.4.Final, 4.1.124.Final |
| io.netty:netty-codec-http2 | GHSA-xpw8-rcwv-8f8p | 🚨 HIGH | 4.1.96.Final | 4.1.100.Final |
| io.netty:netty-handler | CVE-2025-24970 | 🚨 HIGH | 4.1.96.Final | 4.1.118.Final |
| net.minidev:json-smart | CVE-2021-31684 | 🚨 HIGH | 1.3.2 | 1.3.3, 2.4.4 |
| net.minidev:json-smart | CVE-2023-1370 | 🚨 HIGH | 1.3.2 | 2.4.9 |
| org.apache.avro:avro | CVE-2024-47561 | 🔥 CRITICAL | 1.7.7 | 1.11.4 |
| org.apache.avro:avro | CVE-2023-39410 | 🚨 HIGH | 1.7.7 | 1.11.3 |
| org.apache.derby:derby | CVE-2022-46337 | 🔥 CRITICAL | 10.14.2.0 | 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0 |
| org.apache.ivy:ivy | CVE-2022-46751 | 🚨 HIGH | 2.5.1 | 2.5.2 |
| org.apache.mesos:mesos | CVE-2018-1330 | 🚨 HIGH | 1.4.3 | 1.6.0 |
| org.apache.spark:spark-core_2.12 | CVE-2025-54920 | 🚨 HIGH | 3.5.6 | 3.5.7 |
| org.apache.thrift:libthrift | CVE-2019-0205 | 🚨 HIGH | 0.12.0 | 0.13.0 |
| org.apache.thrift:libthrift | CVE-2020-13949 | 🚨 HIGH | 0.12.0 | 0.14.0 |
| org.apache.zookeeper:zookeeper | CVE-2023-44981 | 🔥 CRITICAL | 3.6.3 | 3.7.2, 3.8.3, 3.9.1 |
| org.eclipse.jetty:jetty-server | CVE-2024-13009 | 🚨 HIGH | 9.4.56.v20240826 | 9.4.57.v20241219 |
| org.lz4:lz4-java | CVE-2025-12183 | 🚨 HIGH | 1.8.0 | 1.8.1 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (15)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
| --- | --- | --- | --- | --- |
| apache-airflow | CVE-2025-68438 | 🚨 HIGH | 3.1.5 | 3.1.6 |
| apache-airflow | CVE-2025-68675 | 🚨 HIGH | 3.1.5 | 3.1.6, 2.11.1 |
| apache-airflow | CVE-2026-26929 | 🚨 HIGH | 3.1.5 | 3.1.8 |
| apache-airflow | CVE-2026-28779 | 🚨 HIGH | 3.1.5 | 3.1.8 |
| apache-airflow | CVE-2026-30911 | 🚨 HIGH | 3.1.5 | 3.1.8 |
| cryptography | CVE-2026-26007 | 🚨 HIGH | 42.0.8 | 46.0.5 |
| jaraco.context | CVE-2026-23949 | 🚨 HIGH | 5.3.0 | 6.1.0 |
| jaraco.context | CVE-2026-23949 | 🚨 HIGH | 6.0.1 | 6.1.0 |
| pyOpenSSL | CVE-2026-27459 | 🚨 HIGH | 24.1.0 | 26.0.0 |
| starlette | CVE-2025-62727 | 🚨 HIGH | 0.48.0 | 0.49.1 |
| urllib3 | CVE-2025-66418 | 🚨 HIGH | 1.26.20 | 2.6.0 |
| urllib3 | CVE-2025-66471 | 🚨 HIGH | 1.26.20 | 2.6.0 |
| urllib3 | CVE-2026-21441 | 🚨 HIGH | 1.26.20 | 2.6.3 |
| wheel | CVE-2026-24049 | 🚨 HIGH | 0.45.1 | 0.46.2 |
| wheel | CVE-2026-24049 | 🚨 HIGH | 0.45.1 | 0.46.2 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/extended_sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/lineage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_data_aut.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.json

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage.yaml

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /ingestion/pipelines/sample_usage_aut.yaml

No Vulnerabilities Found

Copilot AI (Contributor) left a comment

Pull request overview

Migrates failed-row sample data and inspection SQL query support into the OSS ingestion and UI incident manager flow, including persistence via the REST sink and new UI surfaces to view and manage this data.

Changes:

  • Ingestion: introduce mixins to collect failed-row samples/inspection query during validation and ingest them via the REST sink; extend result model to carry this data.
  • UI: add TestCase “SQL Query” tab and “Failed sample data” panel with view/delete actions; add REST helpers and i18n keys.
  • Tests: add Playwright coverage for the UI sample-data behavior and Python unit/integration tests for sampling + ingestion.
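
The ingestion-side flow summarized above could look roughly like the sketch below. It is an illustration under stated assumptions, not the PR's code: `ResultResponse`, `FailedSampleMixin`, `fetch_failed_rows`, `sample_limit`, and the `"Failed"` status string are hypothetical stand-ins, and only the field names `failedRowsSample` and `inspectionQuery` are taken from this review.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class ResultResponse:
    """Hypothetical stand-in for the extended result model."""

    status: str
    failedRowsSample: Optional[List[List[Any]]] = None
    inspectionQuery: Optional[str] = None


class FailedSampleMixin:
    """Hypothetical mixin: attach a sample only when the test failed."""

    sample_limit = 50  # assumed cap, not taken from the PR

    def fetch_failed_rows(self) -> List[List[Any]]:
        raise NotImplementedError

    def inspection_query(self) -> Optional[str]:
        return None

    def with_failed_sample(self, result: ResultResponse) -> ResultResponse:
        if result.status != "Failed":
            return result  # passing tests carry no sample
        result.failedRowsSample = self.fetch_failed_rows()[: self.sample_limit]
        result.inspectionQuery = self.inspection_query()
        return result


class NotNullValidator(FailedSampleMixin):
    """Toy validator over in-memory rows, for illustration only."""

    def __init__(self, rows: List[List[Any]]):
        self.rows = rows

    def fetch_failed_rows(self) -> List[List[Any]]:
        # A row fails when its second column is null.
        return [row for row in self.rows if row[1] is None]

    def inspection_query(self) -> Optional[str]:
        return "SELECT * FROM t WHERE name IS NULL"
```

Keeping the sampling behind one hook means a sink only has to check whether `failedRowsSample` is set before publishing it.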

Reviewed changes

Copilot reviewed 38 out of 38 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| openmetadata-ui/src/main/resources/ui/src/rest/testAPI.ts | Adds REST helpers to GET/DELETE failed-rows sample data for a test case. |
| openmetadata-ui/src/main/resources/ui/src/pages/IncidentManager/IncidentManagerDetailPage/TestCaseClassBase.ts | Adds a conditional “SQL Query” tab and requests inspectionQuery in test case fetch fields. |
| openmetadata-ui/src/main/resources/ui/src/locale/languages/en-us.json | Adds new i18n strings for SQL/query/sample-data actions. |
| openmetadata-ui/src/main/resources/ui/src/components/DataQuality/IncidentManager/TestCaseResultTab/TestCaseResultTabClassBase.ts | Registers the failed-sample-data panel as an additional component in the result tab. |
| openmetadata-ui/src/main/resources/ui/src/components/DataQuality/IncidentManager/SqlQueryTab/SqlQueryTab.component.tsx | New SQL Query tab showing inspection query and version diffs; opens “Add to Table” modal. |
| openmetadata-ui/src/main/resources/ui/src/components/DataQuality/IncidentManager/SqlQueryTab/AddSqlQueryFormModal/AddSqlQueryFormModal.interface.ts | Props interface for the “Add SQL Query” modal. |
| openmetadata-ui/src/main/resources/ui/src/components/DataQuality/IncidentManager/SqlQueryTab/AddSqlQueryFormModal/AddSqlQueryFormModal.component.tsx | Modal to create a Query entity from the inspection query and link it to the table. |
| openmetadata-ui/src/main/resources/ui/src/components/DataQuality/IncidentManager/FailedTestCaseSampleData/failed-test-case-sample-data.less | Styling for highlighting failed-column cells and diff row types. |
| openmetadata-ui/src/main/resources/ui/src/components/DataQuality/IncidentManager/FailedTestCaseSampleData/FailedTestCaseSampleData.interface.ts | Props interface for failed-sample-data panel. |
| openmetadata-ui/src/main/resources/ui/src/components/DataQuality/IncidentManager/FailedTestCaseSampleData/FailedTestCaseSampleData.component.tsx | UI to display failed-row samples, link to SQL tab, and delete samples. |
| openmetadata-ui/src/main/resources/ui/playwright/e2e/Features/FailedTestCaseSampleData.spec.ts | E2E test for highlighting failed sample data and deleting it. |
| ingestion/tests/unit/data_quality/validations/test_failed_sample_mixin.py | Unit tests for new sampling mixins and orchestration logic. |
| ingestion/tests/integration/data_quality/test_failed_row_samples.py | Integration tests asserting failed-row samples are published for failing tests only. |
| ingestion/src/metadata/ingestion/sink/metadata_rest.py | Ingests failed-row samples + inspection query after writing test results. |
| ingestion/src/metadata/ingestion/ometa/mixins/tests_mixin.py | Adds client method to retrieve failed-row samples from the API. |
| ingestion/src/metadata/data_quality/validations/table/sqlalchemy/tableCustomSQLQuery.py | Captures inspection query + failed-row sample for custom SQL test validator. |
| ingestion/src/metadata/data_quality/validations/mixins/failed_sample_validator_mixin.py | New mixin to attach failed rows sample + inspection query to result response. |
| ingestion/src/metadata/data_quality/validations/mixins/failed_row_sampler_mixin.py | New SQLAlchemy/Pandas samplers for fetching failed-row samples. |
| ingestion/src/metadata/data_quality/validations/column/sqlalchemy/columnValuesToNotMatchRegex.py | Adds failed-row sampling support for this validator. |
| ingestion/src/metadata/data_quality/validations/column/sqlalchemy/columnValuesToMatchRegex.py | Adds failed-row sampling support for this validator. |
| ingestion/src/metadata/data_quality/validations/column/sqlalchemy/columnValuesToBeUnique.py | Adds failed-row sampling support for this validator. |
| ingestion/src/metadata/data_quality/validations/column/sqlalchemy/columnValuesToBeNotNull.py | Adds failed-row sampling support for this validator. |
| ingestion/src/metadata/data_quality/validations/column/sqlalchemy/columnValuesToBeNotInSet.py | Adds failed-row sampling support for this validator. |
| ingestion/src/metadata/data_quality/validations/column/sqlalchemy/columnValuesToBeInSet.py | Adds failed-row sampling support for this validator. |
| ingestion/src/metadata/data_quality/validations/column/sqlalchemy/columnValuesToBeBetween.py | Adds failed-row sampling support for this validator (incl. datetime handling). |
| ingestion/src/metadata/data_quality/validations/column/sqlalchemy/columnValueLengthsToBeBetween.py | Adds failed-row sampling support for this validator. |
| ingestion/src/metadata/data_quality/validations/column/pandas/columnValuesToNotMatchRegex.py | Adds failed-row sampling support for this validator. |
| ingestion/src/metadata/data_quality/validations/column/pandas/columnValuesToMatchRegex.py | Adds failed-row sampling support for this validator. |
| ingestion/src/metadata/data_quality/validations/column/pandas/columnValuesToBeUnique.py | Adds failed-row sampling support for this validator. |
| ingestion/src/metadata/data_quality/validations/column/pandas/columnValuesToBeNotNull.py | Adds failed-row sampling support for this validator. |
| ingestion/src/metadata/data_quality/validations/column/pandas/columnValuesToBeNotInSet.py | Adds failed-row sampling support for this validator. |
| ingestion/src/metadata/data_quality/validations/column/pandas/columnValuesToBeInSet.py | Adds failed-row sampling support for this validator. |
| ingestion/src/metadata/data_quality/validations/column/pandas/columnValuesToBeBetween.py | Adds failed-row sampling support for this validator (incl. datetime handling). |
| ingestion/src/metadata/data_quality/validations/column/pandas/columnValueLengthsToBeBetween.py | Adds failed-row sampling support for this validator. |
| ingestion/src/metadata/data_quality/validations/base_test_handler.py | Changes validator return type to TestCaseResultResponse and adds sampling hook. |
| ingestion/src/metadata/data_quality/runner/core.py | Propagates TestCaseResultResponse from interface/validator without re-wrapping. |
| ingestion/src/metadata/data_quality/interface/test_suite_interface.py | Updates interface to return TestCaseResultResponse and wraps aborted results. |
| ingestion/src/metadata/data_quality/api/models.py | Extends TestCaseResultResponse with failedRowsSample, inspectionQuery, validateColumns. |
Comments suppressed due to low confidence (1)

ingestion/src/metadata/data_quality/validations/base_test_handler.py:163

  • run_validation() is annotated to return TestCaseResultResponse, but when are_dimension_columns_valid() is false it returns test_result (a TestCaseResult). This will break callers that expect .testCase/.testCaseResult on the response (e.g., the runner/sink). Wrap and return a TestCaseResultResponse consistently for all branches, and update the docstring return type accordingly.
    def run_validation(self) -> TestCaseResultResponse:
        """Template method defining the validation flow with optional dimensional analysis

        This method orchestrates the overall validation process:
        1. Execute the main validation logic (overall results)
        2. Add dimensional results if configured

        Child classes can override this method to provide custom validation logic.
        If not overridden, this template method provides the default dimensional behavior.

        Returns:
            TestCaseResult: The test case result with optional dimensional results
        """
        # Execute the main validation logic (overall results)
        test_result = self._run_validation()

        # Add dimensional results if configured
        if self.is_dimensional_test():
            logger.debug(
                f"Executing dimensional validation for test case: {self.test_case.fullyQualifiedName}"
            )
            logger.debug(f"Dimension columns: {self.test_case.dimensionColumns}")

            if not self.are_dimension_columns_valid():
                return test_result
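
One way to address the comment above is to route every exit path through a single wrapping helper, so the early return carries the same type as the normal one. The sketch below is a simplified illustration: the two dataclasses are cut-down stand-ins for the real models, and `Handler`, `_wrap`, and the constructor flags are hypothetical names, not code from the PR.

```python
from dataclasses import dataclass


@dataclass
class TestCaseResult:
    """Simplified stand-in for the real result model."""

    status: str


@dataclass
class TestCaseResultResponse:
    """Simplified stand-in exposing .testCase and .testCaseResult."""

    testCase: str
    testCaseResult: TestCaseResult


class Handler:
    """Hypothetical handler with the two branches from the excerpt."""

    def __init__(self, test_case: str, dimensional: bool, columns_valid: bool):
        self.test_case = test_case
        self.dimensional = dimensional
        self.columns_valid = columns_valid

    def _run_validation(self) -> TestCaseResult:
        return TestCaseResult(status="Failed")

    def _wrap(self, result: TestCaseResult) -> TestCaseResultResponse:
        return TestCaseResultResponse(testCase=self.test_case, testCaseResult=result)

    def run_validation(self) -> TestCaseResultResponse:
        test_result = self._run_validation()
        if self.dimensional and not self.columns_valid:
            # Early exit still returns the response type, never the bare result.
            return self._wrap(test_result)
        # Dimensional results would be added here in the real flow.
        return self._wrap(test_result)
```

With this shape, callers such as a runner or sink can rely on `.testCase` and `.testCaseResult` being present on every return value.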

'not-equal-sample-data': type === DIFF_TYPE_VALUES.NOT_EQUAL,
});
}}
rowKey="name"
Copilot AI Mar 25, 2026


The table uses rowKey="name", but the rows are constructed as a Record<column, value> from TableData.columns and may not include a name property. This can lead to non-unique/undefined row keys and unstable rendering. Use a guaranteed key (e.g., add an index/id field when building rows, or use a rowKey function based on row index or a composite of column values).

Suggested change
- rowKey="name"
+ rowKey={(_record, index) => index?.toString() ?? ''}

github-actions bot (Contributor) commented Mar 25, 2026

🛡️ TRIVY SCAN RESULT 🛡️

Target: openmetadata-ingestion:trivy (debian 12.12)

Vulnerabilities (4)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
| --- | --- | --- | --- | --- |
| libpam-modules | CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |
| libpam-modules-bin | CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |
| libpam-runtime | CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |
| libpam0g | CVE-2025-6020 | 🚨 HIGH | 1.5.2-6+deb12u1 | 1.5.2-6+deb12u2 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: Java

Vulnerabilities (39)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
| --- | --- | --- | --- | --- |
| com.fasterxml.jackson.core:jackson-core | CVE-2025-52999 | 🚨 HIGH | 2.12.7 | 2.15.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.12.7 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-core | CVE-2025-52999 | 🚨 HIGH | 2.13.4 | 2.15.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.13.4 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.15.2 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-core | GHSA-72hv-8253-57qq | 🚨 HIGH | 2.16.1 | 2.18.6, 2.21.1, 3.1.0 |
| com.fasterxml.jackson.core:jackson-databind | CVE-2022-42003 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4.2 |
| com.fasterxml.jackson.core:jackson-databind | CVE-2022-42004 | 🚨 HIGH | 2.12.7 | 2.12.7.1, 2.13.4 |
| com.google.code.gson:gson | CVE-2022-25647 | 🚨 HIGH | 2.2.4 | 2.8.9 |
| com.google.protobuf:protobuf-java | CVE-2021-22569 | 🚨 HIGH | 3.3.0 | 3.16.1, 3.18.2, 3.19.2 |
| com.google.protobuf:protobuf-java | CVE-2022-3509 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2022-3510 | 🚨 HIGH | 3.3.0 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2024-7254 | 🚨 HIGH | 3.3.0 | 3.25.5, 4.27.5, 4.28.2 |
| com.google.protobuf:protobuf-java | CVE-2021-22569 | 🚨 HIGH | 3.7.1 | 3.16.1, 3.18.2, 3.19.2 |
| com.google.protobuf:protobuf-java | CVE-2022-3509 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2022-3510 | 🚨 HIGH | 3.7.1 | 3.16.3, 3.19.6, 3.20.3, 3.21.7 |
| com.google.protobuf:protobuf-java | CVE-2024-7254 | 🚨 HIGH | 3.7.1 | 3.25.5, 4.27.5, 4.28.2 |
| com.nimbusds:nimbus-jose-jwt | CVE-2023-52428 | 🚨 HIGH | 9.8.1 | 9.37.2 |
| com.squareup.okhttp3:okhttp | CVE-2021-0341 | 🚨 HIGH | 3.12.12 | 4.9.2 |
| commons-beanutils:commons-beanutils | CVE-2025-48734 | 🚨 HIGH | 1.9.4 | 1.11.0 |
| commons-io:commons-io | CVE-2024-47554 | 🚨 HIGH | 2.8.0 | 2.14.0 |
| dnsjava:dnsjava | CVE-2024-25638 | 🚨 HIGH | 2.1.7 | 3.6.0 |
| io.airlift:aircompressor | CVE-2025-67721 | 🚨 HIGH | 0.27 | 2.0.3 |
| io.netty:netty-codec-http2 | CVE-2025-55163 | 🚨 HIGH | 4.1.96.Final | 4.2.4.Final, 4.1.124.Final |
| io.netty:netty-codec-http2 | GHSA-xpw8-rcwv-8f8p | 🚨 HIGH | 4.1.96.Final | 4.1.100.Final |
| io.netty:netty-handler | CVE-2025-24970 | 🚨 HIGH | 4.1.96.Final | 4.1.118.Final |
| net.minidev:json-smart | CVE-2021-31684 | 🚨 HIGH | 1.3.2 | 1.3.3, 2.4.4 |
| net.minidev:json-smart | CVE-2023-1370 | 🚨 HIGH | 1.3.2 | 2.4.9 |
| org.apache.avro:avro | CVE-2024-47561 | 🔥 CRITICAL | 1.7.7 | 1.11.4 |
| org.apache.avro:avro | CVE-2023-39410 | 🚨 HIGH | 1.7.7 | 1.11.3 |
| org.apache.derby:derby | CVE-2022-46337 | 🔥 CRITICAL | 10.14.2.0 | 10.14.3, 10.15.2.1, 10.16.1.2, 10.17.1.0 |
| org.apache.ivy:ivy | CVE-2022-46751 | 🚨 HIGH | 2.5.1 | 2.5.2 |
| org.apache.mesos:mesos | CVE-2018-1330 | 🚨 HIGH | 1.4.3 | 1.6.0 |
| org.apache.spark:spark-core_2.12 | CVE-2025-54920 | 🚨 HIGH | 3.5.6 | 3.5.7 |
| org.apache.thrift:libthrift | CVE-2019-0205 | 🚨 HIGH | 0.12.0 | 0.13.0 |
| org.apache.thrift:libthrift | CVE-2020-13949 | 🚨 HIGH | 0.12.0 | 0.14.0 |
| org.apache.zookeeper:zookeeper | CVE-2023-44981 | 🔥 CRITICAL | 3.6.3 | 3.7.2, 3.8.3, 3.9.1 |
| org.eclipse.jetty:jetty-server | CVE-2024-13009 | 🚨 HIGH | 9.4.56.v20240826 | 9.4.57.v20241219 |
| org.lz4:lz4-java | CVE-2025-12183 | 🚨 HIGH | 1.8.0 | 1.8.1 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: Node.js

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: Python

Vulnerabilities (33)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
| --- | --- | --- | --- | --- |
| Authlib | CVE-2026-27962 | 🔥 CRITICAL | 1.6.6 | 1.6.9 |
| Authlib | CVE-2026-28490 | 🚨 HIGH | 1.6.6 | 1.6.9 |
| Authlib | CVE-2026-28498 | 🚨 HIGH | 1.6.6 | 1.6.9 |
| Authlib | CVE-2026-28802 | 🚨 HIGH | 1.6.6 | 1.6.7 |
| PyJWT | CVE-2026-32597 | 🚨 HIGH | 2.10.1 | 2.12.0 |
| Werkzeug | CVE-2024-34069 | 🚨 HIGH | 2.2.3 | 3.0.3 |
| aiohttp | CVE-2025-69223 | 🚨 HIGH | 3.12.12 | 3.13.3 |
| aiohttp | CVE-2025-69223 | 🚨 HIGH | 3.13.2 | 3.13.3 |
| apache-airflow | CVE-2025-68438 | 🚨 HIGH | 3.1.5 | 3.1.6 |
| apache-airflow | CVE-2025-68675 | 🚨 HIGH | 3.1.5 | 3.1.6, 2.11.1 |
| apache-airflow | CVE-2026-26929 | 🚨 HIGH | 3.1.5 | 3.1.8 |
| apache-airflow | CVE-2026-28779 | 🚨 HIGH | 3.1.5 | 3.1.8 |
| apache-airflow | CVE-2026-30911 | 🚨 HIGH | 3.1.5 | 3.1.8 |
| apache-airflow-providers-http | CVE-2025-69219 | 🚨 HIGH | 5.6.0 | 6.0.0 |
| azure-core | CVE-2026-21226 | 🚨 HIGH | 1.37.0 | 1.38.0 |
| cryptography | CVE-2026-26007 | 🚨 HIGH | 42.0.8 | 46.0.5 |
| google-cloud-aiplatform | CVE-2026-2472 | 🚨 HIGH | 1.130.0 | 1.131.0 |
| google-cloud-aiplatform | CVE-2026-2473 | 🚨 HIGH | 1.130.0 | 1.133.0 |
| jaraco.context | CVE-2026-23949 | 🚨 HIGH | 5.3.0 | 6.1.0 |
| jaraco.context | CVE-2026-23949 | 🚨 HIGH | 6.0.1 | 6.1.0 |
| protobuf | CVE-2026-0994 | 🚨 HIGH | 4.25.8 | 6.33.5, 5.29.6 |
| pyOpenSSL | CVE-2026-27459 | 🚨 HIGH | 24.1.0 | 26.0.0 |
| pyasn1 | CVE-2026-23490 | 🚨 HIGH | 0.6.1 | 0.6.2 |
| pyasn1 | CVE-2026-30922 | 🚨 HIGH | 0.6.1 | 0.6.3 |
| python-multipart | CVE-2026-24486 | 🚨 HIGH | 0.0.20 | 0.0.22 |
| ray | CVE-2025-62593 | 🔥 CRITICAL | 2.47.1 | 2.52.0 |
| starlette | CVE-2025-62727 | 🚨 HIGH | 0.48.0 | 0.49.1 |
| tornado | CVE-2026-31958 | 🚨 HIGH | 6.5.3 | 6.5.5 |
| urllib3 | CVE-2025-66418 | 🚨 HIGH | 1.26.20 | 2.6.0 |
| urllib3 | CVE-2025-66471 | 🚨 HIGH | 1.26.20 | 2.6.0 |
| urllib3 | CVE-2026-21441 | 🚨 HIGH | 1.26.20 | 2.6.3 |
| wheel | CVE-2026-24049 | 🚨 HIGH | 0.45.1 | 0.46.2 |
| wheel | CVE-2026-24049 | 🚨 HIGH | 0.45.1 | 0.46.2 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: usr/bin/docker

Vulnerabilities (4)

| Package | Vulnerability ID | Severity | Installed Version | Fixed Version |
| --- | --- | --- | --- | --- |
| stdlib | CVE-2025-68121 | 🔥 CRITICAL | v1.25.5 | 1.24.13, 1.25.7, 1.26.0-rc.3 |
| stdlib | CVE-2025-61726 | 🚨 HIGH | v1.25.5 | 1.24.12, 1.25.6 |
| stdlib | CVE-2025-61728 | 🚨 HIGH | v1.25.5 | 1.24.12, 1.25.6 |
| stdlib | CVE-2026-25679 | 🚨 HIGH | v1.25.5 | 1.25.8, 1.26.1 |

🛡️ TRIVY SCAN RESULT 🛡️

Target: /etc/ssl/private/ssl-cert-snakeoil.key

No Vulnerabilities Found

🛡️ TRIVY SCAN RESULT 🛡️

Target: /home/airflow/openmetadata-airflow-apis/openmetadata_managed_apis.egg-info/PKG-INFO

No Vulnerabilities Found

…row_sampler_mixin.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 25, 2026 23:03
github-actions bot (Contributor) commented Mar 26, 2026

🟡 Playwright Results — all passed (33 flaky)

✅ 3384 passed · ❌ 0 failed · 🟡 33 flaky · ⏭️ 216 skipped

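| Shard | Passed | Failed | Flaky | Skipped |
| --- | --- | --- | --- | --- |
| 🟡 Shard 1 | 447 | 0 | 8 | 2 |
| 🟡 Shard 2 | 601 | 0 | 4 | 32 |
| 🟡 Shard 3 | 602 | 0 | 7 | 27 |
| 🟡 Shard 4 | 596 | 0 | 7 | 47 |
| ✅ Shard 5 | 587 | 0 | 0 | 67 |
| 🟡 Shard 6 | 551 | 0 | 7 | 41 |
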
🟡 33 flaky test(s) (passed on retry)
  • Features/DataAssetRulesDisabled.spec.ts › Verify the Database entity item action after rules disabled (shard 1, 1 retry)
  • Features/DataAssetRulesDisabled.spec.ts › Verify the Database Schema entity item action after rules disabled (shard 1, 1 retry)
  • Features/CustomizeDetailPage.spec.ts › Database - customization should work (shard 1, 1 retry)
  • Features/CustomizeDetailPage.spec.ts › Data Product - customization should work (shard 1, 1 retry)
  • Features/NavigationBlocker.spec.ts › should stay on current page and keep changes when X button is clicked (shard 1, 1 retry)
  • Flow/Metric.spec.ts › Verify Related Metrics Update (shard 1, 1 retry)
  • Flow/Tour.spec.ts › Tour should work from URL directly (shard 1, 1 retry)
  • Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
  • Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
  • Features/DataQuality/TestCaseImportExportBasic.spec.ts › User with ViewAll on TEST_CASE resource can successfully export test cases (shard 2, 1 retry)
  • Features/DataQuality/TestCaseIncidentPermissions.spec.ts › User with only VIEW cannot PATCH incidents (shard 2, 1 retry)
  • Features/DataQuality/TestCaseResultPermissions.spec.ts › User with only VIEW cannot PATCH results (shard 2, 1 retry)
  • Features/Permissions/GlossaryPermissions.spec.ts › Team-based permissions work correctly (shard 3, 1 retry)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
  • Features/RTL.spec.ts › Verify Following widget functionality (shard 3, 1 retry)
  • Features/TestSuiteMultiPipeline.spec.ts › TestSuite multi pipeline support (shard 3, 1 retry)
  • Flow/ExploreDiscovery.spec.ts › Should display deleted assets when showDeleted is checked and deleted is not present in queryFilter (shard 3, 1 retry)
  • Flow/PersonaFlow.spec.ts › Set default persona for team should work properly (shard 3, 1 retry)
  • Pages/Customproperties-part2.spec.ts › Entity Reference List (shard 3, 1 retry)
  • Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
  • Pages/DataContracts.spec.ts › Create Data Contract and validate for Topic (shard 4, 1 retry)
  • Pages/DataContractsSemanticRules.spec.ts › Validate Description Rule Is_Set (shard 4, 1 retry)
  • Pages/DomainAdvanced.spec.ts › User with domain access can view subdomains (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Verify redirect path on data product delete (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Delete Container (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Certification Add Remove (shard 4, 1 retry)
  • Pages/ExploreTree.spec.ts › Verify Database and Database Schema available in explore tree (shard 6, 1 retry)
  • Pages/Glossary.spec.ts › Column dropdown drag-and-drop functionality for Glossary Terms table (shard 6, 2 retries)
  • Pages/Glossary.spec.ts › Create glossary, change language to Dutch, and delete glossary (shard 6, 1 retry)
  • Pages/Login.spec.ts › Refresh should work (shard 6, 1 retry)
  • ... and 3 more

📦 Download artifacts

How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

github-actions bot (Contributor) commented Mar 26, 2026

❌ Playwright Lint Check Failed — ESLint + Prettier + Organise Imports

The following files have style issues that need to be fixed:
playwright/e2e/Features/FailedTestCaseSampleData.spec.ts

Fix locally (fast — changed files only):

cd openmetadata-ui/src/main/resources/ui
yarn ui-checkstyle:playwright:changed

Or to fix all playwright files: yarn ui-checkstyle:playwright

github-actions bot (Contributor) commented Mar 26, 2026

❌ Lint Check Failed — ESLint + Prettier + Organise Imports (src)

The following files have style issues that need to be fixed:
src/components/DataQuality/IncidentManager/FailedTestCaseSampleData/FailedTestCaseSampleData.component.tsx
src/components/DataQuality/IncidentManager/FailedTestCaseSampleData/FailedTestCaseSampleData.interface.ts
src/components/DataQuality/IncidentManager/SqlQueryTab/AddSqlQueryFormModal/AddSqlQueryFormModal.component.tsx
src/components/DataQuality/IncidentManager/SqlQueryTab/AddSqlQueryFormModal/AddSqlQueryFormModal.interface.ts
src/components/DataQuality/IncidentManager/SqlQueryTab/SqlQueryTab.component.tsx
src/components/ActivityFeed/FeedEditor/FeedEditor.tsx
src/components/BlockEditor/Extensions/Callout/Callout.ts
src/components/DataAssets/DataAssetsHeader/DataAssetsHeader.component.tsx
src/components/DataAssets/DataAssetsHeader/DataAssetsHeader.test.tsx
src/components/DataProducts/DataProductsDetailsPage/DataProductsDetailsPage.component.tsx
src/components/DataQuality/AddDataQualityTest/components/AddTestSuitePipeline.tsx
src/components/DataQuality/AddDataQualityTest/components/TestCaseFormV1.tsx
src/components/DataQuality/AddTestCaseList/AddTestCaseList.component.test.tsx
src/components/DataQuality/AddTestCaseList/AddTestCaseList.component.tsx
src/components/DataQuality/AddTestCaseList/AddTestCaseList.interface.ts
src/components/DataQuality/AddTestCaseList/AddTestCaseListFilters.component.tsx
src/components/DataQuality/BundleSuiteForm/BundleSuiteForm.test.tsx
src/components/DataQuality/BundleSuiteForm/BundleSuiteForm.tsx
src/components/DataQuality/IncidentManager/TestCaseResultTab/TestCaseResultTabClassBase.ts
src/components/Database/ColumnDetailPanel/ColumnDetailPanel.component.tsx
src/components/Domain/AddDomainForm/AddDomainForm.component.tsx
src/components/Domain/DomainDetails/DomainDetails.component.tsx
src/components/Entity/EntityRightPanel/EntityRightPanelVerticalNav.tsx
src/components/Entity/Task/TaskTab/TaskTabNew.component.test.tsx
src/components/Entity/Task/TaskTab/TaskTabNew.component.tsx
src/components/Explore/EntitySummaryPanel/EntitySummaryPanel.component.tsx
src/components/Explore/ExploreQuickFilters.test.tsx
src/components/Explore/ExploreQuickFilters.tsx
src/components/Glossary/GlossaryDetails/GlossaryDetails.component.tsx
src/components/IncidentManager/IncidentManager.component.tsx
src/components/OntologyExplorer/FilterToolbar.tsx
src/components/OntologyExplorer/OntologyExplorer.constants.ts
src/components/OntologyExplorer/OntologyExplorer.interface.ts
src/components/OntologyExplorer/OntologyExplorer.tsx
src/components/OntologyExplorer/OntologyGraphG6.tsx
src/components/OntologyExplorer/OntologyNodeRelationsContent.tsx
src/components/OntologyExplorer/hooks/useGraphData.ts
src/components/OntologyExplorer/hooks/useOntologyGraph.ts
src/components/OntologyExplorer/utils/graphConfig.ts
src/components/OntologyExplorer/utils/graphStyles.ts
src/components/OntologyExplorer/utils/layoutCalculations.ts
src/components/Settings/Applications/AppLogsViewer/AppLogsViewer.test.tsx
src/components/Settings/Team/TeamDetails/TeamDetailsV1.tsx
src/components/Settings/Team/TeamDetails/TeamsHeaderSection/TeamsInfo.component.tsx
src/components/Settings/Team/TeamDetails/UserTab/UserTab.component.tsx
src/components/TestLibrary/TestDefinitionList/TestDefinitionList.component.tsx
src/components/common/MUIGlossaryTagSuggestion/MUIGlossaryTagSuggestion.tsx
src/components/common/Table/Table.tsx
src/components/common/TierCard/TierCard.tsx
src/components/common/UserSelectableList/UserSelectableList.component.tsx
src/components/common/atoms/data/useDataFetching.tsx
src/components/common/atoms/drawer/useFormDrawer.tsx
src/constants/Services.constant.ts
src/locale/languages/ar-sa.json
src/locale/languages/de-de.json
src/locale/languages/en-us.json
src/locale/languages/es-es.json
src/locale/languages/fr-fr.json
src/locale/languages/gl-es.json
src/locale/languages/he-he.json
src/locale/languages/ja-jp.json
src/locale/languages/ko-kr.json
src/locale/languages/mr-in.json
src/locale/languages/nl-nl.json
src/locale/languages/pr-pr.json
src/locale/languages/pt-br.json
src/locale/languages/pt-pt.json
src/locale/languages/ru-ru.json
src/locale/languages/th-th.json
src/locale/languages/tr-tr.json
src/locale/languages/zh-cn.json
src/locale/languages/zh-tw.json
src/mocks/Task.mock.ts
src/pages/ColumnBulkOperations/ColumnGrid/ColumnGrid.component.tsx
src/pages/ColumnBulkOperations/ColumnGrid/ColumnGrid.interface.ts
src/pages/GlossaryTermRelationSettings/GlossaryTermRelationSettings.tsx
src/pages/IncidentManager/IncidentManagerDetailPage/TestCaseClassBase.ts
src/pages/ServiceDetailsPage/ServiceDetailsPage.tsx
src/pages/TasksPage/shared/DiffViewNew.tsx
src/pages/TestSuiteDetailsPage/TestSuiteDetailsPage.component.tsx
src/pages/TestSuiteDetailsPage/TestSuiteDetailsPage.test.tsx
src/rest/searchAPI.ts
src/rest/testAPI.ts
src/utils/CSV/CSV.utils.tsx
src/utils/EntityPatchUtils.ts
src/utils/EntitySummaryPanelUtilsV1.tsx
src/utils/EntityUtilClassBase.ts
src/utils/EntityUtils.tsx
src/utils/ExploreUtils.tsx
src/utils/MessagingServiceUtils.ts
src/utils/QueryBuilderUtils.tsx
src/utils/ServiceUtilClassBase.ts
src/utils/Users.util.tsx

Fix locally (fast — changed files only):

cd openmetadata-ui/src/main/resources/ui
yarn ui-checkstyle:changed

Or to fix all files: yarn ui-checkstyle

Copilot AI review requested due to automatic review settings March 26, 2026 14:58
gitar-bot bot commented Mar 26, 2026

Code Review ⚠️ Changes requested 3 resolved / 4 findings

Migrates failed sample rows to OpenSearch with fixes for test signatures, generator efficiency, and DataFrame handling, but the boolean-mask indexing exception handler silently masks real errors and filter() result handling needs clarification.

⚠️ Bug: Broad `except Exception` masks real errors in boolean-mask indexing path

📄 ingestion/src/metadata/data_quality/validations/mixins/failed_row_sampler_mixin.py:51-54 📄 ingestion/src/metadata/data_quality/validations/mixins/failed_row_sampler_mixin.py:53-56 📄 ingestion/src/metadata/data_quality/validations/mixins/failed_row_sampler_mixin.py:36 📄 ingestion/src/metadata/data_quality/validations/mixins/failed_row_sampler_mixin.py:43

The try/except Exception block at lines 51-54 silently catches any error during prepared_chunk[criteria] and falls through to treating criteria as an already-filtered DataFrame. This will hide real bugs — for example, if criteria is a boolean Series with a mismatched index, a malformed mask, or any other unexpected type, the error is swallowed and criteria (which could be garbage) is used as the filtered result.

This can lead to incorrect failed-row samples being returned silently, making debugging very difficult.

Suggested fix
Narrow the except clause to the specific expected exception types (e.g., `TypeError`, `ValueError`, `KeyError`) and log the fallback so it is traceable; any other exception then propagates instead of being swallowed:

    try:
        filtered_chunk = prepared_chunk[criteria]
    except (TypeError, KeyError, ValueError):
        logger.debug("Boolean mask indexing failed; treating filter result as pre-filtered DataFrame")
        filtered_chunk = criteria
✅ 3 resolved
Bug: Unit tests call result_with_failed_samples with wrong signature

📄 ingestion/tests/unit/data_quality/validations/test_failed_sample_mixin.py:63 📄 ingestion/tests/unit/data_quality/validations/test_failed_sample_mixin.py:74 📄 ingestion/tests/unit/data_quality/validations/test_failed_sample_mixin.py:86 📄 ingestion/tests/unit/data_quality/validations/test_failed_sample_mixin.py:98 📄 ingestion/tests/unit/data_quality/validations/test_failed_sample_mixin.py:109 📄 ingestion/tests/unit/data_quality/validations/test_failed_sample_mixin.py:121 📄 ingestion/src/metadata/data_quality/validations/mixins/failed_sample_validator_mixin.py:50
Every test in TestFailedSampleValidatorMixin calls validator.result_with_failed_samples(test_case, result) with two arguments, but the actual mixin method signature is result_with_failed_samples(self, result: TestCaseResultResponse) — it takes only one argument. This will raise TypeError: result_with_failed_samples() takes 2 positional arguments but 3 were given on every test invocation.

Additionally, the test creates result as MagicMock(spec=TestCaseResult) but the method expects a TestCaseResultResponse which wraps both testCase and testCaseResult. The tests need to construct a proper TestCaseResultResponse (or mock thereof) and pass only that single object.
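
A minimal sketch of the mismatch and the corrected call shape. It uses stand-ins rather than the real OpenMetadata types: `FakeMixin` is hypothetical, and only the method name, its single-argument signature, and the `testCase` / `testCaseResult` field names come from the finding above.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock


class FakeMixin:
    """Hypothetical stand-in; only the method signature mirrors the finding."""

    def result_with_failed_samples(self, result):
        # The real method would attach failed-row samples; this one echoes its input.
        return result


validator = FakeMixin()
test_case = MagicMock(name="test_case")
test_case_result = MagicMock(name="test_case_result")

# What the tests did: two positional arguments, which raises TypeError.
try:
    validator.result_with_failed_samples(test_case, test_case_result)
    raised = False
except TypeError:
    raised = True

# What the signature expects: a single response object wrapping both fields.
response = SimpleNamespace(testCase=test_case, testCaseResult=test_case_result)
returned = validator.result_with_failed_samples(response)
```

In the real tests the `SimpleNamespace` would presumably be replaced by a `TestCaseResultResponse` instance or a mock with that spec.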

Performance: PandasFailedRowSamplerMixin creates two generators needlessly

📄 ingestion/src/metadata/data_quality/validations/mixins/failed_row_sampler_mixin.py:32-42
_get_failed_rows_sample() calls self.runner() on line 33 just to extract column names via next(), then calls self.runner() again on line 34 to iterate chunks. Since runner() is a factory that creates a new generator each time, the first generator is created, partially consumed (one chunk), then discarded. This is wasteful and fragile — if runner() were ever changed to return a single generator, this would silently lose the first chunk of data.

Extract columns from the first chunk inside the iteration loop instead.
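
One way to sketch that suggestion, assuming `runner` is a callable that returns an iterator of chunks. Chunks are modelled here as lists of dicts rather than DataFrames so the example has no dependencies; `sample_failed_rows`, `is_failed`, and `limit` are hypothetical names, not the mixin's actual API.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

Row = Dict[str, int]
Chunk = List[Row]


def sample_failed_rows(
    runner: Callable[[], Iterator[Chunk]],
    is_failed: Callable[[Row], bool],
    limit: int = 50,
) -> Tuple[Optional[List[str]], List[Row]]:
    """Collect up to `limit` failed rows in a single pass over one generator."""
    columns: Optional[List[str]] = None
    failed: List[Row] = []
    for chunk in runner():  # runner() is called exactly once
        if columns is None and chunk:
            # Column names are taken from the first non-empty chunk, inside the loop.
            columns = list(chunk[0].keys())
        for row in chunk:
            if is_failed(row):
                failed.append(row)
                if len(failed) >= limit:
                    return columns, failed
    return columns, failed


calls = 0  # counts how many generators were actually started


def runner() -> Iterator[Chunk]:
    global calls
    calls += 1
    yield [{"id": 1, "age": 30}, {"id": 2, "age": -1}]
    yield [{"id": 3, "age": -5}]


columns, failed = sample_failed_rows(runner, lambda row: row["age"] < 0)
```

Because the column names are read from the chunk that is already being iterated, no chunk is consumed and thrown away, and the code keeps working if `runner()` ever returns a single shared generator.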

Bug: DataFrame engine discards failedRowsSample by returning only testCaseResult

📄 ingestion/src/metadata/sdk/data_quality/dataframes/dataframe_validation_engine.py:87-88
In DataFrameValidationEngine._execute_single_test(), the code was changed to result = validator.run_validation(); return result.testCaseResult. This extracts only the TestCaseResult and discards the failedRowsSample and inspectionQuery fields that were populated by result_with_failed_samples() during run_validation().

The SQLAlchemy path (TestSuiteInterface.run_test_case()) returns the full TestCaseResultResponse, which the sink's write_test_case_results() then uses to call _ingest_failed_rows_sample(). By stripping the response down to just testCaseResult, the DataFrame/Pandas validation path will never send failed row samples to OpenMetadata, defeating the purpose of this PR for Pandas-based validators.

The method should return the full TestCaseResultResponse object (updating the return type annotation accordingly), or the caller should be adapted to handle both formats.
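
The difference can be illustrated with simplified dataclasses. These are stand-ins, not the real OpenMetadata models: only the field names `testCaseResult`, `failedRowsSample`, and `inspectionQuery` come from the finding above, and `execute_stripped` / `execute_full` are hypothetical simplifications of the engine method.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FakeResult:
    """Stand-in for the plain test case result."""
    status: str


@dataclass
class FakeResponse:
    """Stand-in for the response that wraps the result plus failed-row data."""
    testCaseResult: FakeResult
    failedRowsSample: Optional[List[dict]] = None
    inspectionQuery: Optional[str] = None


def run_validation() -> FakeResponse:
    # Pretend the validator already populated the failed-row fields.
    return FakeResponse(
        testCaseResult=FakeResult(status="Failed"),
        failedRowsSample=[{"id": 2}],
        inspectionQuery="SELECT * FROM t WHERE age < 0",
    )


def execute_stripped() -> FakeResult:
    # Behaviour described in the finding: the sample and query are dropped here.
    return run_validation().testCaseResult


def execute_full() -> FakeResponse:
    # Suggested behaviour: hand the whole response to the caller.
    return run_validation()


stripped = execute_stripped()
full = execute_full()
```

With the stripped return value, a downstream sink has nothing to read the sample from; with the full response, both the result and the failed rows remain available.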



Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 123 out of 123 changed files in this pull request and generated 3 comments.

@sonarqubecloud

@TeddyCr TeddyCr enabled auto-merge (squash) March 26, 2026 20:15

Labels

Ingestion safe to test Add this label to run secure Github workflows on PRs

Development

Successfully merging this pull request may close these issues.

Migrate Sample data failure to OpenSource

3 participants