[WIP] Change Event Based Governance Workflow#26758
[WIP] Change Event Based Governance Workflow#26758yan-3005 wants to merge 3 commits intoram/workflow-improvementsfrom
Conversation
…ted task nodes
Phase 1 of batch entity processing in governance workflows. All
automated task nodes (checkEntityAttributesTask, setEntityAttributeTask,
checkChangeDescriptionTask, rollbackEntityTask, sinkTask,
dataCompletenessTask) now process a List<String> of entity links via
entityList exclusively. The relatedEntity fallback in getEntityList()
is removed — batch nodes no longer have that path.
Key changes:
- PeriodicBatchEntityTrigger (singleExecutionMode=false): each child
process now receives global_entityList via ${entityToListMap[relatedEntity]}.
FetchEntitiesImpl pre-builds entityToListMap (entity -> List.of(entity))
so the JUEL expression resolves without static class references.
- Batch node impls (6 files): removed relatedEntity fallback from
getEntityList() and the now-unused RELATED_ENTITY_VARIABLE import.
entityList is the only input path.
- BPMN builders: putIfAbsent(ENTITY_LIST_VARIABLE, GLOBAL_NAMESPACE) for
all batch-capable nodes; relatedEntity is never added to inputNamespaceMap
by builders.
- v1130 migration (addEntityListToNamespaceMap): updated to also strip
relatedEntity from batch node inputNamespaceMaps, covering both fresh
upgrades (add entityList + remove relatedEntity) and instances that
already ran the previous migration (entityList present, remove relatedEntity).
Migration remains idempotent.
- GlossaryApprovalWorkflow.json: removed relatedEntity from all 13 batch
node inputNamespaceMaps. userApprovalTask nodes keep relatedEntity.
- JSON schemas: entityList added to trigger output and batch node
inputNamespaceMap/input definitions.
- Integration tests: updated to reflect entityList-first structure across
all batch-capable workflow node configurations.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help! |
| "additionalItems": false, | ||
| "minItems": 1, | ||
| "maxItems": 1, | ||
| "maxItems": 2, |
There was a problem hiding this comment.
🚨 Bug: Schema maxItems:2 rejects 3-element output arrays used in tests
The periodicBatchEntityTrigger.json schema changed maxItems from 1 to 2, but integration tests (and likely real workflow definitions) use 3-element output arrays like ["relatedEntity", "entityList", "updatedBy"]. With maxItems: 2, these will fail JSON Schema validation.
There are 12+ instances in WorkflowDefinitionResourceIT.java with 3-item outputs. The schema constraint must be raised to at least 3, or removed entirely if the output list is open-ended.
Suggested fix:
In periodicBatchEntityTrigger.json, change maxItems to accommodate all valid outputs:
"maxItems": 3
Or remove the maxItems constraint entirely if the list of outputs may grow further.
Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion
| triggerWorkflowId, | ||
| (PeriodicBatchEntityTriggerDefinition) workflow.getTrigger(), | ||
| hasBatchModeNodes(workflow)); | ||
| false, |
There was a problem hiding this comment.
⚠️ Bug: singleExecutionMode hardcoded to false, batch sink broken
In TriggerFactory.java, singleExecutionMode is now hardcoded to false (line 27), removing the dynamic hasBatchModeNodes() check that previously detected sink tasks needing batch processing. This means workflows with batch-mode sink nodes will now always spawn N parallel instances instead of a single instance processing all entities, breaking the batch sink use case.
The PeriodicBatchEntityTrigger constructor still accepts and uses the parameter (controlling cardinality and input mapping), but the factory never passes true. Tests still test both modes, masking this gap.
Suggested fix:
Either restore the hasBatchModeNodes() logic to dynamically set singleExecutionMode, or if this is intentional (WIP), add a TODO comment explaining the plan to re-enable it.
Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion
| private String resolveEntityType(DelegateExecution execution) { | ||
| String processDefinitionKey = execution.getProcessDefinitionId().split(":")[0]; | ||
| if (processDefinitionKey.contains("-")) { | ||
| String[] parts = processDefinitionKey.split("-"); | ||
| if (parts.length >= 2) { | ||
| return parts[parts.length - 1]; | ||
| } | ||
| } | ||
| return (String) entityTypesExpr.getValue(execution); | ||
| } |
There was a problem hiding this comment.
💡 Bug: resolveEntityType falls back to expression on ambiguous key
In FetchChangeEventsImpl.resolveEntityType() (lines 136-145), the process definition key is split on - and the last segment is used as the entity type. If the process key contains - but has fewer than 2 parts after splitting (which can't actually happen with split), or if the entity type itself contains a -, the wrong value will be extracted. For example, an entity type like api-endpoint would be parsed as just endpoint.
Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion
…ts-change-event-offset - Keep two-phase migration in MigrationUtil (raw JSON Phase 1, BPMN redeploy Phase 2) - Keep workflowFqn parameter in PeriodicBatchEntityTrigger constructor - Keep FetchChangeEventsImpl as primary fetch task (no FetchEntitiesImpl) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help! |
Code Review 🚫 Blocked 0 resolved / 3 findingsChange Event Based Governance Workflow implementation is blocked due to three critical issues: schema validation rejecting 3-element output arrays in tests, singleExecutionMode hardcoded to false breaking batch sink functionality, and resolveEntityType falling back to expression on ambiguous keys. 🚨 Bug: Schema maxItems:2 rejects 3-element output arrays used in tests📄 openmetadata-spec/src/main/resources/json/schema/governance/workflows/elements/triggers/periodicBatchEntityTrigger.json:97 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/WorkflowDefinitionResourceIT.java:2335 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/WorkflowDefinitionResourceIT.java:2435 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/WorkflowDefinitionResourceIT.java:2619 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/WorkflowDefinitionResourceIT.java:3095 📄 openmetadata-integration-tests/src/test/java/org/openmetadata/it/tests/WorkflowDefinitionResourceIT.java:3969 The There are 12+ instances in Suggested fix
|
| Auto-apply | Compact |
|
|
Was this helpful? React with 👍 / 👎 | Gitar
Describe your changes:
Fixes
I worked on ... because ...
Type of change:
Checklist:
Fixes <issue-number>: <short explanation>Summary by Gitar
change_event(entityType, offset)for efficient change-event filteringchange_event_consumers.idfrom VARCHAR(36) to VARCHAR(768) to support workflow consumer IDsFetchChangeEventsImplandCommitChangeEventOffsetImplfor incremental offset trackingentityListalongsiderelatedEntityentityListininputNamespaceMapinstead ofrelatedEntityThis will update automatically on new commits.