Adapt search pipeline to prefer semantic results by lautel · Pull Request #26771 · open-metadata/OpenMetadata

lautel · 2026-03-25T17:12:14Z

Remove dataAssetEmbeddings alias from entities with no embeddings: tag, container, file, worksheet, spreadsheet, directory.
Change the default search pipeline for hybrid search to k=30 and weights=[0.4, 0.6], so that semantic results carry more weight.
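
To make the effect of these two parameters concrete, here is a minimal sketch of weighted reciprocal rank fusion. It assumes that `k` is the RRF rank constant, that the weights are ordered [keyword, semantic], and that each sub-query contributes `weight / (k + rank)` to a document's score. The class name, method name, and sample asset names are hypothetical, and this is not the OpenSearch implementation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a hand-rolled weighted RRF, not the search engine's own code.
class WeightedRrfSketch {

    // score(doc) = sum over rankings of weight_i / (rankConstant + rank_i(doc)), ranks start at 1.
    static Map<String, Double> fuse(List<List<String>> rankings, double[] weights, int rankConstant) {
        Map<String, Double> scores = new LinkedHashMap<>();
        for (int i = 0; i < rankings.size(); i++) {
            List<String> ranking = rankings.get(i);
            for (int pos = 0; pos < ranking.size(); pos++) {
                double contribution = weights[i] / (rankConstant + pos + 1);
                scores.merge(ranking.get(pos), contribution, Double::sum);
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        // Hypothetical result lists: the two sub-queries disagree on the top hit.
        List<String> keyword = List.of("orders_table", "customers_table");
        List<String> semantic = List.of("customers_table", "orders_table");

        // Assumed order of weights: [keyword, semantic].
        Map<String, Double> fused = fuse(List.of(keyword, semantic), new double[] {0.4, 0.6}, 30);
        fused.forEach((doc, score) -> System.out.printf("%s %.5f%n", doc, score));
    }
}
```

Under these assumptions, when the two rankings disagree, the document ranked first by the semantic sub-query ends up with the higher fused score, which is the behaviour the description aims for.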

github-actions · 2026-03-25T17:21:23Z

OpenMetadata Service New-Code Coverage

✅ PASS. Required changed-line coverage: 90.00% overall and per touched production file.

Overall executable changed lines: 3/3 covered (100.00%)
Missed executable changed lines: 0
Non-executable changed lines ignored by JaCoCo: 0
Changed production files: 2

| File | Covered | Missed | Executable | Non-exec | Coverage | Uncovered lines |
| --- | --- | --- | --- | --- | --- | --- |
| `openmetadata-service/src/main/java/org/openmetadata/service/search/SearchRepository.java` | 2 | 0 | 2 | 0 | 100.00% | - |
| `openmetadata-service/src/main/java/org/openmetadata/service/search/vector/OpenSearchVectorService.java` | 1 | 0 | 1 | 0 | 100.00% | - |

Only changed executable lines under openmetadata-service/src/main/java are counted. Test files, comments, imports, and non-executable lines are excluded.

github-actions · 2026-03-25T19:01:14Z

🟡 Playwright Results — all passed (14 flaky)

✅ 3402 passed · ❌ 0 failed · 🟡 14 flaky · ⏭️ 216 skipped

| Shard | Passed | Flaky | Skipped |
| --- | --- | --- | --- |
| 🟡 Shard 1 | 453 | 2 | 2 |
| 🟡 Shard 2 | 602 | 2 | 32 |
| 🟡 Shard 3 | 606 | 3 | 27 |
| 🟡 Shard 4 | 601 | 2 | 47 |
| 🟡 Shard 5 | 586 | 1 | 67 |
| 🟡 Shard 6 | 554 | 4 | 41 |

🟡 14 flaky test(s) (passed on retry)

Features/CustomizeDetailPage.spec.ts › API Endpoint - customization should work (shard 1, 1 retry)
Pages/UserCreationWithPersona.spec.ts › Create user with persona and verify on profile (shard 1, 1 retry)
Features/BulkEditEntity.spec.ts › Glossary (shard 2, 1 retry)
Features/DomainTierCertificationVoting.spec.ts › DataProduct - Certification assign, update, and remove (shard 2, 1 retry)
Features/Permissions/GlossaryPermissions.spec.ts › Team-based permissions work correctly (shard 3, 1 retry)
Features/UserProfileOnlineStatus.spec.ts › Should show "Active recently" for users active within last hour (shard 3, 1 retry)
Flow/ExploreDiscovery.spec.ts › Should display deleted assets when showDeleted is checked and deleted is not present in queryFilter (shard 3, 1 retry)
Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
Pages/DataContracts.spec.ts › Create Data Contract and validate for Table (shard 4, 1 retry)
Pages/EntityDataSteward.spec.ts › Glossary Term Add, Update and Remove (shard 5, 1 retry)
Pages/ExploreTree.spec.ts › Verify Database and Database Schema available in explore tree (shard 6, 1 retry)
Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)
Pages/Users.spec.ts › Check permissions for Data Steward (shard 6, 1 retry)
VersionPages/EntityVersionPages.spec.ts › Directory (shard 6, 1 retry)

How to debug locally

# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

tomasmontielp · 2026-03-26T07:51:09Z

...ta-service/src/main/java/org/openmetadata/service/search/vector/OpenSearchVectorService.java

            .createObjectNode()
            .put("technique", "rrf")
-            .put("rank_constant", 60)
+            .put("rank_constant", 30)


What is the motivation behind reducing the denominator constant? I'm curious about this variable overall; 60 as the default is also intriguing.

60 is the default rank constant. Everywhere you see RRF, the default is 60. The thing is, it makes scores quite uniform and masks very high-ranking items. I halved it to keep clearer score differences between rank positions.
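
The reply above can be checked with a little arithmetic. The sketch below assumes the standard unweighted RRF contribution `1 / (rank_constant + rank)` with ranks starting at 1; the class and method names are hypothetical. It compares how much more a rank-1 hit contributes than a rank-10 hit under each constant.

```java
// Illustrative only: compares the spread of RRF contributions for two rank constants.
class RrfRankConstantSketch {

    // Contribution of a single hit at the given 1-based rank.
    static double contribution(int rankConstant, int rank) {
        return 1.0 / (rankConstant + rank);
    }

    // How many times larger the rank-1 contribution is than the contribution at `rank`.
    static double topToRankRatio(int rankConstant, int rank) {
        return contribution(rankConstant, 1) / contribution(rankConstant, rank);
    }

    public static void main(String[] args) {
        for (int k : new int[] {60, 30}) {
            System.out.printf(
                "rank_constant=%d rank1=%.5f rank10=%.5f ratio=%.3f%n",
                k, contribution(k, 1), contribution(k, 10), topToRankRatio(k, 10));
        }
    }
}
```

With a constant of 60 the rank-1 contribution is 70/61 (about 1.15) times the rank-10 contribution; with 30 it is 40/31 (about 1.29) times. A smaller constant therefore spreads the scores of neighbouring ranks further apart, which matches the reasoning given in the reply.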

* Update CLAUDE.md with environment setup and worktree instructions

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Address PR review feedback on CLAUDE.md environment setup

  - Fix Python version range to 3.10-3.11 (matches CI matrix and noxfile)
  - Fix "claustre" typo to "Claude Code"
  - Remove hardcoded ~/Code/OpenMetadata/env paths, use generic references
  - Reorder commands: install_dev_env before make generate (Makefile requires it)
  - Soften environment-specific assertions about system Python

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

gitar-bot · 2026-03-27T14:10:51Z

Code Review ✅ Approved

Adapts the search pipeline to prioritize semantic results over keyword-based matches, improving search recall. No issues found.

sonarqubecloud · 2026-03-27T15:09:40Z

Quality Gate passed for 'open-metadata-ingestion'

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

Adapt search pipeline to prefer semantic results

a87cdf5

lautel added the `safe to test` label (Add this label to run secure Github workflows on PRs) Mar 25, 2026

Merge branch 'main' into improve-search-recall

70f2af1

pmbrull previously approved these changes Mar 25, 2026

lautel temporarily deployed to test March 25, 2026 17:21 — with GitHub Actions Inactive

Merge branch 'main' into improve-search-recall

fe1a1aa

lautel had a problem deploying to test March 26, 2026 07:05 — with GitHub Actions Failure

lautel temporarily deployed to test March 26, 2026 07:05 — with GitHub Actions Inactive

tomasmontielp reviewed Mar 26, 2026

Merge branch 'main' into improve-search-recall

ec1b1e0

lautel temporarily deployed to test March 26, 2026 14:02 — with GitHub Actions Inactive

lautel had a problem deploying to test March 26, 2026 14:02 — with GitHub Actions Failure

lautel temporarily deployed to test March 26, 2026 14:02 — with GitHub Actions Inactive

lautel dismissed pmbrull’s stale review via ca9ccb8 March 26, 2026 17:21

lautel temporarily deployed to test March 26, 2026 17:32 — with GitHub Actions Inactive

Merge branch 'main' into improve-search-recall

382276a

lautel temporarily deployed to test March 27, 2026 07:45 — with GitHub Actions Inactive

Merge branch 'main' into improve-search-recall

ec60120

lautel temporarily deployed to test March 27, 2026 14:20 — with GitHub Actions Inactive

lautel requested a deployment to test March 27, 2026 14:20 — with GitHub Actions In progress

lautel temporarily deployed to test March 27, 2026 14:20 — with GitHub Actions Inactive

Adapt search pipeline to prefer semantic results #26771
lautel wants to merge 7 commits into main from improve-search-recall

4 participants
