
Fix crash in DisambiguationExtractor by adding safe fallback for missing language configurations #838

Open
anshuman9468 wants to merge 2 commits into dbpedia:master from anshuman9468:fix-disambiguation-crash

anshuman9468 commented Mar 10, 2026

Fix: Safe Language Config Lookup in DisambiguationExtractor

Problem

Direct map access caused a runtime crash when a language code was missing from disambiguationTitlePartMap:

// Before (unsafe)
private val replaceString =
  DisambiguationExtractorConfig.disambiguationTitlePartMap(language.wikiCode)
// java.util.NoSuchElementException: key not found: ro

Solution

Replaced with .getOrElse to fall back to " (disambiguation)" when a language config is absent:

// After (safe)
private val replaceString =
  DisambiguationExtractorConfig.disambiguationTitlePartMap
    .getOrElse(language.wikiCode, " (disambiguation)")
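
The difference between the two lookups can be reproduced in isolation. The following is a minimal sketch, not the project's code: `LookupSketch` and `titlePartMap` are hypothetical stand-ins for `DisambiguationExtractorConfig.disambiguationTitlePartMap`, and the map entries are illustrative only.

```scala
object LookupSketch {
  // Hypothetical stand-in for the real config map; entries are illustrative
  // and not copied from DisambiguationExtractorConfig.
  val titlePartMap: Map[String, String] = Map(
    "en" -> " (disambiguation)",
    "de" -> " (Begriffsklärung)"
  )

  // Direct access: Map.apply throws NoSuchElementException for an absent key.
  def unsafeLookup(wikiCode: String): String =
    titlePartMap(wikiCode)

  // Safe access: getOrElse returns the supplied default instead of throwing.
  def safeLookup(wikiCode: String): String =
    titlePartMap.getOrElse(wikiCode, " (disambiguation)")
}
```

With this sketch, `LookupSketch.unsafeLookup("ro")` throws, while `LookupSketch.safeLookup("ro")` returns the English default and `LookupSketch.safeLookup("de")` still returns the configured value.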

Testing

  • No NoSuchElementException thrown for missing language codes
  • Fallback value " (disambiguation)" applied correctly
  • mvn clean compile passed with no errors

Summary by CodeRabbit

  • Bug Fixes
    • Improved robustness of disambiguation extraction with a safer fallback for missing language-specific title parts, reducing processing errors.
  • Chores
    • Adjusted snapshot deployment workflow to modify snapshot publish behavior during CI.


coderabbitai bot commented Mar 10, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3ddce139-e219-4d6b-ba9c-d2a4a476ad7e

📥 Commits

Reviewing files that changed from the base of the PR and between 230c39f and f55682b.

📒 Files selected for processing (1)
  • .github/workflows/snapshot_deploy.yml

📝 Walkthrough


Two small edits: a defensive map lookup in DisambiguationExtractor now uses getOrElse with " (disambiguation)" as a fallback; the CI snapshot deploy step adds the Maven flag -DskipNexusStagingDeployMojo=true. A minor change to build.sbt is also present.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Disambiguation lookup: core/src/main/scala/org/dbpedia/extraction/mappings/DisambiguationExtractor.scala | Replaced direct map access with DisambiguationExtractorConfig.disambiguationTitlePartMap.getOrElse(language.wikiCode, " (disambiguation)") to provide a safe default when the wikiCode is missing. |
| CI snapshot deploy: .github/workflows/snapshot_deploy.yml | Added -DskipNexusStagingDeployMojo=true to the Maven deploy command arguments in the Deploy Snapshot step. |
| Build manifest: build.sbt | Small, unspecified one-line edit recorded in manifest (±1 line change). |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: fixing a crash in DisambiguationExtractor by adding a safe fallback for missing language configurations, which directly matches the core problem and solution in the changeset. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
core/src/main/scala/org/dbpedia/extraction/mappings/DisambiguationExtractor.scala (1)

25-25: Surface missing language configs instead of silently defaulting to English.

This fixes the crash, but it also hides config gaps: if language.wikiCode is absent from DisambiguationExtractorConfig.disambiguationTitlePartMap (core/src/main/scala/org/dbpedia/extraction/config/mappings/DisambiguationExtractorConfig.scala:4-40), title normalization now assumes the English suffix, which can quietly miss matches for languages that use a different disambiguation marker. A warning or metric here would keep the fallback while making incomplete language coverage visible.
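
To illustrate the kind of silent miss described here, consider the sketch below. It assumes the extractor removes the configured title part with a plain `String.replace`; that call site is not shown in this PR, so `stripTitlePart` is a hypothetical helper, and the Romanian marker is an illustrative value rather than an entry taken from the project's config.

```scala
object SuffixMismatchSketch {
  // Assumption: the suffix is stripped via String.replace; the real call
  // site in DisambiguationExtractor may differ.
  def stripTitlePart(title: String, titlePart: String): String =
    title.replace(titlePart, "")

  // The fallback value introduced by this PR.
  val englishDefault = " (disambiguation)"

  // Illustrative Romanian marker (assumed, not taken from the config file).
  val romanianMarker = " (dezambiguizare)"
}
```

Under these assumptions, `stripTitlePart("Mercur (dezambiguizare)", englishDefault)` leaves the title unchanged because the English suffix never occurs in it, whereas passing `romanianMarker` yields `"Mercur"`. No exception is raised in either case, which is why the gap would go unnoticed without a warning.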

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@core/src/main/scala/org/dbpedia/extraction/mappings/DisambiguationExtractor.scala`
at line 25, The code currently silently falls back to the English suffix when
DisambiguationExtractorConfig.disambiguationTitlePartMap lacks an entry for
language.wikiCode; update the logic around replaceString to log or emit a metric
when a language key is missing while keeping the existing getOrElse fallback:
check
DisambiguationExtractorConfig.disambiguationTitlePartMap.contains(language.wikiCode)
(or examine the Option returned by get) prior to assigning replaceString, and if
absent call the project logger/metrics recorder with the missing
language.wikiCode and the fact that the English default " (disambiguation)" will
be used; keep the existing variable name replaceString and the same default
value so behavior is unchanged except for the added warning/metric.
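
One possible shape for that change is sketched below. It is a sketch under assumptions, not the project's implementation: `FallbackWithWarning`, `resolveTitlePart`, and the logger name are hypothetical, and `java.util.logging` is used only because it ships with the JDK; the project's own logging facility may be different. Because the default argument of `getOrElse` is evaluated only when the key is absent, the warning fires only on the fallback path.

```scala
import java.util.logging.Logger

object FallbackWithWarning {
  // Hypothetical logger name; the project may use another logging setup.
  private val logger = Logger.getLogger("DisambiguationExtractorSketch")

  val DefaultTitlePart = " (disambiguation)"

  // Returns the configured title part, or logs a warning and returns the
  // English default when no entry exists for the given wikiCode.
  def resolveTitlePart(titlePartMap: Map[String, String], wikiCode: String): String =
    titlePartMap.getOrElse(wikiCode, {
      // This block is a by-name argument: it runs only when the key is missing.
      logger.warning(
        s"No disambiguation title part configured for '$wikiCode'; " +
          s"falling back to English default '$DefaultTitlePart'"
      )
      DefaultTitlePart
    })
}
```

The return value matches the plain `getOrElse` call in the PR, so behavior is unchanged apart from the logged warning.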

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 76548f76-c77d-49c6-a3af-7325dcbe21a9

📥 Commits

Reviewing files that changed from the base of the PR and between e3dfe8b and 230c39f.

📒 Files selected for processing (1)
  • core/src/main/scala/org/dbpedia/extraction/mappings/DisambiguationExtractor.scala


@anshuman9468
Contributor Author

Hi, I have fixed the Maven credentials issue that was causing the error.
