
fix(xml): remove recursive sublanguage references to prevent ReDoS #4361

Open
petejm wants to merge 1 commit into highlightjs:main from petejm:fix/redos-xml-sublanguage-recursion



petejm commented Feb 22, 2026

Summary

  • Removes xml from <style> sublanguage array (was ['css', 'xml'], now 'css')
  • Removes xml and handlebars from <script> sublanguage array (was ['javascript', 'handlebars', 'xml'], now 'javascript')
  • Updates sublanguage_no_relevancy test expectations to match new behavior
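
For readers unfamiliar with the grammar, the first two bullets amount to roughly the following. This is a simplified sketch, not the actual diff: the variable names `before` and `after` are illustrative, and the `end` and `returnEnd` fields are assumed from the general shape of highlight.js modes. Only the `subLanguage` values are taken from this PR.

```javascript
// Simplified sketch of the modes that take over after <style> and <script> open tags.
// Names and surrounding fields are illustrative; only subLanguage values come from this PR.

// Before: an array asks the engine to choose among candidates by relevance,
// and both arrays list xml itself as a candidate.
const before = {
  style: { end: /<\/style>/, returnEnd: true, subLanguage: ['css', 'xml'] },
  script: { end: /<\/script>/, returnEnd: true, subLanguage: ['javascript', 'handlebars', 'xml'] },
};

// After: a single string names exactly one sublanguage, with no relevance-based fallback.
const after = {
  style: { end: /<\/style>/, returnEnd: true, subLanguage: 'css' },
  script: { end: /<\/script>/, returnEnd: true, subLanguage: 'javascript' },
};
```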

Problem

Crafted HTML input with many unclosed <script> tags causes exponential backtracking because:

  1. Direct recursion: xml listed as a sublanguage of itself in <script>/<style> tags
  2. Indirect recursion: handlebars uses xml as its base sublanguage, creating xml -> <script> -> handlebars -> xml -> ...

The highlight.js engine has no sublanguage recursion depth limit, so this causes resource exhaustion (denial of service). The PoC from #4261 hangs the browser; after this fix it completes in ~9ms.
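
The two recursion paths listed above can be checked mechanically on a toy model. The sketch below is not engine code: the graph contains only the edges named in this description (real grammars reference more sublanguages), and `findCycles` is a hypothetical helper written for illustration.

```javascript
// Toy model: language -> sublanguages it may delegate to, before this fix.
// Only edges named in this PR description are included; css and javascript
// are left without edges as a simplifying assumption.
const subLanguagesBefore = {
  xml: ['css', 'javascript', 'handlebars', 'xml'],
  handlebars: ['xml'],
  css: [],
  javascript: [],
};

// Depth-first search that returns every path leading from `start` back to `start`.
function findCycles(graph, start) {
  const cycles = [];
  const walk = (lang, path) => {
    for (const next of graph[lang] || []) {
      if (next === start) {
        cycles.push([...path, next]); // closed a loop back to the start
      } else if (!path.includes(next)) {
        walk(next, [...path, next]); // extend the path without revisiting a node
      }
    }
  };
  walk(start, [start]);
  return cycles;
}

const cycles = findCycles(subLanguagesBefore, 'xml');
console.log(cycles.map((c) => c.join(' -> ')));
// prints [ 'xml -> handlebars -> xml', 'xml -> xml' ]
```

The first result corresponds to the indirect path through handlebars and the second to xml listing itself.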

This matches the approach taken on the main_12 branch (commits ca21a19 and 18d4182), backported here to v11.

Fixes #4261

Behavioral changes

Removing xml from <style> and <script>: No practical impact. The xml fallback only activated when CSS/JS parsing failed, in which case the content isn't XML either — it would just produce misleading highlighting.

Removing handlebars from <script>: This is a minor behavior change. Previously, <script> content was auto-detected between JavaScript and Handlebars by relevance. Now it always highlights as JavaScript. In practice this is low-impact because:

  • Handlebars templates (e.g. <script type="text/x-handlebars-template">) are better served by specifying language="handlebars" directly, which uses the dedicated handlebars.js grammar with full XML + Handlebars support
  • The auto-detection was unreliable for short Handlebars snippets anyway (often scoring below JavaScript relevance)
  • This matches v12's effective behavior, which drops multi-sublanguage relevance comparison
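
For callers affected by the `<script>` change, here is a minimal sketch of requesting the Handlebars grammar explicitly through the JavaScript API, assuming the v11 `hljs.highlight(code, { language })` call form. The helper name `highlightHandlebars` is hypothetical, and the `hljs` instance is passed in so the snippet does not depend on how highlight.js is loaded.

```javascript
// Hypothetical helper: highlight a template with the handlebars grammar
// instead of relying on auto-detection inside <script>.
function highlightHandlebars(hljs, templateSource) {
  // Assumes the v11 API: highlight(code, { language }) returns an object
  // whose `value` property holds the highlighted HTML.
  const result = hljs.highlight(templateSource, { language: 'handlebars' });
  return result.value;
}
```

In Node this could be called as `highlightHandlebars(require('highlight.js'), '<p>{{name}}</p>')`, assuming a build that includes the handlebars grammar.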

Test plan

  • All 1570 tests pass (npm test)
  • Updated sublanguage_no_relevancy.expect.txt for the new <script> behavior (content is now wrapped in a language-javascript span because sublanguage is a string, not an array)
  • Verified PoC from #4261 (Resource exhaustion) completes in ~9ms (previously hung indefinitely)
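
A timing check along these lines could look like the sketch below. It is an assumption-laden stand-in, not the PoC from #4261: `buildUnclosedScripts` only mirrors the input shape described under "Problem", and both helper names are hypothetical. The highlighter is passed in as a function so the harness itself has no dependencies.

```javascript
// Build input consisting of many unclosed <script> tags.
// Assumption: this mirrors the shape described under "Problem";
// the actual PoC in #4261 may differ.
function buildUnclosedScripts(count) {
  return '<script>'.repeat(count);
}

// Millisecond clock: performance.now() where available, Date.now() otherwise.
const nowMs = () => (typeof performance !== 'undefined' ? performance.now() : Date.now());

// Time a single call to any highlighter function and return elapsed milliseconds.
function timeHighlight(highlightFn, input) {
  const start = nowMs();
  highlightFn(input);
  return nowMs() - start;
}
```

With highlight.js loaded as `hljs`, a call such as `timeHighlight((s) => hljs.highlight(s, { language: 'xml' }), buildUnclosedScripts(n))` would report the elapsed time for a chosen `n`. On an unpatched build the call may not return for larger `n`, so it is safer to start small.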

🤖 Generated with Claude Code

Remove `xml` from the sublanguage fallback arrays in `<style>` and
`<script>` tag handling. Also remove `handlebars` from `<script>`
since handlebars itself uses xml as its base sublanguage, creating
an indirect recursion path (xml -> handlebars -> xml).

The recursive sublanguage references allow crafted input with many
unclosed `<script>` tags to cause exponential backtracking, leading
to resource exhaustion (denial of service).

Fixes highlightjs#4261

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
