fix: correctly parse --urls tokens containing colons that are not label separators #53

Open
aneesh-spec wants to merge 5 commits into langchain-ai:main from aneesh-spec:aneesh-fix

@aneesh-spec

Problem

Arguments to --urls are split on the first : whenever the token does not start with http: or https:. This breaks valid inputs where the colon is part of a Windows drive path or a file: URL rather than a Label: prefix.

Expected

A --urls token that is only a path or file: URL should register that full value, same as when the same value is given in YAML/JSON config.

Actual

The text before the first colon is registered as the name and only the remainder as the path or URL. Startup may then fail with "file not found" for a file that exists, or behavior diverges from config-based setup.

Examples of broken inputs (before fix)

| Input | Parsed name | Parsed llms_txt |
|---|---|---|
| file:///path/to/llms.txt | file | ///path/to/llms.txt |
| C:/docs/llms.txt | C | /docs/llms.txt |

Root cause

In mcpdoc/cli.py, the condition to detect name:url format only excluded http: and https: schemes, missing file: URLs and single-letter Windows drive letters.
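
The root cause can be reproduced with a minimal sketch. This is not the actual code in mcpdoc/cli.py; the function name `split_label_before_fix` and the returned dict shape are assumptions made for illustration. It only mirrors the described condition, which excludes http: and https: and nothing else.

```python
from typing import Dict


def split_label_before_fix(token: str) -> Dict[str, str]:
    """Hypothetical stand-in for the pre-fix check described above."""
    # Only http: and https: are recognised as "this colon is not a label separator".
    if ":" in token and not token.startswith(("http:", "https:")):
        name, url = token.split(":", 1)
        return {"name": name, "llms_txt": url}
    return {"llms_txt": token}


for token in ("file:///path/to/llms.txt", "C:/docs/llms.txt"):
    print(token, "->", split_label_before_fix(token))
```

Under these assumptions, both inputs are split at the first colon, which matches the broken name and llms_txt values reported for them.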

Fix

  • Added file: to the scheme exclusion list
  • Added a Windows drive path check (single alpha char before :)
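
The two changes listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `split_label`, `_URL_SCHEMES`, and the returned dict shape are hypothetical names, and example.com is a placeholder URL.

```python
from typing import Dict

# Hypothetical constant: schemes whose colon is never a label separator.
_URL_SCHEMES = ("http:", "https:", "file:")


def split_label(token: str) -> Dict[str, str]:
    """Split an optional 'Label:' prefix off a --urls token (illustrative only)."""
    if ":" not in token or token.startswith(_URL_SCHEMES):
        return {"llms_txt": token}

    prefix, rest = token.split(":", 1)
    # A single alphabetic character before the colon is treated as a
    # Windows drive letter (e.g. "C:/docs/llms.txt"), not as a label.
    if len(prefix) == 1 and prefix.isalpha():
        return {"llms_txt": token}

    return {"name": prefix, "llms_txt": rest}


print(split_label("file:///path/to/llms.txt"))
print(split_label("C:/docs/llms.txt"))
print(split_label("LangGraph:https://example.com/llms.txt"))
```

In this sketch a one-letter label such as A:https://example.com/llms.txt is also kept whole, because it is indistinguishable from a drive letter. Whether the PR treats that case differently is not stated in the description.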

Tests

Added 9 tests in tests/unit_tests/test_cli.py:

  • 5 fail-to-pass (F2P) tests target the bug directly; they fail on main and pass on this branch
  • 4 regression tests confirm existing Label:url behaviour is unaffected

…el separators

file: URLs and Windows drive paths (e.g. C:/...) were incorrectly split
on the first colon, treating the scheme or drive letter as a label name.
Tests cover file: URLs and Windows drive paths being incorrectly parsed
as label:url pairs. All 5 bug-specific tests fail on main and pass on
this branch.