feat(pdf): add MinerU Cloud API as PDF parsing provider #438

Merged: cosarah merged 3 commits into main from worktree-mineru-cloud on Apr 16, 2026

@wyuc (Contributor) commented Apr 16, 2026

Summary

  • Add MinerU Cloud (v4 API) as a new PDF parsing provider alongside unpdf (built-in) and mineru (self-hosted)
  • Users can parse PDFs via the MinerU Cloud API without deploying a self-hosted instance
  • Extract shared parser into mineru-parser.ts for code reuse between self-hosted and cloud paths

Changed Files

| File | Change |
| --- | --- |
| lib/pdf/mineru-parser.ts | NEW: shared MinerU result parser (extracted from inline code) |
| lib/pdf/mineru-cloud.ts | NEW: MinerU Cloud v4 API client (batch → upload → poll → ZIP → parse) |
| lib/pdf/types.ts | Add 'mineru-cloud' to PDFProviderId union |
| lib/pdf/constants.ts | Add provider entry + shared MINERU_CLOUD_DEFAULT_BASE constant |
| lib/pdf/pdf-providers.ts | Add switch case, replace inline parser with shared import |
| lib/server/provider-config.ts | Add PDF_MINERU_CLOUD env var mapping, fix PDF activation logic |
| lib/store/settings.ts | Add default config, update auto-switch priority (cloud > self-hosted) |
| components/settings/pdf-settings.tsx | Adapt UI: cloud (API key required, base URL optional) vs self-hosted |
| app/api/verify-pdf-provider/route.ts | Add cloud verification path with SSRF protection |
| lib/i18n/locales/*.json | Add i18n strings for 4 locales |
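
The cloud client's flow (batch → upload → poll → ZIP → parse) depends on a polling step between upload and ZIP download. The following is a minimal sketch of that step only, not the code in lib/pdf/mineru-cloud.ts. The names `TaskStatus`, `classifyState` and `pollUntilDone`, and the state strings `done` and `failed`, are assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration; the real client may differ.
type TaskStatus = { state: string; zipUrl?: string; error?: string };
type PollDecision = 'wait' | 'done' | 'failed';

// Map a task status to the next action. The state strings are assumptions.
// A "done" status without a ZIP URL is treated as not ready yet.
function classifyState(status: TaskStatus): PollDecision {
  if (status.state === 'done' && status.zipUrl) return 'done';
  if (status.state === 'failed') return 'failed';
  return 'wait';
}

// Poll until the task finishes, fails, or the attempt budget runs out.
// Resolves with the ZIP URL that the next step would download and parse.
async function pollUntilDone(
  fetchStatus: () => Promise<TaskStatus>,
  opts: { intervalMs: number; maxAttempts: number },
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<string> {
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    const status = await fetchStatus();
    const decision = classifyState(status);
    if (decision === 'done') return status.zipUrl as string;
    if (decision === 'failed') {
      throw new Error(status.error ?? 'MinerU Cloud task failed');
    }
    await sleep(opts.intervalMs);
  }
  throw new Error('Timed out waiting for MinerU Cloud task');
}
```

Injecting `fetchStatus` and `sleep` keeps the loop testable without network access or real delays; a bounded `maxAttempts` prevents a stuck task from blocking the request indefinitely.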

Design Decisions

  • Cloud vs self-hosted UI: Cloud shows API key input first (required) with test button; self-hosted shows base URL first (required). Both support optional secondary fields.
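  • Provider activation: Changed provider-config.ts from requiresBaseUrl: true to a keylessProviders set, so that mineru activates on base URL alone and mineru-cloud activates on API key. (The second commit in this PR reverts this change back to requiresBaseUrl: true to fix a failing test.)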
  • Shared parser: extractMinerUResult extracted to mineru-parser.ts — pure refactor, no behavior change for existing self-hosted path.
  • Model version: Uses vlm (recommended by MinerU docs) instead of pipeline (default).
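
The model-version choice ends up in the batch-creation request body. Below is a minimal sketch of building that body. The `files: [{name}]` shape, the `vlm` value, the hardcoded `ch` language and the `document.pdf` fallback are all mentioned in this PR's review notes; the key names `model_version` and `language` and the helper name `buildBatchRequest` are assumptions, not confirmed by the PR text.

```typescript
// Hypothetical request shape; the key names model_version and language are assumptions.
interface BatchRequest {
  files: { name: string }[];
  model_version: 'vlm' | 'pipeline';
  language: string;
}

// Build the body for the batch-creation call.
// Falls back to "document.pdf" when no source file name is available.
function buildBatchRequest(fileName: string = 'document.pdf'): BatchRequest {
  return {
    files: [{ name: fileName }], // object form, not a bare file_names: [string] list
    model_version: 'vlm',        // chosen over the default "pipeline"
    language: 'ch',              // hardcoded in this PR; making it configurable is deferred
  };
}
```

Usage would look like `JSON.stringify(buildBatchRequest('paper.pdf'))` as the POST body.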

Code Review Summary

Two rounds of automated code review were performed:

Round 1 found two issues, both fixed:

  • ✅ SSRF gap in cloud verification path — added validateUrlForSSRF before fetch
  • ✅ API request body mismatch — changed file_names: [string] to files: [{name: string}] per official docs
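
The SSRF fix runs a URL check before the verification fetch. This PR text does not show what validateUrlForSSRF does, so the following is only a sketch of the kind of check such a helper might perform. `isUrlSafeForFetch` and `isPrivateIPv4` are hypothetical names, and the sketch inspects the URL literal only (it does not resolve DNS, so a hostname that resolves to a private address would still pass).

```typescript
// Return true when the host is an IPv4 literal in a loopback, link-local,
// or private range. Hypothetical helper for illustration.
function isPrivateIPv4(host: string): boolean {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
  if (!m) return false;
  const a = Number(m[1]);
  const b = Number(m[2]);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Conservative pre-fetch check: http(s) only, no localhost,
// no IPv6 literals, no private IPv4 literals.
function isUrlSafeForFetch(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (url.protocol !== 'https:' && url.protocol !== 'http:') return false;
  const host = url.hostname.toLowerCase();
  if (host === 'localhost' || host.endsWith('.localhost')) return false;
  if (host.startsWith('[')) return false; // IPv6 literal: rejected outright in this sketch
  return !isPrivateIPv4(host);
}
```

Checking `url.hostname` after parsing, rather than the raw string, means alternative IPv4 spellings that the URL parser normalizes are compared in dotted-decimal form.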

Round 2 (final review) result: Ready to merge

  • 0 Critical issues
  • 3 Important (all accepted as low-risk):
    • sourceFileName not threaded to cloud client (falls back to document.pdf, works correctly)
    • Presigned/ZIP URLs from API response not SSRF-checked (trusted API response, not user input)
    • Server-configured URLs trusted without SSRF (consistent with all other providers)
  • 4 Minor (2 fixed, 2 deferred):
    • ✅ Duplicate MINERU_CLOUD_DEFAULT_BASE constant — extracted to constants.ts
    • ✅ Unguarded JSON.parse in shared parser — added try-catch
    • language: 'ch' hardcoded — deferred to follow-up (make configurable)
    • Verbose Blob construction — cosmetic, not blocking
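
The "unguarded JSON.parse" fix amounts to wrapping the parse in a try-catch so that malformed parser output becomes a handled error instead of an uncaught exception. A minimal sketch of that pattern follows; `ParseResult` and `safeJsonParse` are hypothetical names and not taken from mineru-parser.ts.

```typescript
// Result type that makes the failure case explicit to callers.
type ParseResult<T> = { ok: true; value: T } | { ok: false; error: string };

// Parse JSON without throwing. The cast to T is unchecked:
// callers still need to validate the shape of the parsed value.
function safeJsonParse<T = unknown>(text: string): ParseResult<T> {
  try {
    return { ok: true, value: JSON.parse(text) as T };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

Returning a tagged result rather than `null` keeps a legitimately parsed `null` distinguishable from a parse failure.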

CI checks: pnpm check ✅ | pnpm lint ✅ (0 errors) | npx tsc --noEmit ✅

Test Plan

  • Settings page: MinerU (Cloud) appears in PDF parser selector
  • Cloud UI: API key input (required) + base URL input (optional, placeholder https://mineru.net/api/v4)
  • Self-hosted MinerU UI: unchanged behavior (base URL + optional API key)
  • unpdf UI: unchanged behavior (no config fields)
  • Test connection button: enabled when API key is entered (cloud) / base URL is entered (self-hosted)
  • PDF parsing via MinerU Cloud: end-to-end successful
  • TypeScript, ESLint, Prettier all pass
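
The test-connection rule in this plan, combined with the later commit that also honors server-side configuration, can be sketched as a pure function. This is an illustration under assumptions, not the code in pdf-settings.tsx: `PdfSettingsState` and `canTestConnection` are hypothetical names, and disabling the button for unpdf is assumed from it having no config fields.

```typescript
// The 'mineru-cloud' member comes from this PR; the other two are inferred from the provider names.
type PDFProviderId = 'unpdf' | 'mineru' | 'mineru-cloud';

// Hypothetical view of the settings state the button depends on.
interface PdfSettingsState {
  apiKey: string;
  baseUrl: string;
  isServerConfigured: boolean;
}

// Decide whether the "test connection" button should be enabled.
function canTestConnection(id: PDFProviderId, s: PdfSettingsState): boolean {
  if (id === 'unpdf') return false; // assumption: built-in parser has nothing to test
  if (s.isServerConfigured) return true; // credentials live on the server
  // Cloud requires an API key; self-hosted requires a base URL.
  const required = id === 'mineru-cloud' ? s.apiKey : s.baseUrl;
  return required.trim().length > 0;
}
```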

🤖 Generated with Claude Code

wyuc and others added 3 commits April 16, 2026 13:09
Add MinerU Cloud (v4 API) as a new PDF provider alongside unpdf (built-in)
and MinerU (self-hosted). Users can now parse PDFs via the cloud API without
deploying a self-hosted MinerU instance.

- New provider `mineru-cloud` with API key auth and optional base URL
- Cloud flow: batch create → presigned upload → poll → ZIP download → parse
- Extract shared `mineru-parser.ts` from inline code (used by both paths)
- Settings UI adapted for cloud (API key required) vs self-hosted (base URL required)
- Server-side env var support: PDF_MINERU_CLOUD_API_KEY / PDF_MINERU_CLOUD_BASE_URL
- SSRF protection on cloud verification endpoint
- i18n translations for en-US, zh-CN, ja-JP, ru-RU

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Reverts the PDF section from keylessProviders back to requiresBaseUrl: true
to fix failing test: mineru with only apiKey (no baseUrl) should not activate.
mineru-cloud API key can be configured via browser settings UI instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The test connection button was disabled when the API key was only
configured server-side (not entered by user in browser). Now checks
isServerConfigured in addition to user-entered values.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@cosarah (Collaborator) left a comment

LGTM
verified locally

@cosarah merged commit 91c4015 into main on Apr 16, 2026 (3 checks passed)
@wyuc deleted the worktree-mineru-cloud branch on April 16, 2026 07:23