[MCP Server] LLM Inference MCP - Multi-model routing, cost optimization #4035

@zhaohongyuziranerran

LLM Inference MCP Server

Multi-model LLM routing with cost optimization, structured output, and batch inference.

Features

  • Smart model routing (8 task profiles; a sketch follows this list)
  • Multi-provider support (OpenAI, DeepSeek, Anthropic, vLLM)
  • Structured JSON output from any model
  • Batch inference (up to 50 parallel prompts)
  • Token counting and cost estimation
  • Side-by-side model comparison
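
As an illustration of how profile-based routing might work, here is a minimal Python sketch. The profile names, model IDs, and the route helper are hypothetical, cover only a few of the eight profiles, and are not taken from the repository:

```python
# Hypothetical routing table: task profile -> (provider, model).
# Profile and model names are illustrative, not the server's actual config.
ROUTES = {
    "chat":      ("openai",    "gpt-4o-mini"),
    "reasoning": ("deepseek",  "deepseek-reasoner"),
    "code":      ("anthropic", "claude-sonnet-4-0"),
    "local":     ("vllm",      "meta-llama/Llama-3.1-8B-Instruct"),
}

def route(task_profile: str, cost_sensitive: bool = False) -> tuple[str, str]:
    """Pick a (provider, model) pair for a task profile.

    cost_sensitive=True models the advertised cost optimization by
    falling back to the cheapest capable profile.
    """
    if cost_sensitive:
        return ROUTES["chat"]
    return ROUTES.get(task_profile, ROUTES["chat"])
```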

Tools (7)

list_models, chat_completion, structured_output, batch_inference, count_tokens, estimate_inference_cost, compare_models
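
These are standard MCP tools, so any MCP client can invoke them. Below is a minimal sketch using the official mcp Python SDK; the launch command and the chat_completion argument names are assumptions, and the repository's README defines the real schema:

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Assumed launch command; substitute whatever the README specifies.
    params = StdioServerParameters(command="python", args=["-m", "llm_inference_mcp"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Confirm the seven advertised tools are exposed.
            tools = await session.list_tools()
            print([t.name for t in tools.tools])
            # Argument names here are illustrative, not the server's schema.
            result = await session.call_tool(
                "chat_completion",
                arguments={"prompt": "Summarize MCP in one sentence.",
                           "task_profile": "chat"},
            )
            print(result.content)

asyncio.run(main())
```

The same call_tool pattern applies to batch_inference, estimate_inference_cost, and the rest; only the tool name and argument dictionary change.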

GitHub: https://github.com/zhaohongyuziranerran/llm-inference-mcp
