Add Qwen3.5 model support via opt-in dependency extra by ricky-chaoju · Pull Request #154 · vllm-project/vllm-metal

ricky-chaoju · 2026-03-11T14:24:06Z

Summary

Enable Qwen3.5 (dense and MoE) model support by adding a [qwen35] optional dependency extra. Base dependencies are unchanged, so existing users are unaffected.

Consolidates and supersedes #121, #123, #129.

Why the original PRs' runtime fixes are unnecessary

| Original PR fix | Why not needed |
| --- | --- |
| #121 hybrid cache batched decode fallback | mlx-lm ≥0.31.0 Qwen3.5 attention correctly handles `BatchKVCache.offset` as `mx.array`, so batched decode works without fallback |
| #123 rope validation monkeypatch | Qwen3.5 config has `partial_rotary_factor=0.25`, which triggers transformers' built-in `set()` coercion path |
| #129 mlx-lm model alias shim | mlx-lm ≥0.31.0 natively includes `qwen3_5` and `qwen3_5_moe` modules |
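
The first and third rows can be sanity-checked locally before installing the extra. The following is a minimal sketch, not part of this PR: it assumes the model files are importable as `mlx_lm.models.qwen3_5` and `mlx_lm.models.qwen3_5_moe` (the module names come from the table, the `mlx_lm.models` package path is an assumption), and the helper names are hypothetical.

```python
import re
from importlib import metadata, util

# Minimum mlx-lm version quoted in the table above.
MIN_MLX_LM = (0, 31, 0)
# Module names quoted in the table; the package path below is an assumption.
QWEN35_MODULES = ("qwen3_5", "qwen3_5_moe")


def version_tuple(version):
    """Return the leading numeric part of a version string as a tuple of ints.

    Pre-release suffixes such as "rc1" are ignored, so "0.31.0rc1" compares
    equal to "0.31.0". That is good enough for a coarse pre-flight check.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def module_available(dotted_name):
    """Report whether a module can be located, without raising if it cannot."""
    try:
        return util.find_spec(dotted_name) is not None
    except (ImportError, ValueError, AttributeError):
        # find_spec imports the parent package, which fails if it is missing.
        return False


def check_qwen35_prerequisites():
    """Collect the installed mlx-lm version and Qwen3.5 module availability."""
    try:
        installed = metadata.version("mlx-lm")
    except metadata.PackageNotFoundError:
        installed = None
    version_ok = installed is not None and version_tuple(installed) >= MIN_MLX_LM
    modules = {
        name: module_available(f"mlx_lm.models.{name}") for name in QWEN35_MODULES
    }
    return {"mlx_lm_version": installed, "version_ok": version_ok, "modules": modules}


if __name__ == "__main__":
    print(check_qwen35_prerequisites())
```

If `version_ok` is false or either module is reported missing, the alias shim from #129 would still be needed in that environment.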

Installation

# Qwen3.5 users
VLLM_VERSION=0.17.0 ./install.sh
pip install 'vllm-metal[qwen35]'

Signed-off-by: RickyChen / 陳昭儒 <ricky.chen@infinirc.com>

LxYuan0420

Thanks for consolidating those PRs. The compatibility analysis is useful and we'll reference it.

Closing this PR because:

- transformers>=5.0.0 in an optional extra isn't isolated: it upgrades the entire environment and breaks vLLM 0.14.x and all existing models. Transformers v5 may be fine, but we need to verify carefully.
- The vLLM upgrade is a project-wide migration, not a per-model opt-in. We're waiting on #134 (@ericcurtin's wheel install) to confirm the macOS path before bumping further.
- Qwen3.5 has ongoing MLX incompatibilities: the cache is broken for hybrid architectures (mlx-lm#980, mlx-lm#903) and server tool calls fail (mlx-lm#905). Not ready to claim support until upstream stabilizes.
- The VLLM_VERSION env var is fragile: the default install must always produce a working setup.
- There are no smoke tests for existing models and no end-to-end tests for Qwen3.5.
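
To make the isolation concern concrete: an extra only contributes additional requirement lines, and pip resolves them into the same environment as the base dependencies. The sketch below lists what a given extra would pull in by reading a distribution's declared requirements. It is illustrative only: the helper names are hypothetical, and the sample requirement strings are assembled from the versions mentioned in this thread (plus a made-up `mlx` base pin), not copied from the actual `pyproject.toml`.

```python
from importlib import metadata


def requirements_for_extra(requirements, extra):
    """Return the requirement specs that are gated on the given extra.

    `requirements` is a list of strings in the form produced by
    importlib.metadata.requires(), e.g. 'pkg>=1.0; extra == "name"'.
    """
    selected = []
    for line in requirements:
        spec, _, marker = line.partition(";")
        # Normalize spacing and quote style so both common spellings match.
        normalized = marker.replace(" ", "").replace("'", '"')
        if f'extra=="{extra}"' in normalized:
            selected.append(spec.strip())
    return selected


def installed_extra_requirements(dist_name, extra):
    """Look up an installed distribution; return [] if it is not installed."""
    try:
        declared = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []
    return requirements_for_extra(declared, extra)


# Hypothetical metadata, for illustration only.
SAMPLE_REQUIREMENTS = [
    "mlx>=0.20.0",
    'transformers>=5.0.0; extra == "qwen35"',
    "mlx-lm>=0.31.0 ; extra == 'qwen35'",
    'pytest; extra == "dev"',
]

if __name__ == "__main__":
    # Every spec printed here is installed into the shared environment,
    # so a transformers major bump affects all models, not only Qwen3.5.
    print(requirements_for_extra(SAMPLE_REQUIREMENTS, "qwen35"))
    print(installed_extra_requirements("vllm-metal", "qwen35"))
```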

Suggested approach: Break this into steps: (1) first land the baseline upgrade with smoke tests proving existing models still work, then (2) add Qwen3.5 support once upstream MLX issues are resolved.

Happy to review a re-scoped version. 🙏

Upgrade deps and installer for Qwen3.5 support

8c07c3c

Signed-off-by: RickyChen / 陳昭儒 <ricky.chen@infinirc.com>

ricky-chaoju force-pushed the feat/qwen3.5-dependency-upgrade branch from b0b35a7 to 8c07c3c Compare March 11, 2026 14:24

ricky-chaoju marked this pull request as ready for review March 11, 2026 14:29

LxYuan0420 reviewed Mar 11, 2026

LxYuan0420 closed this Mar 11, 2026

ricky-chaoju deleted the feat/qwen3.5-dependency-upgrade branch March 17, 2026 07:40

ricky-chaoju wants to merge 1 commit into vllm-project:main from ricky-chaoju:feat/qwen3.5-dependency-upgrade
