Bump mlx-lm/mlx-vlm deps and add Qwen3.5-0.8B smoke test by ricky-chaoju · Pull Request #174 · vllm-project/vllm-metal

ricky-chaoju · 2026-03-18T09:20:14Z

Summary

Follow-up to #169 (transformers>=5.0.0 upgrade).

Bump mlx>=0.31.0, mlx-lm>=0.31.0, mlx-vlm>=0.4.0
Add tests/test_qwen35_smoke.py: golden token comparison for Qwen/Qwen3.5-0.8B (greedy decoding, 5/5 passed)

Qwen3.5 uses the qwen3_5 architecture, which requires transformers>=5.0.0 and mlx-lm>=0.30.0. A passing smoke test therefore shows that the upgraded dependency stack works end-to-end on Metal.
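As a rough illustration of the version floors listed in this summary, the sketch below checks a set of installed versions against them. It is not code from this PR: `MIN_VERSIONS`, `parse_version`, `meets_minimum`, `find_violations`, and `installed_versions` are hypothetical names, and the comparison is deliberately simplified (it only looks at the leading numeric release segment, so a pre-release such as `0.31.0rc1` is treated like `0.31.0`).

```python
import re
from importlib import metadata

# Version floors copied from the PR summary (the transformers floor comes from #169).
MIN_VERSIONS = {
    "mlx": "0.31.0",
    "mlx-lm": "0.31.0",
    "mlx-vlm": "0.4.0",
    "transformers": "5.0.0",
}


def parse_version(version):
    """Return the leading numeric release segment as a tuple of ints.

    Simplification: suffixes such as 'rc1' or '.dev0' are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unparseable version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """True if `installed` is at least `minimum` (zero-padded comparison)."""
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def find_violations(installed):
    """List human-readable problems for a {package: version} mapping."""
    problems = []
    for name, minimum in MIN_VERSIONS.items():
        version = installed.get(name)
        if version is None:
            problems.append(f"{name} not installed")
        elif not meets_minimum(version, minimum):
            problems.append(f"{name} {version} < {minimum}")
    return problems


def installed_versions():
    """Collect versions of the packages above from the current environment."""
    found = {}
    for name in MIN_VERSIONS:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # reported as "not installed" by find_violations
    return found
```

Calling `find_violations(installed_versions())` in an environment returns an empty list when every floor is met. For real projects, a full PEP 440 comparison is the safer choice than this simplified parser.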

Test

test_qwen35_smoke

test_paged_deterministic
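The contents of tests/test_qwen35_smoke.py are not shown in this thread, so the following is only a minimal sketch of what a golden token comparison under greedy decoding generally looks like. All names (`greedy_decode`, `first_mismatch`, `toy_logits`) are hypothetical, and a toy logits function stands in for the real Qwen/Qwen3.5-0.8B model.

```python
from typing import Callable, List, Optional, Sequence


def greedy_decode(
    next_token_logits: Callable[[List[int]], Sequence[float]],
    prompt_ids: Sequence[int],
    max_new_tokens: int,
) -> List[int]:
    """Generate tokens by always taking the argmax of the next-token logits."""
    context = list(prompt_ids)
    generated = []
    for _ in range(max_new_tokens):
        logits = next_token_logits(context)
        # Greedy decoding: pick the highest-scoring token (first one wins on ties).
        token = max(range(len(logits)), key=logits.__getitem__)
        context.append(token)
        generated.append(token)
    return generated


def first_mismatch(generated: Sequence[int], golden: Sequence[int]) -> Optional[int]:
    """Return the index of the first differing token, or None if they match."""
    for index, (got, expected) in enumerate(zip(generated, golden)):
        if got != expected:
            return index
    if len(generated) != len(golden):
        return min(len(generated), len(golden))
    return None


def toy_logits(context: List[int]) -> List[float]:
    """Stand-in model over a 5-token vocabulary: predicts (last token + 1) % 5."""
    target = (context[-1] + 1) % 5
    return [1.0 if token == target else 0.0 for token in range(5)]


# Golden tokens would normally be recorded once from a trusted reference run.
GOLDEN = [1, 2, 3, 4]
output = greedy_decode(toy_logits, prompt_ids=[0], max_new_tokens=4)
assert first_mismatch(output, GOLDEN) is None
```

Greedy decoding is the natural fit for this kind of test because it removes sampling randomness, so any token-level difference from the golden sequence points at a change in the model or runtime, not at the sampler.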

LxYuan0420 · 2026-03-18T09:31:59Z

Merged #169 now, sorry for the delay. LGTM and just needs a rebase since the transformers bump commit (b2fa307) is already on main now.

ricky-chaoju · 2026-03-18T09:34:18Z

done! Thanks @LxYuan0420

ricky-chaoju · 2026-03-18T11:09:46Z

@LxYuan0420 Please check #176 first before merging this one. Thanks!

WindChimeRan · 2026-03-18T19:46:23Z

@ricky-chaoju Thanks! This PR will give us ground-truth labels and unblock flash-linear-attn for the paged path!

ricky-chaoju mentioned this pull request Mar 18, 2026

Upgrade transformers to >=5.0.0 #169

Merged

Bump mlx deps and add Qwen3.5-0.8B smoke test

fdef8d3

Signed-off-by: Chao-Ju Chen <ricky.chen@infinirc.com>

ricky-chaoju force-pushed the deps/bump-mlx-deps branch from f4dc461 to fdef8d3 on March 18, 2026 09:33

LxYuan0420 approved these changes Mar 18, 2026

WindChimeRan mentioned this pull request Mar 18, 2026

[RoadMap] [Paged KV] Continuous Batching + Chunked Prefilling + paged varlen flash att #148

Open

31 tasks

LxYuan0420 merged commit 3c683ea into vllm-project:main Mar 19, 2026
7 of 13 checks passed

This was referenced Mar 19, 2026

Add Qwen3.5 model support #123

Closed

Fix Qwen3.5 MoE load path in vLLM Metal #129

Closed

ricky-chaoju deleted the deps/bump-mlx-deps branch March 19, 2026 02:39

LxYuan0420 merged 1 commit into vllm-project:main from ricky-chaoju:deps/bump-mlx-deps