
Upgrade transformers to >=5.0.0 #169

Merged
LxYuan0420 merged 1 commit into vllm-project:main from ricky-chaoju:deps/upgrade-transformers-5 on Mar 18, 2026



@ricky-chaoju
Contributor

Summary

Upgrade transformers from >=4.40.0 to >=5.0.0. mlx-lm 0.30+ and mlx-vlm 0.3.10+ require transformers 5.x for newer model architectures (Qwen3.5, Nemotron, etc.). Upstream vLLM is tracking the official upgrade in vllm-project/vllm#30566, but it has not landed yet; this PR unblocks the MLX side first.
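
A minimal sketch of how the new floor could be checked in a local environment before running the tests. It is not part of this PR: the helper names (`release_tuple`, `meets_minimum`, `check_installed`) are hypothetical, and the comparison deliberately ignores pre-release suffixes such as `rc1`.

```python
import re
from importlib import metadata

# Floor taken from this PR's summary; extend the mapping as needed.
MINIMUMS = {"transformers": "5.0.0"}


def release_tuple(version: str) -> tuple:
    """Return the leading numeric release segment, e.g. '5.3.0' -> (5, 3, 0)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare release tuples, padding the shorter one with zeros."""
    have, need = release_tuple(installed), release_tuple(minimum)
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def check_installed(minimums: dict) -> list:
    """Return one human-readable problem per package that misses its floor."""
    problems = []
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (need >={minimum})")
            continue
        if not meets_minimum(installed, minimum):
            problems.append(f"{name}: found {installed}, need >={minimum}")
    return problems


if __name__ == "__main__":
    for line in check_installed(MINIMUMS) or ["all minimums satisfied"]:
        print(line)
```

For real projects, `packaging.version.Version` handles pre-releases and other edge cases more faithfully than this hand-rolled comparison.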

Test results

All tests run with vllm 0.17.1 + transformers 5.3.0:

  • Smoke test (scripts/test.sh): server starts, chat completions pass
    [Screenshot, 2026-03-17, 9:35:28 PM]
  • Golden token test (test_paged_deterministic.py):
    [Screenshot, 2026-03-17, 9:36:13 PM]

Signed-off-by: Chao-Ju Chen <ricky.chen@infinirc.com>
Collaborator

@LxYuan0420 LxYuan0420 left a comment


Thanks Ricky. I’m OK merging this as a temporary bridge.

Please follow up with a separate PR to bump mlx-lm / mlx-vlm minimums (and ideally a quick smoke run on a "needs Transformers v5" model) so the benefit is explicit. Once upstream vLLM lands the official v5 upgrade, we should drop the override in install.sh.

@ricky-chaoju
Contributor Author

> Thanks Ricky. I’m OK merging this as a temporary bridge.
>
> Please follow up with a separate PR to bump mlx-lm / mlx-vlm minimums (and ideally a quick smoke run on a "needs Transformers v5" model) so the benefit is explicit. Once upstream vLLM lands the official v5 upgrade, we should drop the override in install.sh.

I opened: #174 (bump mlx-lm/mlx-vlm + Qwen3.5-0.8B smoke test)

@LxYuan0420 LxYuan0420 merged commit 68d53b3 into vllm-project:main Mar 18, 2026
5 checks passed
@ricky-chaoju ricky-chaoju deleted the deps/upgrade-transformers-5 branch March 18, 2026 09:25
LxYuan0420 pushed a commit that referenced this pull request Mar 19, 2026
## Summary
Follow-up to #169 (transformers>=5.0.0 upgrade).

- Bump `mlx>=0.31.0`, `mlx-lm>=0.31.0`, `mlx-vlm>=0.4.0`
- Add `tests/test_qwen35_smoke.py`: golden token comparison for
Qwen/Qwen3.5-0.8B (greedy decoding, 5/5 passed)

Qwen3.5 uses the `qwen3_5` architecture, which requires
transformers>=5.0.0 and mlx-lm>=0.30.0. This shows that the upgraded
dependency stack works end-to-end on Metal.
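
For readers unfamiliar with golden token tests, here is a minimal sketch of the comparison idea under greedy decoding: the same prompt should yield the same token IDs on every run. It is not the contents of `tests/test_qwen35_smoke.py`. The names `first_mismatch` and `compare_golden` are hypothetical, `generate` stands in for whatever callable produces token IDs, and the token values below are made up.

```python
from typing import Callable, Dict, Optional, Sequence


def first_mismatch(expected: Sequence[int], actual: Sequence[int]) -> Optional[int]:
    """Return the index of the first differing token, or None if identical."""
    for index, (want, got) in enumerate(zip(expected, actual)):
        if want != got:
            return index
    if len(expected) != len(actual):
        # One sequence is a strict prefix of the other.
        return min(len(expected), len(actual))
    return None


def compare_golden(
    golden: Dict[str, Sequence[int]],
    generate: Callable[[str], Sequence[int]],
) -> Dict[str, Optional[int]]:
    """Map each prompt to its first mismatch index (None means it matched)."""
    return {
        prompt: first_mismatch(tokens, generate(prompt))
        for prompt, tokens in golden.items()
    }


if __name__ == "__main__":
    # Placeholder golden data and a stub generator, for illustration only.
    golden = {"hello": [1, 2, 3], "bye": [4, 5]}
    stub_outputs = {"hello": [1, 2, 3], "bye": [4, 9]}
    report = compare_golden(golden, lambda prompt: stub_outputs[prompt])
    for prompt, index in report.items():
        status = "match" if index is None else f"diverges at token {index}"
        print(f"{prompt!r}: {status}")
```

Reporting the first diverging index, rather than a bare pass/fail, makes it easier to tell a tokenizer or prompt-template change (early divergence) from numerical drift (late divergence).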

## Test

### test_qwen35_smoke
<img width="1087" height="628" alt="Screenshot 2026-03-18 5:09:56 PM"
src="https://github.com/user-attachments/assets/f58f4883-1776-4f69-b2d1-10a1bad2938d"
/>

### test_paged_deterministic
<img width="1108" height="637" alt="Screenshot 2026-03-18 5:10:22 PM"
src="https://github.com/user-attachments/assets/f501d7d0-714c-4b05-931f-00ce55444240"
/>

Signed-off-by: Chao-Ju Chen <ricky.chen@infinirc.com>