Conversation
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Code Review
This pull request aims to update the transformers library to version 5. The changes correctly update the version in requirements/test.in and requirements/nightly_torch_test.txt, and also add the --pre flag to uv pip install in the Dockerfile to allow installation of the release candidate. However, there is a critical oversight: requirements/common.txt still contains a constraint transformers < 5. This will lead to build failures for any configuration that relies on common.txt. This file must be updated to allow transformers v5 for this PR to be mergeable.
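For illustration only, the change the review is asking for would look roughly like this; the `transformers < 5` line is quoted from the review itself, while the replacement line and surrounding context are assumptions:

```diff
 # requirements/common.txt (sketch; actual file contents not shown in the review)
-transformers < 5
+transformers  # allow v5; release candidates still need --pre at install time
```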
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Documentation preview: https://vllm--30566.org.readthedocs.build/en/30566/
### What does this PR do?

Refer to vllm-project/vllm#30566 for all the patches needed for Transformers v5. This PR is a Transformers v5 compatibility sweep plus guardrails for token ID shape consistency.

- Remove the hard `<5.0.0` block by changing dependency pinning in requirements.txt.
- Add a single compat resolver `get_auto_model_for_vision2seq()` in transformers_compat.py to handle `AutoModelForVision2Seq` vs `AutoModelForImageTextToText`, and switch model-loading/registration codepaths to use that resolver instead of direct imports.
- Introduce `normalize_token_ids(...)` in tokenizer.py, which normalizes `apply_chat_template(tokenize=True)` outputs to flat `list[int]` across v4/v5 return-shape differences.

### Checklist Before Starting

- [X] Search for similar PRs. Paste at least one query link here: ...
- [X] Format the PR title as `[{modules}] {type}: {description}` (This will be checked by the CI)
  - `{modules}` include `fsdp`, `megatron`, `veomni`, `sglang`, `vllm`, `rollout`, `trainer`, `ci`, `training_utils`, `recipe`, `hardware`, `deployment`, `ray`, `worker`, `single_controller`, `misc`, `perf`, `model`, `algo`, `env`, `tool`, `ckpt`, `doc`, `data`, `cfg`, `reward`, `fully_async`, `one_step_off`
  - If this PR involves multiple modules, separate them with `,` like `[megatron, fsdp, doc]`
  - `{type}` is in `feat`, `fix`, `refactor`, `chore`, `test`
  - If this PR breaks any API (CLI arguments, config, function signature, etc.), add `[BREAKING]` to the beginning of the title.
  - Example: `[BREAKING][fsdp, megatron] feat: dynamic batching`

### Test

> For changes that can not be tested by CI (e.g., algorithm implementation, new model support), validate by experiment(s) and show results like training curve plots, evaluation results, etc.

### API and Usage Example

> Demonstrate how the API changes if any, and provide usage example(s) if possible.

```python
# Add code snippet or script demonstrating how to use this
```

### Design & Code Changes

> Demonstrate the high-level design if this PR is complex, and list the specific changes.

### Checklist Before Submitting

> [!IMPORTANT]
> Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.

- [X] Read the [Contribute Guide](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md).
- [X] Apply [pre-commit checks](https://github.com/volcengine/verl/blob/main/CONTRIBUTING.md#code-linting-and-formatting): `pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=always`
- [X] Add / Update [the documentation](https://github.com/volcengine/verl/tree/main/docs).
- [X] Add unit or end-to-end test(s) to [the CI workflow](https://github.com/volcengine/verl/tree/main/.github/workflows) to cover all the code. If not feasible, explain why: ...
- [X] Once your PR is ready for CI, send a message in [the `ci-request` channel](https://verl-project.slack.com/archives/C091TCESWB1) in [the `verl` Slack workspace](https://join.slack.com/t/verl-project/shared_invite/zt-3855yhg8g-CTkqXu~hKojPCmo7k_yXTQ). (If not accessible, please try [the Feishu group (飞书群)](https://applink.larkoffice.com/client/chat/chatter/add_by_link?link_token=772jd4f1-cd91-441e-a820-498c6614126a).)
- [X] If your PR is related to the `recipe` submodule, please also update the reference to the submodule commit via `git submodule update --remote` or `cd recipe && git pull origin main`.

---------

Signed-off-by: Hollow Man <hollowman@opensuse.org>
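A minimal sketch of what such a compat resolver could look like; the function name comes from the PR description, while the body and the `module_name` parameter are assumptions for illustration:

```python
import importlib


def get_auto_model_for_vision2seq(module_name: str = "transformers"):
    """Return the available vision-to-sequence auto-model class.

    Transformers v5 consolidates on AutoModelForImageTextToText, while v4
    exposes AutoModelForVision2Seq; prefer the newer name when present.
    """
    module = importlib.import_module(module_name)
    for name in ("AutoModelForImageTextToText", "AutoModelForVision2Seq"):
        cls = getattr(module, name, None)
        if cls is not None:
            return cls
    raise ImportError(
        f"{module_name} provides neither AutoModelForImageTextToText "
        "nor AutoModelForVision2Seq"
    )
```

Call sites would then replace a direct `from transformers import AutoModelForVision2Seq` with `model_cls = get_auto_model_for_vision2seq()`, so the same codepath works on both major versions.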
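A sketch of the token-ID normalization described in the PR description; the helper name comes from the PR, but the exact return shapes handled here are assumptions about the v4/v5 differences:

```python
def normalize_token_ids(output):
    """Normalize apply_chat_template(tokenize=True) output to a flat list[int].

    Depending on the Transformers version and arguments, the output may be a
    flat list, a batched (nested) list, a tensor, or a mapping with an
    "input_ids" key (e.g. a BatchEncoding when return_dict=True).
    """
    # Mapping-like output: pull out the token IDs
    if hasattr(output, "keys") and "input_ids" in output:
        output = output["input_ids"]
    # Tensor output: convert to nested Python lists
    if hasattr(output, "tolist"):
        output = output.tolist()
    # Batched output: unwrap a single sequence from the outer list
    if output and isinstance(output[0], (list, tuple)):
        if len(output) != 1:
            raise ValueError(f"expected one sequence, got {len(output)}")
        output = list(output[0])
    return [int(t) for t in output]
```

This keeps downstream code agnostic to which major version produced the chat-template output.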
Hi @hmellor, the pre-commit checks have failed. Please run `uv pip install pre-commit`, then `pre-commit install` and `pre-commit run --all-files`. Then commit the changes and push to your branch.
This pull request has merge conflicts that must be resolved before it can be merged.
## Summary

Upgrade transformers from `>=4.40.0` to `>=5.0.0`. mlx-lm 0.30+ and mlx-vlm 0.3.10+ require transformers 5.x for newer model architectures (Qwen3.5, Nemotron, etc.). vllm upstream is tracking the official upgrade in vllm-project/vllm#30566 but it hasn't landed yet; this unblocks the MLX side first.

## Test results

All tests run with vllm 0.17.1 + transformers 5.3.0:

- Smoke test (`scripts/test.sh`): server starts, chat completions pass
  <img width="1119" height="596" alt="Screenshot 2026-03-17 9:35 PM" src="https://github.com/user-attachments/assets/bfb57897-7bf2-411d-bff3-dc54c81d59ec" />
- Golden token test (`test_paged_deterministic.py`):
  <img width="1109" height="572" alt="Screenshot 2026-03-17 9:36 PM" src="https://github.com/user-attachments/assets/8a98de01-7739-4b81-820f-9bd1a2942ba8" />

Signed-off-by: Chao-Ju Chen <ricky.chen@infinirc.com>
Changes:
- Transformers `5.x.y`
- `0.22.2` (as is required by Transformers `5.0.0`)
- `0.18.1` so that huggingface/peft@41c07f0 is included (guards import of `HybridCache` on Transformers version)
- `1.1.0` so that 4-bit bnb can work on Transformers v5
- `2.3.0` so that state-spaces/mamba@35e927b is included (removes import that was deleted in Transformers v5)
- Add `HF_HUB_DOWNLOAD_TIMEOUT=60` to the CI environment to deal with the shortened timeout in `huggingface-hub>=1` since it switched to `httpx`
- `4.57.5` installed

Architectures/models that will no longer work after the upgrade:
- `Plamo2ForCausalLM` - Custom model code uses `_tied_weight_keys: list[str]` but Transformers v5 now expects `_tied_weight_keys: dict[str, str]`
- `InternS1ForConditionalGeneration` - Custom tokenizer code is not compatible with Transformers v5
- `MiniCPMO` - Custom processor code is not compatible with Transformers v5
- `MiniCPMV` - Custom processing code on the Hub is incompatible with Transformers v5 (PR made but unmerged)
- `OpenCUAForConditionalGeneration` - Custom code is not compatible with Transformers v5
- `OpenPanguVLForConditionalGeneration` - `OpenPanguVLVideoProcessorInitKwargs` does not specify `total=False`, making all kwargs required
- `Ovis2_5` - Custom processor code is not compatible with Transformers v5
- `Ovis2_6_MoeForCausalLM` - Custom processor code is not compatible with Transformers v5

> [!CAUTION]
> 30d8b3d must be reverted before this can be merged
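To illustrate the `Plamo2ForCausalLM` breakage listed above, here is a hedged sketch of the attribute shape change; the class names and parameter names below are made up for illustration, only the `list[str]` vs `dict[str, str]` typing comes from the list:

```python
# Hypothetical classes illustrating the _tied_weight_keys shape change that
# breaks custom model code written against Transformers v4.

class V4StyleModel:
    # Transformers v4 custom code: a plain list of tied parameter names
    _tied_weight_keys: list[str] = ["lm_head.weight"]


class V5StyleModel:
    # Transformers v5: a mapping from each tied parameter to its source
    _tied_weight_keys: dict[str, str] = {
        "lm_head.weight": "model.embed_tokens.weight",
    }
```

Custom code on the Hub that still declares the v4-style list fails when v5's weight-tying machinery iterates the attribute as a mapping.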
Supplementary PRs:
Transformers:
First 10:
- `getattr` in `standardize_rope_params` because `rope_parameters` not always present huggingface/transformers#42593
- `RotaryEmbeddingConfigMixin` huggingface/transformers#42517
- `validation_fn` to be `None` in `validate_rope` huggingface/transformers#42601
- `rope_parameters` to empty `dict` if there is something to put in it huggingface/transformers#42651
- `torch.autocast` if it will have an effect huggingface/transformers#42747
- `pad_token_id` huggingface/transformers#43453

Second 10:
- `tied_weight_keys` in-place huggingface/transformers#43619
- `convert_rope_params_to_dict` so it uses `rope_theta` from the config huggingface/transformers#43766
- [`Jamba`] Fallback to slow path and warn instead of error out huggingface/transformers#43889

Third 10:
- [`Mamba`] Fix kernel loading huggingface/transformers#44176
- `from_dict` backward compatibility with old remote code huggingface/transformers#44245

Fourth 10:
- `dtype` for subconfig when `_from_config` huggingface/transformers#44629
- `supports_{tp/pp}_plan` huggingface/transformers#44696
- `set_encoder` huggingface/transformers#44698

Fifth N:
vLLM:
First 10:
- `--rope-scaling` and `--rope-theta` #28006
- `rope_scaling` to `rope_parameters` in preparation for Transformers v5 #28542
- `partial_rotary_factor` from `rope_parameters` #29966
- `get_rope` to use `rope_parameters["partial_rotary_factor"]`, not `rotary_dim` #30389

Second 10:
- `httpx` logger less annoying when Transformers v5 is installed #30480
- `head_mask` from Ultravox and Swin #30764
- `HfHubHTTPError` in LoRA test #30768
- `position_embedding_type` will be present for BERT and RoBERTa models #30770
- `WeightRenaming` for Transformers modeling backend #31545
- `min_pixels`/`max_pixels` from Qwen2VL's processor #33208
- `tie_word_embeddings` for multimodal models in Transformers v5 #33359
- `return_dict` for `apply_chat_template` #33372

Third 10:
- `lm-eval` version for Transformers v5 compatibility #33994
- `mamba-ssm` version in CI for Transformers v5 compatibility #34233

Fourth 10:
Fifth 10:
- `padding_index` from models that don't use it for better Transformers v5 compatibility #35189
- `hf_override_fn` when it modifies `model_type` #35200
- `inputs_embeds` like Gemma 3 #36787
- `ExaoneMoeMTP` test that never ran in Transformers v4 #36792

Sixth 10:
- [`UltraVox`] Fix output type #37224
- `layer_type_validation` for Transformers v5 #37398

Seventh N:
Model repos:
Merged - 7:
Unmerged - 15:
Other:
- `modify_gen_kwargs` in `vllm_vlms.py` EleutherAI/lm-evaluation-harness#3573