Fix several based models' pipeline parallel support#44699

Open
hmellor wants to merge 2 commits into huggingface:main from hmellor:fix-qwen2-pp

@hmellor hmellor commented Mar 14, 2026

These models have base_model_pp_plans but currently do not work with pipeline parallelism, because the base model's forward pass assumes that every layer is a Qwen2VLDecoderLayer.

That is, if one of the layers is removed or replaced with Identity, accessing decoder_layer.attention_type raises an AttributeError.

This PR fixes the issue in /src/transformers/models/qwen2_vl/modeling_qwen2_vl.py and runs python utils/modular_model_converter.py to propagate the fix to the other models that inherit the behaviour.
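
To illustrate the failure mode described above, here is a minimal, dependency-free sketch. It is not the code from this PR: `DecoderLayer`, `PassThroughLayer`, `forward_fragile`, `forward_robust`, and the `layer_types` list are all hypothetical stand-ins, and reading the layer type from a per-index list is only one possible way to avoid touching attributes of a placeholder layer.

```python
class DecoderLayer:
    """Hypothetical stand-in for a real decoder layer such as Qwen2VLDecoderLayer."""

    def __init__(self, attention_type):
        self.attention_type = attention_type

    def __call__(self, hidden_states, attention_mask=None):
        # Dummy computation so the effect of the layer is observable.
        return hidden_states + 1


class PassThroughLayer:
    """Hypothetical placeholder for a layer that lives on another pipeline stage.

    It has no `attention_type` attribute, like an Identity-style replacement.
    """

    def __call__(self, hidden_states, attention_mask=None):
        return hidden_states


def forward_fragile(layers, masks, hidden_states):
    # Reads the mask key from the layer itself, so a placeholder layer
    # without `attention_type` raises AttributeError.
    for layer in layers:
        mask = masks[layer.attention_type]
        hidden_states = layer(hidden_states, attention_mask=mask)
    return hidden_states


def forward_robust(layers, layer_types, masks, hidden_states):
    # Assumption: the layer type is available per index (for example from
    # the model config), so no attribute of the layer object is needed.
    for i, layer in enumerate(layers):
        mask = masks[layer_types[i]]
        hidden_states = layer(hidden_states, attention_mask=mask)
    return hidden_states


# A pipeline stage that only owns the second of two layers.
layers = [PassThroughLayer(), DecoderLayer("full_attention")]
layer_types = ["full_attention", "full_attention"]
masks = {"full_attention": None}

try:
    forward_fragile(layers, masks, 0)
except AttributeError as err:
    print(f"fragile forward failed: {err}")

print(forward_robust(layers, layer_types, masks, 0))
```

The fragile loop fails on the first placeholder layer, while the robust loop runs through because it never inspects the layer object for its type.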

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
@hmellor hmellor changed the title from "Fix Qwen2-VL based models' pipeline parallel support" to "Fix several based models' pipeline parallel support" Mar 14, 2026
@github-actions

[For maintainers] Suggested jobs to run (before merge)

run-slow: afmoe, bamba, cohere2, cwm, dots1, gemma2, gemma3, gemma3n, gpt_oss, granitemoehybrid, lfm2, lfm2_moe, llama4, minimax, ministral
