Add position_ids to MptForCausalLM forward pass #44707

Open
saivedant169 wants to merge 1 commit into huggingface:main from saivedant169:fix/issue-32937-mpt-position-ids
@saivedant169

Fixes part of #32937

What does this PR do?

Adds position_ids as an explicit parameter to MptForCausalLM.forward() and MptModel.forward(), bringing MPT in line with other CausalLM models.

Same rationale as the Bloom PR (#44706) — MPT uses ALiBi so position_ids isn't consumed by the attention layer, but it should be in the forward signature rather than silently absorbed through **kwargs.
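The difference between an explicit parameter and one absorbed through **kwargs can be sketched with toy classes. This is an illustration only, not the actual diff: `ToyBackbone`, `ToyCausalLM`, `legacy_forward`, and `accepts_explicitly` are hypothetical names standing in for the real MPT classes, and the bodies are placeholders.

```python
import inspect
from typing import Any, Optional, Sequence


class ToyBackbone:
    """Hypothetical stand-in for a base model such as MptModel."""

    def forward(
        self,
        input_ids: Sequence[int],
        position_ids: Optional[Sequence[int]] = None,
        **kwargs: Any,
    ) -> dict:
        # With ALiBi-style attention the positional bias is derived from
        # token distances, so position_ids is accepted but left unused.
        return {"hidden": list(input_ids), "unexpected_kwargs": sorted(kwargs)}


class ToyCausalLM:
    """Hypothetical stand-in for a head model such as MptForCausalLM."""

    def __init__(self) -> None:
        self.backbone = ToyBackbone()

    def forward(
        self,
        input_ids: Sequence[int],
        position_ids: Optional[Sequence[int]] = None,
        **kwargs: Any,
    ) -> dict:
        # Thread the argument through by name instead of via **kwargs.
        return self.backbone.forward(input_ids, position_ids=position_ids, **kwargs)


def legacy_forward(input_ids: Sequence[int], **kwargs: Any) -> dict:
    """Old pattern: position_ids is silently swallowed by **kwargs."""
    return {"hidden": list(input_ids), "unexpected_kwargs": sorted(kwargs)}


def accepts_explicitly(func: Any, name: str) -> bool:
    """Return True if `name` is a named parameter in the signature of `func`."""
    return name in inspect.signature(func).parameters
```

With the explicit signature, tooling that inspects `forward` can see `position_ids`, and the argument no longer lands in the catch-all dictionary. In this sketch `accepts_explicitly(ToyCausalLM.forward, "position_ids")` is true, while the same check against `legacy_forward` is false.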

Part of the series:

How was it tested?

  • Full MPT test suite: 106 passed, 146 skipped, 0 failures
  • make style clean
python -m pytest tests/models/mpt/test_modeling_mpt.py -v

Coordination

Commented on #32937 here.

Threads position_ids through MptForCausalLM -> MptModel for API
consistency with other CausalLM models. MPT uses ALiBi, so position_ids
is accepted but not consumed by the attention layer — same pattern as
the Bloom PR.

Part of huggingface#32937
@github-actions

[For maintainers] Suggested jobs to run (before merge)

run-slow: mpt
