Add batch size and gradient accumulation parameters to quantization scripts #2456

Open

xin3he wants to merge 1 commit into master from xinhe/4-28
Conversation

Contributor

@xin3he xin3he commented Apr 28, 2026

Type of Change

example update

Description

Reduce peak memory usage so quantization can run on low-memory GPUs.

Expected Behavior & Potential Risk

The expected behavior triggered by this PR.

How has this PR been tested?

how to reproduce the test (including hardware information)

Dependency Change?

any library dependency introduced or removed

Commit: Add batch size and gradient accumulation parameters to quantization scripts

Signed-off-by: Xin He <xin3.he@intel.com>
Contributor Author

xin3he commented Apr 28, 2026

bash run_quant.sh --topology=Llama-3.3-70B --dtype=mxfp4_mixed --input_model=meta-llama/Llama-3.3-70B-Instruct --output_model=./Llama-3.3-70B_mxfp4_mixed --gradient_accumulate_steps=2
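The `--gradient_accumulate_steps` flag in the command above trades extra forward/backward passes for lower peak memory: instead of back-propagating one large batch, the workload is split into micro-batches whose gradients are summed before a single update. A minimal, self-contained sketch of the idea, with illustrative names that are not the actual `run_quant.sh` implementation:

```python
def accumulate_gradients(micro_batches, compute_grad):
    """Average gradients over micro-batches before one parameter update.

    Only one micro-batch's activations need to be resident at a time,
    which is what lowers peak GPU memory relative to one big batch.
    """
    total = None
    for batch in micro_batches:
        g = compute_grad(batch)  # gradient for this micro-batch only
        total = g if total is None else [a + b for a, b in zip(total, g)]
    n = len(micro_batches)
    return [a / n for a in total]


# An effective batch of 4 samples processed as 2 micro-batches of 2,
# with a stand-in gradient function for illustration:
grads = accumulate_gradients(
    [[1.0, 2.0], [3.0, 4.0]],
    lambda batch: [x * 0.1 for x in batch],
)
```

With `--gradient_accumulate_steps=2` the effective batch statistics stay the same while per-step memory roughly halves, which is why the example command targets low-memory GPUs.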
