
[DRAFT] LLM Compressor integration #2299

Draft

idoudali wants to merge 2 commits into main from idoudali/quantization

Conversation


@idoudali idoudali commented Mar 23, 2026

Description

Related issues

Fixes # (issue)

Before submitting

  • This PR only changes documentation. (You can ignore the following checks in that case)
  • Did you read the contributor guideline Pull Request guidelines?
  • Did you link the issue(s) related to this PR in the section above?
  • Did you add / update tests where needed?

Reviewers

At least one review from a member of oumi-ai/oumi-staff is required.


Summary by Gitar

  • Major integration change:
    • Replaced AWQ quantization backend with LLM Compressor (vLLM project), supporting FP8, GPTQ, AWQ, and W-bit schemes
    • Deprecated awq_quantizer.py; added llmcompressor_quantizer.py with oneshot API integration
  • Configuration overhaul:
    • Changed from method (string) to backend + scheme + algorithm (enums) in QuantizationConfig
    • Added calibration dataset control, ignore-layers, and algorithm selection
  • CLI updates:
    • Split --method into --backend, --scheme, --algorithm flags
    • Updated example model IDs and quantization examples
  • Tests:
    • Removed AWQ tests; added LLM Compressor and constants tests
    • Updated builder, BNB, and base quantization tests for new enum-based API

This will update automatically on new commits.

@idoudali idoudali requested a review from oelachqar March 24, 2026 08:17
method: str = "awq_q4_0"
"""Quantization method. AWQ methods (awq_q4_0, awq_q8_0) provide best quality.
Direct GGUF methods (q4_0, q8_0) for llama.cpp. Precision methods (f16, f32)."""
backend: str | None = None
Contributor Author


We would prefer not to expose the backend.

The customer cares more about the format than about which backend is being used.

Keep the naming convention as is for now and infer the backend from the format. Use a prefix like "bnb_" for the old case to avoid introducing more changes.
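The prefix-based inference suggested in this comment could be sketched as follows; the method names and backend identifiers here are assumptions drawn from the docstring in the diff (`awq_q4_0`, `q4_0`, `f16`, etc.), not oumi's actual implementation.

```python
# Illustrative sketch of inferring the backend from the method/format string,
# as suggested in the review comment. Names are hypothetical.
def infer_backend(method: str) -> str:
    """Infer the quantization backend from the method name prefix."""
    if method.startswith("bnb_"):
        # Proposed prefix for the old bitsandbytes path.
        return "bitsandbytes"
    if method.startswith("awq_"):
        # AWQ-style methods route to the new LLM Compressor backend.
        return "llmcompressor"
    # Direct GGUF (q4_0, q8_0) and precision (f16, f32) methods target llama.cpp.
    return "llamacpp"


print(infer_backend("awq_q4_0"))
print(infer_backend("bnb_4bit"))
print(infer_backend("q8_0"))
```

This keeps the user-facing `method` string intact while the backend becomes an internal detail, which is the trade-off the comment argues for.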

@idoudali idoudali force-pushed the idoudali/quantization branch from 015d0f6 to 4a09ae3 Compare March 30, 2026 14:23