Add new deploy model cases 0426 #1392

Merged
kevalmorabia97 merged 6 commits into NVIDIA:main from noeyy-mino:add-new-deploy-model-cases-0426
May 5, 2026

Conversation

nvSiruiW (Contributor) commented May 5, 2026

Add new deploy model cases

Summary by CodeRabbit

  • Dependencies

    • Constrained the Transformers dependency to >=4.56,<5.6, adding an upper bound that excludes Transformers 5.6 and later
  • Tests

    • Expanded deployment test coverage by updating the model parametrizations for Qwen, Gemma, Nemotron, and Eagle3
    • Added new deployment test functions for GLM and MiniMax models

nvSiruiW requested a review from a team as a code owner on May 5, 2026, 02:45
nvSiruiW requested a review from kevalmorabia97 on May 5, 2026, 02:45
copy-pr-bot (Bot) commented May 5, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai (Bot, Contributor) commented May 5, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: aa4f55a5-c2eb-4fc2-99eb-444aab74f03f

📥 Commits

Reviewing files that changed from the base of the PR and between 70546bd and 01baba7.

📒 Files selected for processing (2)
  • pyproject.toml
  • tests/examples/llm_ptq/test_deploy.py

📝 Walkthrough

This PR narrows the Transformers dependency constraint in the hf optional extra from >=4.56 to >=4.56,<5.6, and expands test coverage in the LLM PTQ deployment test suite by adding two new test functions (test_glm, test_minimax) and updating model parametrizations for Qwen, Gemma, Nemotron, and Eagle3.

Changes

Dependency Constraint Update

| Layer / File(s) | Summary |
| --- | --- |
| Version Bounds: pyproject.toml | Transformers dependency is restricted to >=4.56,<5.6 in the hf optional extra to exclude Transformers 5.6+. |
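
To make the effect of the `>=4.56,<5.6` bound concrete, here is a minimal Python sketch using only the standard library. It compares plain numeric release tuples and ignores pre-release and post-release suffixes, so it is a simplification of PEP 440 matching and not how pip itself resolves the constraint. The helper names `parse_release` and `in_supported_range` are hypothetical and do not come from the PR.

```python
def parse_release(version: str) -> tuple[int, ...]:
    """Turn '4.56.2' into (4, 56, 2). Only plain numeric releases are handled."""
    return tuple(int(part) for part in version.split("."))


# Bounds taken from the constraint described in the walkthrough: >=4.56,<5.6
LOWER = (4, 56)  # inclusive lower bound
UPPER = (5, 6)   # exclusive upper bound


def in_supported_range(version: str) -> bool:
    """Return True if a numeric release satisfies >=4.56,<5.6."""
    release = parse_release(version)
    return LOWER <= release < UPPER


for candidate in ("4.55.4", "4.56.0", "5.5.1", "5.6.0"):
    print(candidate, in_supported_range(candidate))
```

Under these assumptions, 4.56.0 and 5.5.1 are accepted, while 4.55.4 and 5.6.0 fall outside the range.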

Test Parametrization Expansion

| Layer / File(s) | Summary |
| --- | --- |
| Model Parametrizations: tests/examples/llm_ptq/test_deploy.py | test_qwen, test_gemma, test_llama_nemotron, and test_eagle parametrizations are updated with new or replacement model entries (Qwen3-VL-235B, Gemma-4-31B-IT, Nemotron-3-Super-120B, Kimi K2/K2.5 Eagle3 variants) with adjusted tensor parallelism and attention backend settings. |
| New Test Functions: tests/examples/llm_ptq/test_deploy.py | test_glm() and test_minimax() functions are added, each parametrizing deployments for NVIDIA GLM and MiniMax models with multi-backend support (trtllm, vllm, sglang) and TP=8. |
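
The multi-backend parametrization described above can be sketched as a cross product of models and backends at a fixed tensor-parallel size. This is an illustrative Python sketch only: `DeployCase`, `build_cases`, `case_id`, and the model name are hypothetical placeholders, and the actual helpers and model entries in tests/examples/llm_ptq/test_deploy.py may look quite different.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class DeployCase:
    model_id: str              # placeholder identifier, not a real checkpoint name
    backend: str               # one of the backends named in the walkthrough
    tensor_parallel_size: int  # TP degree used for the deployment


BACKENDS = ("trtllm", "vllm", "sglang")


def build_cases(model_ids, backends=BACKENDS, tensor_parallel_size=8):
    """Cross every model with every backend at a fixed tensor-parallel size."""
    return [
        DeployCase(model_id, backend, tensor_parallel_size)
        for model_id, backend in product(model_ids, backends)
    ]


def case_id(case: DeployCase) -> str:
    """Build a readable identifier for one case."""
    return f"{case.model_id}-{case.backend}-tp{case.tensor_parallel_size}"


# Placeholder model name; the real entries live in the test file.
GLM_CASES = build_cases(["example-org/glm-placeholder"])
print([case_id(case) for case in GLM_CASES])
```

With pytest, a list like `GLM_CASES` could be passed to `pytest.mark.parametrize("case", GLM_CASES, ids=case_id)` so that each model and backend pair runs and reports as its own test.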

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~18 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'Add new deploy model cases 0426' is vague and generic, using a date stamp instead of describing the specific models or changes being added. | Use a more descriptive title that specifies the primary change, such as 'Add deployment test cases for Qwen3-VL, Gemma-4, and Nemotron models' or similar. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Security Anti-Patterns | ✅ Passed | No security anti-patterns detected in the PR changes. Transformers is Apache 2.0 licensed. |

Comment thread pyproject.toml Outdated
Comment thread pyproject.toml Outdated
Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>
codecov (Bot) commented May 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 75.37%. Comparing base (70546bd) to head (0f45031).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1392      +/-   ##
==========================================
- Coverage   76.58%   75.37%   -1.22%     
==========================================
  Files         471      476       +5     
  Lines       50565    51404     +839     
==========================================
+ Hits        38726    38744      +18     
- Misses      11839    12660     +821     
Flag Coverage Δ
unit 52.31% <ø> (-0.49%) ⬇️
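
The percentages in the diff above follow from the Hits and Lines rows. The Python sketch below recomputes them, assuming values are rounded down to two decimals and that the delta is taken from the unrounded ratios before rounding; both assumptions are consistent with the figures shown here but are not stated in the report itself. The function names are hypothetical.

```python
import math


def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage, rounded down to two decimals."""
    return math.floor(hits / lines * 10_000) / 100


def coverage_delta(base_hits: int, base_lines: int,
                   head_hits: int, head_lines: int) -> float:
    """Head minus base coverage, from unrounded ratios, then rounded down."""
    raw = (head_hits / head_lines - base_hits / base_lines) * 100
    return math.floor(raw * 100) / 100


# Figures from the Hits and Lines rows of the report.
base = coverage_percent(38726, 50565)
head = coverage_percent(38744, 51404)
delta = coverage_delta(38726, 50565, 38744, 51404)
misses_added = (51404 - 38744) - (50565 - 38726)

print(base, head, delta, misses_added)
```

The drop comes almost entirely from the denominator: 839 lines were added to the tracked total, but only 18 of them count as hits, leaving 821 new misses.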

kevalmorabia97 merged commit d1ed76d into NVIDIA:main on May 5, 2026
21 of 24 checks passed

Labels

None yet

Projects

None yet

2 participants