Added NVIDIA NIM model support in interpreter/core/llm/llm.py #1728

Open
iam-saiteja wants to merge 4 commits into openinterpreter:main from iam-saiteja:main

Conversation

@iam-saiteja
Contributor

Describe the changes you have made:

  • Added NVIDIA model aliases such as llama-3.1-8b, llama-3.1-70b, nemotron-340b, deepseek-v3, and qwen3-coder
  • Added normalization from nvidia/... to nvidia_nim/...
  • Added NVIDIA NIM provider handling in load()
  • Set default api_base to https://integrate.api.nvidia.com/v1
  • Added env-based API key lookup via NVIDIA_API_KEY or NVIDIA_NIM_API_KEY
  • Added context window defaults for common NVIDIA NIM models, with fallback behavior for unknown models
  • Bounded max_tokens for NVIDIA NIM models using existing llm.py conventions
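The alias and prefix handling described above can be sketched as a small standalone helper. The `NVIDIA_MODEL_ALIASES` entries match the diff; `resolve_nvidia_model()` is illustrative, not the exact code from llm.py.

```python
# Module-level alias map, matching entries shown later in the review thread.
NVIDIA_MODEL_ALIASES = {
    "llama-3.1-8b": "nvidia_nim/meta/llama-3.1-8b-instruct",
    "llama-3.1-70b": "nvidia_nim/meta/llama-3.1-70b-instruct",
    "nemotron-340b": "nvidia_nim/nvidia/nemotron-4-340b-instruct",
}


def resolve_nvidia_model(model: str) -> str:
    """Resolve shorthand aliases, then normalize nvidia/... to nvidia_nim/...

    Illustrative helper: the PR applies the same logic inside llm.py.
    """
    if model in NVIDIA_MODEL_ALIASES:
        return NVIDIA_MODEL_ALIASES[model]
    if model.startswith("nvidia/"):
        return "nvidia_nim/" + model[len("nvidia/"):]
    return model
```

All three input forms from the validation section below resolve to the same provider-prefixed model string.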

Validation

Tested locally on Windows with:

  • alias model: llama-3.1-8b
  • normalized prefix: nvidia/meta/llama-3.1-8b-instruct
  • direct provider model: nvidia_nim/meta/llama-3.1-8b-instruct

All three resolved successfully to nvidia_nim/meta/llama-3.1-8b-instruct and returned a valid response.

Reference any relevant issues (e.g. "Fixes #000"):

N/A

Pre-Submission Checklist (optional but appreciated):

  • I have included relevant documentation updates (stored in /docs)
  • I have read docs/CONTRIBUTING.md
  • I have read docs/ROADMAP.md

OS Tests (optional but appreciated):

  • Tested on Windows
  • Tested on MacOS
  • Tested on Linux

Copilot AI review requested due to automatic review settings April 15, 2026 16:28

Copilot AI left a comment


Pull request overview

Adds first-class NVIDIA NIM support to the LLM routing layer, including shorthand model aliases, nvidia/... → nvidia_nim/... normalization, and default NVIDIA NIM configuration (API base, env-based API key lookup, and context/max-token defaults).

Changes:

  • Add NVIDIA NIM model alias resolution and nvidia/ prefix normalization.
  • Add NVIDIA NIM-specific load() handling (default api_base, env var API key lookup, context window + max_tokens defaults).
  • Add documentation page + navigation entry for NVIDIA NIM.
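The load()-time defaults summarized above can be sketched as follows. The API base URL comes from the PR; `NVIDIA_CONTEXT_WINDOWS`, the fallback window, and the max_tokens bound are assumed values for illustration, not the exact numbers from the diff.

```python
NVIDIA_NIM_API_BASE = "https://integrate.api.nvidia.com/v1"

# Assumed context windows for illustration; the diff carries its own table.
NVIDIA_CONTEXT_WINDOWS = {
    "nvidia_nim/meta/llama-3.1-8b-instruct": 128_000,
}
DEFAULT_CONTEXT_WINDOW = 8_192  # assumed fallback for unknown models


def apply_nim_defaults(model, api_base=None, context_window=None, max_tokens=None):
    """Fill in NVIDIA NIM defaults for any setting the user left unset."""
    # Only touch api_base for nvidia_nim/ models, so other providers
    # are unaffected (the conditional discussed later in this thread).
    if model.startswith("nvidia_nim/") and api_base is None:
        api_base = NVIDIA_NIM_API_BASE
    if context_window is None:
        context_window = NVIDIA_CONTEXT_WINDOWS.get(model, DEFAULT_CONTEXT_WINDOW)
    if max_tokens is None:
        # Bound completion length relative to the context window
        # (assumed convention for this sketch).
        max_tokens = min(4_096, context_window // 4)
    return api_base, context_window, max_tokens
```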

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Files changed:

  • interpreter/core/llm/llm.py: Implements NVIDIA NIM routing/normalization and default settings in load() (plus alias handling in run()).
  • docs/mint.json: Adds NVIDIA NIM docs page to the hosted-providers nav.
  • docs/language-models/hosted-models/nvidia-nim.mdx: Documents usage, aliases, and required env vars for NVIDIA NIM.


Comment thread interpreter/core/llm/llm.py Outdated
Comment on lines +120 to +134
NVIDIA_MODEL_ALIASES = {
    "llama-3.1-8b": "nvidia_nim/meta/llama-3.1-8b-instruct",
    "llama-3.1-70b": "nvidia_nim/meta/llama-3.1-70b-instruct",
    "llama-3.1-405b": "nvidia_nim/meta/llama-3.1-405b-instruct",
    "llama-3.3-70b": "nvidia_nim/meta/llama-3.3-70b-instruct",
    "llama-4-maverick": "nvidia_nim/meta/llama-4-maverick-17b-128e-instruct",
    "nemotron-70b": "nvidia_nim/nvidia/llama-3.1-nemotron-70b-instruct",
    "nemotron-ultra": "nvidia_nim/nvidia/llama-3.1-nemotron-ultra-253b-v1",
    "nemotron-340b": "nvidia_nim/nvidia/nemotron-4-340b-instruct",
    "deepseek-v3": "nvidia_nim/deepseek-ai/deepseek-v3.2",
    "qwen3-coder": "nvidia_nim/qwen/qwen3-coder-480b-a35b-instruct",
}
if model in NVIDIA_MODEL_ALIASES:
    model = NVIDIA_MODEL_ALIASES[model]
self.model = model

Copilot AI Apr 15, 2026


NVIDIA_MODEL_ALIASES is defined inline inside run(); the same mapping is duplicated again in load(). This creates maintenance risk (the two copies can drift) and adds unnecessary per-call dict allocation. Consider moving the alias map (and any prefix normalization) to a single module-level constant/helper, and applying it in one place (typically load()).


if self.context_window == None:
    self.context_window = model_context_window
if self.max_tokens is None:

Copilot AI Apr 15, 2026


This file consistently uses == None / != None checks (e.g., self.context_window == None, self.max_tokens == None), but this new block uses is None for self.max_tokens. Please make the None-check style consistent within the file (either switch this to match the existing == None usage, or update the other occurrences to is None).

Suggested change:
- if self.max_tokens is None:
+ if self.max_tokens == None:

Comment thread interpreter/core/llm/llm.py Outdated
Comment on lines +360 to +374
NVIDIA_MODEL_ALIASES = {
    "llama-3.1-8b": "nvidia_nim/meta/llama-3.1-8b-instruct",
    "llama-3.1-70b": "nvidia_nim/meta/llama-3.1-70b-instruct",
    "llama-3.1-405b": "nvidia_nim/meta/llama-3.1-405b-instruct",
    "llama-3.3-70b": "nvidia_nim/meta/llama-3.3-70b-instruct",
    "llama-4-maverick": "nvidia_nim/meta/llama-4-maverick-17b-128e-instruct",
    "nemotron-70b": "nvidia_nim/nvidia/llama-3.1-nemotron-70b-instruct",
    "nemotron-ultra": "nvidia_nim/nvidia/llama-3.1-nemotron-ultra-253b-v1",
    "nemotron-340b": "nvidia_nim/nvidia/nemotron-4-340b-instruct",
    "deepseek-v3": "nvidia_nim/deepseek-ai/deepseek-v3.2",
    "qwen3-coder": "nvidia_nim/qwen/qwen3-coder-480b-a35b-instruct",
}
if self.model in NVIDIA_MODEL_ALIASES:
    self.model = NVIDIA_MODEL_ALIASES[self.model]


Copilot AI Apr 15, 2026


NVIDIA_MODEL_ALIASES is duplicated here and in run(). To avoid future drift (e.g., docs or alias updates only made in one place), consider defining the mapping once (module-level constant) and reusing it from both call sites, or removing one of the two normalization paths.

@endolith
Contributor

  • Set default api_base to https://integrate.api.nvidia.com/v1

This should only happen when self.model.startswith("nvidia_nim/"); otherwise it overrides the api_base for other providers.

- Move NVIDIA_MODEL_ALIASES to module-level constant (no duplication)
- Add NVIDIA_CONTEXT_WINDOWS and NVIDIA_NIM_API_BASE constants
- Create _get_nvidia_api_key() helper function
- Fix API base to only set for nvidia_nim/ models
- Make None checks consistent (== None style)
- Optimize performance by eliminating dict recreation
@iam-saiteja
Contributor Author

All Issues Fixed

1. Duplicated NVIDIA_MODEL_ALIASES

  • Moved to single module-level constant NVIDIA_MODEL_ALIASES
  • Removed duplicate definitions from both run() and load() methods
  • Used by both methods via the same constant

2. Inconsistent None-check style

  • Changed is None to == None in NVIDIA block to match file style
  • All None checks now consistent throughout the file

3. API base conditional logic

  • Fixed: if self.model.startswith("nvidia_nim/") and not self.api_base:
  • Now only sets https://integrate.api.nvidia.com/v1 for nvidia_nim/ models
  • Won't affect other providers

Additional optimizations:

  • Added NVIDIA_CONTEXT_WINDOWS module constant (eliminates dict recreation)
  • Added NVIDIA_NIM_API_BASE constant for maintainability
  • Created _get_nvidia_api_key() helper function
  • All changes tested and verified working

@endolith
Contributor

Changed is None to == None in NVIDIA block to match file style

PEP8 says "comparisons to singletons like None should always be done with is or is not, never the equality operators."
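The distinction matters because `==` dispatches to a class's `__eq__`, which can return anything, while `is` checks object identity and cannot be fooled. A minimal illustration:

```python
class AlwaysEqual:
    """A class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


x = AlwaysEqual()
print(x == None)  # True, even though x is clearly not None
print(x is None)  # False: identity comparison is reliable
```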

Change == None to is None for singleton comparisons as per PEP8
@iam-saiteja
Contributor Author

Thanks for catching that! I've fixed the None comparisons in the NVIDIA block to use is None instead of == None to follow PEP8 guidelines. The changes are now pushed to the branch.
