Added NVIDIA NIM model support in interpreter/core/llm/llm.py #1728

iam-saiteja wants to merge 4 commits into openinterpreter:main

Conversation
Pull request overview
Adds first-class NVIDIA NIM support to the LLM routing layer, including shorthand model aliases, nvidia/... → nvidia_nim/... normalization, and default NVIDIA NIM configuration (API base, env-based API key lookup, and context/max token defaults).
Changes:
- Add NVIDIA NIM model alias resolution and nvidia/ prefix normalization.
- Add NVIDIA NIM-specific load() handling (default api_base, env var API key lookup, context window + max_tokens defaults).
- Add documentation page + navigation entry for NVIDIA NIM.
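The load() handling described above might look roughly like the sketch below. Attribute names on the `llm` object, the helper name `apply_nvidia_nim_defaults`, and the specific context window / max_tokens defaults are assumptions for illustration; only the API base URL and the two env var names come from the PR description.

```python
import os

NVIDIA_NIM_API_BASE = "https://integrate.api.nvidia.com/v1"  # from the PR description

def apply_nvidia_nim_defaults(llm):
    """Sketch of the NIM-specific branch in load(); `llm` stands in for the Llm instance."""
    if not llm.model.startswith("nvidia_nim/"):
        return
    if llm.api_base is None:
        llm.api_base = NVIDIA_NIM_API_BASE
    if llm.api_key is None:
        # Either env var is accepted, per the PR description.
        llm.api_key = os.environ.get("NVIDIA_API_KEY") or os.environ.get("NVIDIA_NIM_API_KEY")
    if llm.context_window is None:
        llm.context_window = 128000  # illustrative default, not from the PR
    if llm.max_tokens is None:
        llm.max_tokens = 4096  # illustrative default, not from the PR
```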
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| interpreter/core/llm/llm.py | Implements NVIDIA NIM routing/normalization and default settings in load() (plus alias handling in run()). |
| docs/mint.json | Adds NVIDIA NIM docs page to hosted providers nav. |
| docs/language-models/hosted-models/nvidia-nim.mdx | Documents usage, aliases, and required env vars for NVIDIA NIM. |
```python
NVIDIA_MODEL_ALIASES = {
    "llama-3.1-8b": "nvidia_nim/meta/llama-3.1-8b-instruct",
    "llama-3.1-70b": "nvidia_nim/meta/llama-3.1-70b-instruct",
    "llama-3.1-405b": "nvidia_nim/meta/llama-3.1-405b-instruct",
    "llama-3.3-70b": "nvidia_nim/meta/llama-3.3-70b-instruct",
    "llama-4-maverick": "nvidia_nim/meta/llama-4-maverick-17b-128e-instruct",
    "nemotron-70b": "nvidia_nim/nvidia/llama-3.1-nemotron-70b-instruct",
    "nemotron-ultra": "nvidia_nim/nvidia/llama-3.1-nemotron-ultra-253b-v1",
    "nemotron-340b": "nvidia_nim/nvidia/nemotron-4-340b-instruct",
    "deepseek-v3": "nvidia_nim/deepseek-ai/deepseek-v3.2",
    "qwen3-coder": "nvidia_nim/qwen/qwen3-coder-480b-a35b-instruct",
}
if model in NVIDIA_MODEL_ALIASES:
    model = NVIDIA_MODEL_ALIASES[model]
    self.model = model
```
NVIDIA_MODEL_ALIASES is defined inline inside run(); the same mapping is duplicated again in load(). This creates maintenance risk (the two copies can drift) and adds unnecessary per-call dict allocation. Consider moving the alias map (and any prefix normalization) to a single module-level constant/helper, and applying it in one place (typically load()).
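One way to implement the reviewer's suggestion is a single module-level constant plus a small resolver applied in one place. This is a sketch: the helper name `resolve_nvidia_model` is hypothetical, and the map is abbreviated to two entries from the PR.

```python
# Module-level: built once, shared by run() and load() instead of duplicated inline.
NVIDIA_MODEL_ALIASES = {
    "llama-3.1-8b": "nvidia_nim/meta/llama-3.1-8b-instruct",
    "nemotron-70b": "nvidia_nim/nvidia/llama-3.1-nemotron-70b-instruct",
    # ... remaining aliases from the PR ...
}

def resolve_nvidia_model(model: str) -> str:
    """Resolve shorthand aliases, then normalize a nvidia/ prefix to nvidia_nim/."""
    model = NVIDIA_MODEL_ALIASES.get(model, model)
    if model.startswith("nvidia/"):
        model = "nvidia_nim/" + model[len("nvidia/"):]
    return model
```

Calling this once in load() keeps run() free of normalization logic and eliminates the per-call dict allocation the review mentions.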
```python
if self.context_window == None:
    self.context_window = model_context_window
if self.max_tokens is None:
```
This file consistently uses == None / != None checks (e.g., self.context_window == None, self.max_tokens == None), but this new block uses is None for self.max_tokens. Please make the None-check style consistent within the file (either switch this to match the existing == None usage, or update the other occurrences to is None).
```diff
- if self.max_tokens is None:
+ if self.max_tokens == None:
```
```python
NVIDIA_MODEL_ALIASES = {
    "llama-3.1-8b": "nvidia_nim/meta/llama-3.1-8b-instruct",
    "llama-3.1-70b": "nvidia_nim/meta/llama-3.1-70b-instruct",
    "llama-3.1-405b": "nvidia_nim/meta/llama-3.1-405b-instruct",
    "llama-3.3-70b": "nvidia_nim/meta/llama-3.3-70b-instruct",
    "llama-4-maverick": "nvidia_nim/meta/llama-4-maverick-17b-128e-instruct",
    "nemotron-70b": "nvidia_nim/nvidia/llama-3.1-nemotron-70b-instruct",
    "nemotron-ultra": "nvidia_nim/nvidia/llama-3.1-nemotron-ultra-253b-v1",
    "nemotron-340b": "nvidia_nim/nvidia/nemotron-4-340b-instruct",
    "deepseek-v3": "nvidia_nim/deepseek-ai/deepseek-v3.2",
    "qwen3-coder": "nvidia_nim/qwen/qwen3-coder-480b-a35b-instruct",
}
if self.model in NVIDIA_MODEL_ALIASES:
    self.model = NVIDIA_MODEL_ALIASES[self.model]
```
NVIDIA_MODEL_ALIASES is duplicated here and in run(). To avoid future drift (e.g., docs or alias updates only made in one place), consider defining the mapping once (module-level constant) and reusing it from both call sites, or removing one of the two normalization paths.
- Move NVIDIA_MODEL_ALIASES to module-level constant (no duplication)
- Add NVIDIA_CONTEXT_WINDOWS and NVIDIA_NIM_API_BASE constants
- Create _get_nvidia_api_key() helper function
- Fix API base to only set for nvidia_nim/ models
- Make None checks consistent (== None style)
- Optimize performance by eliminating dict recreation
All Issues Fixed

1. Duplicated NVIDIA_MODEL_ALIASES
2. Inconsistent None-check style
3. API base conditional logic

Additional optimizations:
PEP8 says "comparisons to singletons like None should always be done with is or is not, never the equality operators."
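The distinction is more than style: `==` dispatches to `__eq__`, which a class can override, while `is` checks object identity and cannot be fooled. A minimal illustration (the `AlwaysEqual` class is contrived for the demo):

```python
class AlwaysEqual:
    """Pathological __eq__ that claims equality with everything."""
    def __eq__(self, other):
        return True

obj = AlwaysEqual()
print(obj == None)  # True — __eq__ is consulted and lies
print(obj is None)  # False — identity check is reliable
```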
Change == None to is None for singleton comparisons as per PEP8
Thanks for catching that! I've fixed the None comparisons in the NVIDIA block to use `is None`.
Describe the changes you have made:

- Added shorthand model aliases such as llama-3.1-8b, llama-3.1-70b, nemotron-340b, deepseek-v3, and qwen3-coder
- Normalized nvidia/... prefixes to nvidia_nim/...
- In load(): default api_base to https://integrate.api.nvidia.com/v1, API key lookup from NVIDIA_API_KEY or NVIDIA_NIM_API_KEY, and context window / max_tokens defaults for NVIDIA NIM models using existing llm.py conventions

Validation

Tested locally on Windows with:

- llama-3.1-8b
- nvidia/meta/llama-3.1-8b-instruct
- nvidia_nim/meta/llama-3.1-8b-instruct

All three resolved successfully to nvidia_nim/meta/llama-3.1-8b-instruct and returned a valid response.

Reference any relevant issues (e.g. "Fixes #000"):

NA
Pre-Submission Checklist (optional but appreciated):

- docs/CONTRIBUTING.md
- docs/ROADMAP.md

OS Tests (optional but appreciated):