
feat(cli): implement dynamic model discovery for NVIDIA #8820

Draft
Varuu-0 wants to merge 3 commits into Kilo-Org:main from Varuu-0:feat/nvidia-dynamic-discovery

Conversation

Contributor

@Varuu-0 Varuu-0 commented Apr 13, 2026

PR: feat(cli): implement dynamic model discovery for NVIDIA

Context

Previously, the NVIDIA provider relied on a static snapshot from model.dev, which was frequently incomplete and required manual updates. This PR introduces dynamic model discovery by querying NVIDIA's integration API directly, so every available NVIDIA NIM is accessible as soon as it is published. Additionally, all models on build.nvidia.com are now explicitly marked as free for development, with a note about the 40 req/min rate limit.

Implementation

  • Dynamic Discovery: Added an nvidia provider in packages/opencode/src/provider/provider.ts that implements discoverModels.
  • API Integration: Fetches https://integrate.api.nvidia.com/v1/models with native fetch, a 10s timeout, and the CLI's Installation.USER_AGENT header.
  • Cost & Status: Sets isFree: true and zero cost for all NVIDIA models to reflect their free development status on the NVIDIA platform.
  • Rate Limits: Notes that these models typically carry a 40 req/min rate limit on the build platform.
  • Metadata Resilience: When a discovered model also exists in the static snapshot, its snapshot metadata is preserved and merged with the dynamic result.
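The merge behavior in the last bullet can be sketched as a small pure function. This is an illustrative sketch, not the PR's actual code: `StaticModel`, `mergeDiscovered`, and the simplified cost shape are assumptions standing in for the real Model type.

```typescript
// Sketch of the snapshot-merge behavior described above (simplified types,
// not the PR's actual Model/ProviderID definitions).
type Cost = { input: number; output: number; cache: { read: number; write: number } }
type StaticModel = { id: string; name: string; cost: Cost; isFree?: boolean }

const FREE_COST: Cost = { input: 0, output: 0, cache: { read: 0, write: 0 } }

// For each ID returned by the discovery endpoint, prefer the static snapshot's
// metadata when present, but always force zero cost and isFree: true.
function mergeDiscovered(
  discoveredIds: string[],
  snapshot: Record<string, StaticModel>,
): Record<string, StaticModel> {
  const out: Record<string, StaticModel> = {}
  for (const id of discoveredIds) {
    const known = snapshot[id]
    out[id] = known
      ? { ...known, cost: FREE_COST, isFree: true }
      : { id, name: id.split("/")[1]?.replace(/-/g, " ") ?? id, cost: FREE_COST, isFree: true }
  }
  return out
}
```

Spreading the snapshot entry first and then overwriting cost and isFree guarantees that stale pricing in the static data can never leak through for NVIDIA models.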

Implementation Comparison

Before

N/A: The NVIDIA provider was not explicitly defined in the dynamic discovery mapping and relied entirely on static snapshots.

After

// kilocode_change start - NVIDIA dynamic model discovery
nvidia: async (input) => {
  return {
    autoload: true,
    options: input.options,
    async discoverModels(): Promise<Record<string, Model>> {
      try {
        const response = await fetch("https://integrate.api.nvidia.com/v1/models", {
          headers: { "User-Agent": Installation.USER_AGENT },
          signal: AbortSignal.timeout(10000),
        })

        if (!response.ok) return {}

        const { data } = (await response.json()) as {
          data: Array<{ id: string; owned_by: string; created: number }>
        }
        const models: Record<string, Model> = {}

        for (const item of data) {
          if (input.models[item.id]) {
            models[item.id] = {
              ...input.models[item.id],
              id: ModelID.make(item.id),
              providerID: ProviderID.make("nvidia"),
              cost: { input: 0, output: 0, cache: { read: 0, write: 0 } },
              isFree: true,
            }
          } else {
            models[item.id] = {
              id: ModelID.make(item.id),
              providerID: ProviderID.make("nvidia"),
              name: item.id.split("/")[1]?.replace(/-/g, " ") || item.id,
              family: item.owned_by,
              status: "beta",
              isFree: true,
              cost: { input: 0, output: 0, cache: { read: 0, write: 0 } },
              limit: { context: 128000, output: 4096 },
              capabilities: {
                temperature: true,
                reasoning: item.id.includes("thinking"),
                attachment: item.id.includes("vision") || item.id.includes("vl"),
                toolcall: true,
                input: {
                  text: true,
                  image: item.id.includes("vl") || item.id.includes("vision"),
                  audio: false,
                  video: false,
                  pdf: false,
                },
                output: { text: true, image: false, audio: false, video: false, pdf: false },
                interleaved: false,
              },
              release_date: new Date(item.created * 1000).toISOString().split("T")[0],
              api: { id: item.id, url: "https://integrate.api.nvidia.com/v1", npm: "@ai-sdk/openai-compatible" },
              headers: {},
              options: {},
              variants: {},
            }
          }
        }

        return models
      } catch {
        return {}
      }
    },
  }
},
// kilocode_change end

How to Test

  1. Run kilo models and verify that NVIDIA models are dynamically listed.
  2. Verify that models with "thinking" or "vision" in their ID have the correct capabilities enabled.
  3. Inspect model details to confirm isFree: true and zero cost are correctly assigned.
  4. Select an NVIDIA model (e.g. nvidia/llama-3.1-405b-instruct) and verify it uses the @ai-sdk/openai-compatible adapter.
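The heuristics behind step 2 can also be exercised in isolation. The sketch below extracts the ID-based capability and release-date logic from the diff into standalone functions (the example model IDs in the comments are hypothetical):

```typescript
// ID-based capability heuristic for models absent from the static snapshot:
// "thinking" implies reasoning, "vision"/"vl" implies image input.
function inferCapabilities(modelId: string): { reasoning: boolean; image: boolean } {
  return {
    reasoning: modelId.includes("thinking"),
    image: modelId.includes("vision") || modelId.includes("vl"),
  }
}

// Epoch seconds → "YYYY-MM-DD", matching the release_date derivation in the diff.
function releaseDate(createdEpochSeconds: number): string {
  return new Date(createdEpochSeconds * 1000).toISOString().split("T")[0]
}
```

Because these are substring checks, a model like a hypothetical vendor/foo-vl-7b is flagged as vision-capable while vendor/foo-7b is not, which is exactly what step 2 asks you to verify in the model list.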

Get in Touch

@varuu

@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

return {
autoload: true,
options: input.options,
async discoverModels(): Promise<Record<string, Model>> {

CRITICAL: discoverModels is never invoked for NVIDIA

state() only executes discoveryLoaders for gitlab (packages/opencode/src/provider/provider.ts:1305), so this callback is registered but never run. As written, the provider will still expose only the static snapshot, and none of the dynamically discovered NVIDIA models from this PR will ever be added.
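One possible shape of the fix is to iterate every registered discovery loader instead of hard-coding gitlab. This is a simplified sketch, not the file's actual code: the `discoveryLoaders` record and `runAllDiscovery` function are assumptions modeling the wiring the review comment describes.

```typescript
// Simplified model of the registry: each provider may register an async
// discovery loader; state() should run all of them, not just gitlab's.
type Model = { id: string }
type DiscoveryLoader = () => Promise<Record<string, Model>>

const discoveryLoaders: Record<string, DiscoveryLoader> = {
  gitlab: async () => ({ "gitlab/duo": { id: "gitlab/duo" } }),
  nvidia: async () => ({ "nvidia/llama-3.1-405b-instruct": { id: "nvidia/llama-3.1-405b-instruct" } }),
}

// Instead of invoking only the gitlab loader, iterate every entry so a newly
// registered provider (like nvidia) is picked up automatically. A failing
// loader degrades to an empty result rather than breaking the others.
async function runAllDiscovery(): Promise<Record<string, Record<string, Model>>> {
  const results: Record<string, Record<string, Model>> = {}
  for (const [providerID, loader] of Object.entries(discoveryLoaders)) {
    results[providerID] = await loader().catch(() => ({}))
  }
  return results
}
```

With this shape, registering the nvidia callback from the PR would be sufficient on its own, and no further state() changes would be needed for future providers.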

@kilo-code-bot
Contributor

kilo-code-bot bot commented Apr 13, 2026

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Files Reviewed (2 files)
  • packages/kilo-docs/source-links.md
  • packages/opencode/src/provider/provider.ts

Reviewed by gpt-5.4-20260305 · 796,134 tokens

@Varuu-0 Varuu-0 marked this pull request as draft April 13, 2026 06:38