vllm-serve

Here are 35 public repositories matching this topic...

agentsculptor is an experimental AI-powered development agent designed to analyze, refactor, and extend Python projects automatically. It uses an OpenAI-like planner–executor loop on top of a vLLM backend, combining project context analysis, structured tool calls, and iterative refinement. It has only been tested with gpt-oss-120b via vLLM.

  • Updated Sep 17, 2025
  • Python
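
The planner–executor loop mentioned in the description can be illustrated with a minimal sketch. This is not agentsculptor's code: `stub_planner`, `TOOLS`, and `run_agent` are hypothetical names, and the planner here is a hard-coded stand-in for the model call that a real agent would send to a vLLM backend.

```python
from typing import Callable

# Hypothetical tools; a real agent would read and edit project files instead.
def count_lines(text: str) -> str:
    return str(len(text.splitlines()))

def upper(text: str) -> str:
    return text.upper()

TOOLS: dict[str, Callable[[str], str]] = {"count_lines": count_lines, "upper": upper}

def stub_planner(goal: str, history: list[dict]) -> dict:
    """Stand-in for the model call: returns the next structured tool call."""
    if not history:
        return {"tool": "upper", "input": goal}
    if len(history) == 1:
        return {"tool": "count_lines", "input": history[-1]["output"]}
    return {"tool": None, "input": ""}  # signals that the plan is complete

def run_agent(goal: str, planner=stub_planner, max_steps: int = 5) -> list[dict]:
    """Ask the planner for a tool call, execute it, and feed the result back."""
    history: list[dict] = []
    for _ in range(max_steps):
        call = planner(goal, history)
        if call["tool"] is None:
            break
        output = TOOLS[call["tool"]](call["input"])
        history.append({**call, "output": output})
    return history
```

The `max_steps` cap is a design choice of this sketch: it bounds the iterative refinement so that a planner that never signals completion cannot loop forever.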

This is a multi-tenant large language model gateway built on vLLM. It is designed to serve many concurrent requests from many users. It uses vLLM as its inference engine, with a scheduler that orders user queries and a limiter that caps each user's usage. It also uses LoRA adapters in vLLM.

  • Updated Mar 5, 2026
  • Jupyter Notebook
