Pinned
- llama.cpp-1-bit-turbo (Public, forked from ggml-org/llama.cpp)
  HIP/ROCm fork of llama.cpp with PrismML Bonsai Q1_0 and Q1_0_G128 1-bit GPU inference, TurboQuant TQ3_0 KV cache, and gfx1030/RDNA2 hardening.
  C++ · 3
- sglang-1-bit-turbo (Public, forked from sgl-project/sglang)
  ROCm/HIP fork of SGLang with TurboQuant tq2/tq3/tq4 KV cache, Triton and radix-cache serving, EAGLE3 speculative decoding, P-EAGLE checkpoint support, and PrismML Bonsai 1-bit GGUF compatibility on…
  Python · 1
- SpecForge (Public, forked from sgl-project/SpecForge)
  Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
  Python