Pinned
- llama.cpp-1-bit-turbo (Public, forked from ggml-org/llama.cpp)
  HIP/ROCm fork of llama.cpp with PrismML Bonsai Q1_0 and Q1_0_G128 1-bit GPU inference, TurboQuant TQ3_0 KV cache, and gfx1030/RDNA2 hardening.
  C++ · 3
- sglang-1-bit-turbo (Public, forked from sgl-project/sglang)
  ROCm/HIP fork of SGLang with TurboQuant tq2/tq3/tq4 KV cache, Triton and radix-cache serving, EAGLE3 speculative decoding, P-EAGLE checkpoint support, and PrismML Bonsai 1-bit GGUF compatibility on…
  Python · 1
- SpecForge (Public, forked from sgl-project/SpecForge)
  Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
  Python