Pinned

  1. llama.cpp-1-bit-turbo (Public)

    Forked from ggml-org/llama.cpp

    HIP/ROCm fork of llama.cpp with PrismML Bonsai Q1_0 and Q1_0_G128 1-bit GPU inference, TurboQuant TQ3_0 KV cache, and gfx1030/RDNA2 hardening.

    C++ · 3 stars

  2. sglang-1-bit-turbo (Public)

    Forked from sgl-project/sglang

    ROCm/HIP fork of SGLang with TurboQuant tq2/tq3/tq4 KV cache, Triton and radix-cache serving, EAGLE3 speculative decoding, P-EAGLE checkpoint support, and PrismML Bonsai 1-bit GGUF compatibility on…

    Python · 1 star

  3. SpecForge (Public)

    Forked from sgl-project/SpecForge

    Train speculative decoding models and port them smoothly to SGLang serving.

    Python