Organizations

@GGUFloader @local-ai-zone

hussainnazary2/README.md

I work on LLM inference at the engine and runtime level, focusing on performance, memory efficiency, and predictable behavior in production environments.

My experience includes optimizing inference across CPU and GPU backends, with hands-on use of CUDA, cuBLAS, cuBLASLt, and custom CUDA kernels for transformer workloads. I focus on practical improvements such as quantization-aware execution, efficient KV-cache management, memory allocation strategies, and optimized execution paths tailored to specific model architectures and hardware constraints.

I build and adapt local, cloud-independent inference systems, customizing runtimes for different model families and deployment requirements rather than relying on fixed abstractions. The goal is stable, efficient inference that makes full use of available hardware under real operational conditions.

Pinned

  1. GGUFloader/gguf-loader Public

    GGUF Loader with an Agentic Mode and a floating button for running AI models. Open source and offline. Models: Mistral, DeepSeek, Llama, Gemma, Qwen.

    Python 34 9

  2. local-ai-zone/local-ai-zone.github.io Public

    Discover the Best AI Models for Your PC

    HTML 21 9

  3. LLM-Toolkit Public

    Python 1

  4. GGUFloader/Mobile-AI-Assistant Public

    Lightweight, mobile-optimized AI assistant.

    Kotlin 1

  5. GPT-Calendar/smart-calendar Public

    Smart Calendar – An AI-powered Android assistant combining voice commands, finance tracking, location-based reminders, and task management. Features "Kiro" wake word detection, SMS transaction pars…

    Kotlin