- 📖 Overview
- 📜 Scripts
- 🆓 Free API Providers
- 💻 Local Model Providers
- 🔀 API Proxies
- 📚 Detailed Tool Guides
- 🖥️ AI-Enhanced Terminals
This repository is your comprehensive guide to getting the most out of AI tools in your terminal. It contains curated scripts, expert tips, and detailed guides for terminal-based AI development.
💡 Pro Tip: This is a companion to the awesome-terminals-ai list: your one-stop resource for terminal AI tools!
Useful scripts to enhance your AI terminal workflow:
| Script | Description | Guide |
|---|---|---|
| 📊 copilot-usage.sh | Check your GitHub Copilot usage and quota | Copilot CLI Guide |
| 🤖 run-claude-copilot.sh | Run Claude Code with GitHub Copilot models | See below ⬇️ |
Access powerful Google Gemini models with generous free tier limits:
| Feature | Gemini 2.5 Pro (Free) | Gemini 2.5 Flash (Free) |
|---|---|---|
| ⚡ Rate Limit | 2 requests/minute | 15 requests/minute |
| 📅 Daily Limit | 50 requests/day | 1,500 requests/day |
- 📖 Rate Limits Documentation
- 🔑 Create API Key
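A new key can be sanity-checked with a single curl call to the REST API. The sketch below is a hedged example, not the official quickstart: it assumes `GEMINI_API_KEY` is exported, uses the publicly documented `generateContent` endpoint and `x-goog-api-key` header, and only prints the request so you can review it before sending.

```shell
#!/bin/sh
# Sketch: compose a Gemini generateContent request (printed, not sent).
# Assumes GEMINI_API_KEY is exported in your shell.
MODEL="gemini-2.5-flash"
URL="https://generativelanguage.googleapis.com/v1beta/models/${MODEL}:generateContent"
BODY='{"contents":[{"parts":[{"text":"Reply with one word: hello"}]}]}'

# Print a paste-ready command; run the printed line to actually send the request.
cat <<EOF
curl -s -X POST "$URL" \\
  -H "Content-Type: application/json" \\
  -H "x-goog-api-key: \$GEMINI_API_KEY" \\
  -d '$BODY'
EOF
```

The JSON response places the generated text under `candidates[0].content.parts[0].text`, which is easy to extract with `jq`.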
GitHub provides two types of AI model access for developers:
- 🤖 GitHub Copilot Models
- 🛒 GitHub Market Models
Overview:
- 🌐 Endpoint: `https://api.githubcopilot.com`
- 📖 Documentation: Supported Models
- ⚡ Rate Limits: see Individual Plan Comparison
Premium request limits (per month):
| Feature | GitHub Copilot Free | GitHub Copilot Pro | GitHub Copilot Pro+ |
|---|---|---|---|
| Premium requests | 0 per month | 300 per month | 1,500 per month |
ℹ️ Exact limits and availability may change over time; always confirm via the official docs above.
Model multipliers:
- 📖 Model Multipliers Documentation
- Models (accessible via API) with a 0× multiplier on paid plans (not counted toward premium usage): `gpt-4.1`, `gpt-5-mini`, `gpt-4o`
⚙️ Note: Some models need to be enabled at GitHub Copilot Features Settings before they become available for use.
⚠️ Integration Note: The endpoint `https://api.githubcopilot.com` exposes an OpenAI-compatible interface authenticated with a GitHub OAuth Access Token (prefixed with `gho_`). The open-source proxy 🔌 Copilot API Bridge, authenticated with a GitHub User Access Token (prefixed with `ghu_`), provides both OpenAI- and Anthropic-compatible interfaces.
List available models:

```shell
curl -L \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Bearer ${OAUTH_TOKEN}" \
  https://api.githubcopilot.com/models | jq -r '.data[].id'
```

Overview:
- 🌐 Endpoint: `https://models.github.ai/inference`
- 🛒 Browse: GitHub Marketplace Models
- 📏 Rate Limits: 4k input tokens, 4k output tokens per request
List available models:

```shell
curl -L \
  -H "Accept: application/vnd.github+json" \
  -H "Authorization: Bearer ${OAUTH_TOKEN}" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  https://models.github.ai/catalog/models | jq -r '.[].id'
```

OpenRouter provides unified API access to multiple AI models: try different models through one API to find your best fit!
🔍 Browse Free Models
| Model | Link |
|---|---|
| GPT OSS 20B | Try it |
| GPT OSS 120B | Try it |
| GLM 4.5 Air | Try it |
| Qwen3 Next 80B A3B Instruct | Try it |
Setup: 🔑 Generate API Key
💡 Rate Limits:
- With 10+ credits purchased: 1,000 requests/day
- Otherwise: 50 requests/day
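Because the unified API is OpenAI-compatible, a chat completion is one curl call. A minimal sketch, assuming `OPENROUTER_API_KEY` is exported; the `:free` suffix shown on the sample model is how OpenRouter addresses a model's free variant, and the request is printed for review rather than sent.

```shell
#!/bin/sh
# Sketch: compose an OpenRouter chat-completion request (printed, not sent).
# Assumes OPENROUTER_API_KEY is exported; ":free" selects the free model variant.
MODEL="openai/gpt-oss-20b:free"
URL="https://openrouter.ai/api/v1/chat/completions"
BODY=$(printf '{"model":"%s","messages":[{"role":"user","content":"Say hi"}]}' "$MODEL")

# Print a paste-ready command; run the printed line to actually send the request.
cat <<EOF
curl -s "$URL" \\
  -H "Authorization: Bearer \$OPENROUTER_API_KEY" \\
  -H "Content-Type: application/json" \\
  -d '$BODY'
EOF
```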
Groq offers high-speed inference with free tier access.
Available models, from the Rate Limits documentation:
- `openai/gpt-oss-120b`
- `openai/gpt-oss-20b`
- `qwen/qwen3-32b`
- `moonshotai/kimi-k2-instruct-0905`

Setup: 🔑 Generate API Key
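Groq also exposes an OpenAI-compatible API, so the models your key can reach are one curl away. A hedged sketch, assuming `GROQ_API_KEY` is exported; the command is printed for review rather than executed.

```shell
#!/bin/sh
# Sketch: list models available to your Groq key (printed, not sent).
# Assumes GROQ_API_KEY is exported.
URL="https://api.groq.com/openai/v1/models"

# Print a paste-ready command; run the printed line to actually send the request.
cat <<EOF
curl -s "$URL" -H "Authorization: Bearer \$GROQ_API_KEY" | jq -r '.data[].id'
EOF
```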
NVIDIA Build provides free API access to a wide selection of AI models optimized on NVIDIA infrastructure.
| Model | Full Model Name | Link |
|---|---|---|
| Qwen3.5 397B-A17B | `qwen/qwen3.5-397b-a17b` | Try it |
| MiniMax M2.5 | `minimaxai/minimax-m2.5` | Try it |
| Kimi K2.5 | `moonshotai/kimi-k2.5` | Try it |
| GLM-5 | `z-ai/glm5` | Try it |
| GPT-OSS 120B | `openai/gpt-oss-120b` | Try it |
| DeepSeek V3.2 | `deepseek-ai/deepseek-v3_2` | Try it |
Setup:
- 🔑 Generate API Key
- 🔍 Browse All Models

💡 Note: Use the full model name (with namespace) when making API requests.
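To illustrate the namespaced naming, here is a hedged sketch of a chat completion against NVIDIA Build's OpenAI-compatible endpoint. It assumes `NVIDIA_API_KEY` is exported and prints the request instead of sending it.

```shell
#!/bin/sh
# Sketch: NVIDIA Build chat-completion request (printed, not sent).
# Assumes NVIDIA_API_KEY is exported; note the namespace in the model name.
MODEL="openai/gpt-oss-120b"
URL="https://integrate.api.nvidia.com/v1/chat/completions"
BODY=$(printf '{"model":"%s","messages":[{"role":"user","content":"Hello"}]}' "$MODEL")

# Print a paste-ready command; run the printed line to actually send the request.
cat <<EOF
curl -s "$URL" \\
  -H "Authorization: Bearer \$NVIDIA_API_KEY" \\
  -H "Content-Type: application/json" \\
  -d '$BODY'
EOF
```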
Ollama now provides cloud-hosted models via API access, offering powerful AI capabilities without the need for local infrastructure. These models are accessible through a simple API and integrate seamlessly with popular AI coding tools.
💰 Pricing:
- 🆓 Free Plan: available with hourly and daily usage limits
- 💳 Pay-per-use: no upfront costs or hardware investment required
| Model | Full Name | Use Case |
|---|---|---|
| 💻 Qwen3.5 | `qwen3.5` | Multimodal models delivering exceptional utility and performance |
| 🎯 MiniMax M2.5 | `minimax-m2.5` | Designed for real-world productivity and coding tasks |
| 🌙 Kimi K2.5 | `kimi-k2.5` | Multimodal agentic model with vision, language, and thinking modes |
| ⚡ GLM 5 | `glm-5` | Strong reasoning and agentic model for complex systems engineering |
| 🤖 DeepSeek V3.2 | `deepseek-v3.2` | High computational efficiency with superior reasoning and agent performance |
| 🔥 GPT-OSS 20B | `gpt-oss:20b` | Powerful reasoning, agentic tasks, and versatile developer use cases |
| 🚀 GPT-OSS 120B | `gpt-oss:120b` | Powerful reasoning, agentic tasks, and versatile developer use cases |
| 🌟 Gemini 3 Flash Preview | `gemini-3-flash-preview` | Frontier intelligence built for speed at a fraction of the cost |
Ollama Cloud Models integrate seamlessly with popular AI coding tools and IDEs through native integrations and OpenAI-compatible APIs:
🎯 Supported AI Coding Tools & IDEs:
| Tool | Integration Type | Documentation |
|---|---|---|
| VS Code | Native Extension | View Guide |
| JetBrains | Native Plugin | View Guide |
| Codex | API Integration | View Guide |
| Cline | API Integration | View Guide |
| Droid | API Integration | View Guide |
| Goose | API Integration | View Guide |
| Zed | Native Extension | View Guide |
Key Benefits:
- OpenAI-compatible API - Use existing OpenAI client libraries
- Direct terminal integration - Run queries from command line
- No local setup required - Access powerful models via API
- Cost-effective - Pay-per-use without hardware investment
- Zero local storage - Models run in the cloud
Example API Usage:

```shell
# Query via REST API
curl https://api.ollama.ai/v1/chat/completions \
  -H "Authorization: Bearer ${OLLAMA_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-coder:480b",
    "messages": [
      {"role": "user", "content": "Write a Python function to parse JSON"}
    ]
  }'
```

Setup:
- 🔑 Generate API Key
- 📖 View Documentation
- 🆓 Free Plan Available: includes hourly and daily usage limits
💡 Pro Tip: Most integrations support both local and cloud models. For cloud models, append `-cloud` to the model name in your tool's configuration.
Ollama - Lightweight framework for running LLMs locally via command line.
Key Features:
- ⚡ Simple CLI interface
- 🌐 RESTful API
- 🐳 Docker-like model management
- 🤖 Popular models: LLaMA, Gemma, DeepSeek
- 🔌 OpenAI-compatible API
- 🖥️ Cross-platform support
Ollama also provides access to cloud-hosted models via the `ollama` command. Simply append `-cloud` (or `:cloud` for some models) to the model name:

```shell
# Example: Run a cloud-hosted model
ollama run qwen3-coder:480b-cloud
```

For details, see the 🦙 Ollama Cloud Models section.
Model Sizes:
| Model | Size |
|---|---|
| gpt-oss:120b | 65 GB |
| gpt-oss:20b | 13 GB |
| qwen3:8b | 5.2 GB |
| qwen3:30b | 18 GB |
Performance Benchmark (tokens/second):
| Machine | gpt-oss:120b | gpt-oss:20b | qwen3:8b | qwen3:30b |
|---|---|---|---|---|
| 🖥️ Windows PC (Intel i9) | - | 15 t/s | 12 t/s | 22 t/s |
| 💻 MacBook Pro (M3 Max) | - | 70 t/s | 57 t/s | 74 t/s |
| 🖥️ Linux Server (Dual RTX 4090) | 36 t/s | 156 t/s | 140 t/s | 163 t/s |
📋 Machine Specifications

- Windows PC (Intel i9):
  - CPU: Intel i9-12900
  - GPU: Intel UHD Graphics 770 (2 GB)
  - RAM: 64 GB
- MacBook Pro (M3 Max):
  - Apple M3 Max with 64 GB RAM
- Linux Server (Dual RTX 4090):
  - CPU: Xeon(R) w7-3445 (40 CPUs)
  - GPU: 2 × Nvidia RTX 4090
  - RAM: 128 GB
LM Studio - User-friendly desktop GUI for running local LLMs with no technical setup required.
Key Features:
- 🛍️ Model marketplace
- 🔌 OpenAI-compatible API server
- 💬 Chat interface
- 📦 GGUF model support
- 💰 Free for personal & commercial use
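Once the API server is enabled in LM Studio, any OpenAI client or plain curl can talk to it on localhost (port 1234 by default). A minimal sketch: `local-model` is a hypothetical placeholder for whichever model you have loaded, and the request is printed for review rather than sent.

```shell
#!/bin/sh
# Sketch: query LM Studio's local OpenAI-compatible server (printed, not sent).
# "local-model" is a placeholder; LM Studio serves whichever model is loaded.
URL="http://localhost:1234/v1/chat/completions"
BODY='{"model":"local-model","messages":[{"role":"user","content":"Say hi"}]}'

# Print a paste-ready command; run the printed line once the server is up.
cat <<EOF
curl -s "$URL" -H "Content-Type: application/json" -d '$BODY'
EOF
```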
Most AI tools support OpenAI-compatible APIs. For tools requiring Anthropic-compatible APIs, these solutions provide compatibility:
Claude Code Router - Routes Claude Code requests to different models with request customization.
📦 Installation (Linux/macOS)

```shell
# Install Claude Code CLI (prerequisite)
npm install -g @anthropic-ai/claude-code

# Install Claude Code Router
npm install -g @musistudio/claude-code-router
```

⚙️ Configuration Examples

Create `~/.claude-code-router/config.json` with your preferred providers:
```json
{
  "LOG": true,
  "API_TIMEOUT_MS": 600000,
  "Providers": [
    {
      "name": "gemini",
      "api_base_url": "https://generativelanguage.googleapis.com/v1beta/models/",
      "api_key": "$GEMINI_API_KEY",
      "models": ["gemini-2.5-flash", "gemini-2.5-pro"],
      "transformer": { "use": ["gemini"] }
    },
    {
      "name": "openrouter",
      "api_base_url": "https://openrouter.ai/api/v1/chat/completions",
      "api_key": "$OPENROUTER_API_KEY",
      "models": ["google/gemini-2.5-pro-preview", "anthropic/claude-sonnet-4"],
      "transformer": { "use": ["openrouter"] }
    },
    {
      "name": "grok",
      "api_base_url": "https://api.x.ai/v1/chat/completions",
      "api_key": "$GROK_API_KEY",
      "models": ["grok-beta"]
    },
    {
      "name": "github-copilot",
      "api_base_url": "https://api.githubcopilot.com/chat/completions",
      "api_key": "$GITHUB_TOKEN",
      "models": ["gpt-4o", "claude-3-7-sonnet", "o1-preview"]
    },
    {
      "name": "github-marketplace",
      "api_base_url": "https://models.github.ai/inference/chat/completions",
      "api_key": "$GITHUB_TOKEN",
      "models": ["openai/gpt-4o", "openai/o1-preview", "xai/grok-3"]
    },
    {
      "name": "ollama",
      "api_base_url": "http://localhost:11434/v1/chat/completions",
      "api_key": "ollama",
      "models": ["qwen3:30b", "gpt-oss:20b", "llama3.2:latest"]
    }
  ],
  "Router": {
    "default": "gemini,gemini-2.5-flash",
    "background": "ollama,qwen3:30b",
    "longContext": "openrouter,google/gemini-2.5-pro-preview"
  }
}
```

💻 Usage Commands

```shell
# Start Claude Code with router
ccr code

# Use UI mode for configuration
ccr ui

# Restart after config changes
ccr restart

# Switch models dynamically in Claude Code
/model ollama,llama3.2:latest
```

⚠️ Known Issue: The proxy for Ollama models does not work properly with Claude Code.
The GitHub Copilot API (`https://api.githubcopilot.com`) exposes an OpenAI-compatible interface authenticated with a GitHub OAuth Access Token (prefixed with `gho_`). copilot-api, an open-source proxy authenticated with a GitHub User Access Token (prefixed with `ghu_`), provides the necessary bridge: it exposes both an OpenAI-compatible and an Anthropic-compatible interface at the endpoint `http://localhost:4141`.
Installation and Authentication:

```shell
# Install copilot-api globally
npm install -g copilot-api

# Device authentication
copilot-api auth

# Start the API proxy
copilot-api start
```

The copilot-api tool is also available in specialized environments, such as the modern-linuxtools Singularity image on CVMFS.

CVMFS Setup:

```shell
# Set up the environment
source /cvmfs/atlas.sdcc.bnl.gov/users/yesw/singularity/alma9-x86/modern-linuxtools/setupMe.sh

# Then use copilot-api as normal
copilot-api auth
copilot-api start
```

💻 Usage Examples

```shell
# Use with Aider
export ANTHROPIC_BASE_URL=http://localhost:4141 && aider --no-git --anthropic-api-key dummy --model anthropic/claude-sonnet-4.5

# Or use with Claude Code CLI
export ANTHROPIC_BASE_URL=http://localhost:4141 ANTHROPIC_AUTH_TOKEN=dummy ANTHROPIC_MODEL=claude-sonnet-4.5 && claude
```

📝 Important Notes:
- Use your own URL in `ANTHROPIC_BASE_URL` and remove any trailing `/`
- Enable X11 forwarding when SSH-ing: `ssh -X username@hostname`
- All GitHub Copilot models (excluding Market models) become accessible
For a streamlined experience, this script automates the entire setup process for using Claude Code with GitHub Copilot models.
✨ Key Features:
| Feature | Description |
|---|---|
| 📦 Auto Dependency Management | Installs nvm, npm, copilot-api, and claude-code |
| ⚡ Simplified Usage | Single command to start a fully configured Claude session |
| 🎯 Model Selection | Specify which Copilot model to use |
| 🛠️ Utility Functions | Check usage, list models, update packages |
| 🔄 Transparent Args | Forwards arguments directly to the claude command |
💻 Usage Examples:

```shell
# Run Claude with default settings
./scripts/run-claude-copilot.sh

# List available Copilot models
./scripts/run-claude-copilot.sh --list-models

# Check your Copilot API usage
./scripts/run-claude-copilot.sh --check-usage

# Run Claude with a specific model and pass a prompt
./scripts/run-claude-copilot.sh --model claude-sonnet-4 -- -p "Explain quantum computing"

# Get help on the script's options
./scripts/run-claude-copilot.sh --help

# Get help on Claude's own options
./scripts/run-claude-copilot.sh -- --help
```

Comprehensive documentation for each AI terminal tool:
| Tool | Description | Guide |
|---|---|---|
| 🤖 Aider | AI pair programming in your terminal | Read Guide |
| 🤖 GitHub Copilot CLI | Copilot coding agent directly in your terminal | Read Guide |
| ✨ Gemini CLI | Google's Gemini in your terminal | Read Guide |
| 🐪 Qwen Code | Qwen3-Coder models in your terminal | Read Guide |
AI-first terminal that integrates intelligent agents directly into the command line.
✨ Key Features:
| Feature | Description |
|---|---|
| 💬 Natural Language Commands | Generate commands with the `#` trigger |
| 🤖 Real-time AI | Autosuggestions and error detection |
| 🎤 Voice Commands | Multi-agent parallel workflows |
| 🏢 Enterprise Ready | SAML SSO, BYOL, zero data retention |

📊 Usage Limits:
- 🆓 Free tier: 150 requests/month
- 💳 Paid plans available for higher usage
📦 Installation:

```shell
brew install --cask warp     # macOS
winget install Warp.Warp     # Windows

# Linux: multiple package formats available
# See: https://www.warp.dev/blog/warp-for-linux
# Packages include: .deb (apt), .rpm (yum/dnf/zypper), Snap, Flatpak, AppImage, and AUR
```

Open-source terminal that brings graphical capabilities into the command line.
✨ Key Features:
| Feature | Description |
|---|---|
| 🖼️ Inline Previews | Images, markdown, CSV, video files |
| 📝 VSCode-like Editor | Integrated editor for remote files |
| 🌐 Built-in Browser | Web browser and SSH connection manager |
| 📊 Custom Widgets | Dashboard creation capabilities |
| 🖥️ Cross-platform | Local data storage for privacy |
🤖 AI Integration:
- ✅ Built-in AI assistance for command suggestions
- ⚙️ Configurable AI models via "Add AI preset..."
- 🦙 Support for Ollama and other local models
- 🎯 Context-aware recommendations

📦 Installation:
Download from waveterm.dev/download
Available as: Snap, AppImage, .deb, .rpm, and Windows installers
Native AI integration for macOS's most popular terminal emulator.
✨ Key Features:
- 🧠 Built-in AI Chat: Interact with LLMs directly within iTerm2 windows
- ✍️ Command Composer: Describe what you want in English, and it generates the shell command
- 📖 Code Explanation: Highlight output or commands to get instant explanations
- 🔑 BYOK: Bring Your Own Key (OpenAI, Gemini, etc.) for privacy and control
📦 Setup:
- Install iTerm2 (v3.5+)
- Install the AI plugin (Settings > General > AI > Install)
- Configure your provider (OpenAI, Gemini, etc.) and API key
- Start using the AI features:
  - Cmd + Y: Command Generator (the AI generates the shell command but does not run it)
  - Cmd + Shift + Y: AI Chat
    - Use this for multi-turn conversations.
    - If you grant permission, the chat can read your current terminal history and error messages to give context-aware answers.
An AI-powered, non-intrusive terminal assistant that works wherever tmux runs.
✨ Key Features:
- 🌐 Universal Compatibility: works with any terminal emulator via tmux
- 💻 Non-Intrusive: runs in a separate pane or window, keeping your workflow clean
- 🤖 Model Flexibility: supports OpenAI and other compatible APIs
- ⌨️ Keyboard Centric: designed for efficiency with tmux keybindings

📦 Installation:
Prerequisite: tmux must be installed.
Follow the instructions at github.com/alvinunreal/tmuxai
Made with ❤️ by the Community
⭐ Star on GitHub | 🐛 Report Issues | 💡 Contribute
Supercharge your terminal workflow! 🚀