why use many token when few do trick
Before/After • Install • Levels • Skills • Benchmarks • Evals
A Claude Code skill/plugin and Codex plugin that makes agent talk like caveman, cutting ~75% of output tokens while keeping full technical accuracy. Now with 文言文 mode, terse commits, one-line code reviews, and a compression tool that cuts ~45% of input tokens every session.
Based on the viral observation that caveman-speak dramatically reduces LLM token usage without losing technical substance. So we made it a one-line install.
Same fix. 75% less word. Brain still big.
Pick your level of grunt:
Same answer. You pick how many word.
```
┌───────────────────────────────────────┐
│ TOKENS SAVED        ████████   75%    │
│ TECHNICAL ACCURACY  ████████  100%    │
│ SPEED INCREASE      ████████   ~3x    │
│ VIBES               ████████   OOG    │
└───────────────────────────────────────┘
```
- Faster response → less token to generate = speed go brrr
- Easier to read → no wall of text, just the answer
- Same accuracy → all technical info kept, only fluff removed (science say so)
- Save money → ~71% less output token = less cost
- Fun β every code review become comedy
Install as a plugin (includes skills + auto-loading hooks; caveman activates every session, mode badge tracks `/caveman ultra` etc.):
```shell
claude plugin marketplace add JuliusBrussee/caveman
claude plugin install caveman@caveman
```

Or install the skills with npx:

```shell
npx skills add JuliusBrussee/caveman
```

For a specific agent: `npx skills add JuliusBrussee/caveman -a cursor`
Note
`npx skills` installs skills only (no hooks). For Claude Code auto-loading hooks, use the plugin install above or run `bash hooks/install.sh`.
- Clone repo → Open Codex in repo → `/plugins` → Search "Caveman" → Install
Note
Windows Codex users: Clone repo → VS Code → Codex Settings → Plugins → find Caveman under local marketplace → Install → Reload Window. Also run `git config core.symlinks true` before cloning (requires developer mode or admin).
Install once. Use in all sessions after that. One rock. That it.
Add a `[CAVEMAN:ULTRA]` badge to your statusline showing which mode is active. See `hooks/README.md` for the snippet.
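The real snippet ships in `hooks/README.md`; as a rough illustration only (the `badge` helper and its mode-string input are assumptions here, not the plugin's actual contract), a statusline command boils down to something like:

```shell
# Hypothetical statusline helper: turn the active mode string into a badge.
# Illustrative sketch only -- see hooks/README.md for the real snippet.
badge() {
  if [ -n "$1" ]; then
    # Uppercase the mode name: "ultra" becomes [CAVEMAN:ULTRA]
    printf '[CAVEMAN:%s]' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')"
  else
    # No active mode recorded
    printf '[CAVEMAN:OFF]'
  fi
}

badge ultra   # prints [CAVEMAN:ULTRA]
```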
Trigger with:
- `/caveman` (Codex: `$caveman`)
- "talk like caveman"
- "caveman mode"
- "less tokens please"
Stop with: "stop caveman" or "normal mode"
| Level | Trigger | What it do |
|---|---|---|
| Lite | `/caveman lite` | Drop filler, keep grammar. Professional but no fluff |
| Full | `/caveman full` | Default caveman. Drop articles, fragments, full grunt |
| Ultra | `/caveman ultra` | Maximum compression. Telegraphic. Abbreviate everything |
Classical Chinese literary compression: same technical accuracy, but in the most token-efficient written language humans ever invented.
| Level | Trigger | What it do |
|---|---|---|
| Wenyan-Lite | `/caveman wenyan-lite` | Semi-classical. Grammar intact, filler gone |
| Wenyan-Full | `/caveman wenyan` | Full 文言文. Maximum classical terseness |
| Wenyan-Ultra | `/caveman wenyan-ultra` | Extreme. Ancient scholar on a budget |
Level stick until you change it or session end.
| Skill | What it do | Trigger |
|---|---|---|
| caveman-commit | Terse commit messages. Conventional Commits. ≤50 char subject. Why over what. | `/caveman-commit` |
| caveman-review | One-line PR comments: `L42: 🔴 bug: user null. Add guard.` No throat-clearing. | `/caveman-review` |
Caveman make Claude speak with fewer tokens. Compress make Claude read fewer tokens.
Your CLAUDE.md loads on every session start. Caveman Compress rewrites memory files into caveman-speak so Claude reads less, without you losing the human-readable original.
/caveman:compress CLAUDE.md
- `CLAUDE.md` → compressed (Claude reads this every session → fewer tokens)
- `CLAUDE.original.md` → human-readable backup (you read and edit this)
| File | Original (tokens) | Compressed (tokens) | Saved |
|---|---|---|---|
| `claude-md-preferences.md` | 706 | 285 | 59.6% |
| `project-notes.md` | 1145 | 535 | 53.3% |
| `claude-md-project.md` | 1122 | 687 | 38.8% |
| `todo-list.md` | 627 | 388 | 38.1% |
| `mixed-with-code.md` | 888 | 574 | 35.4% |
| **Average** | 898 | 494 | 45% |
Code blocks, URLs, file paths, commands, headings, dates, version numbers: anything technical passes through untouched. Only prose gets compressed. See the full caveman-compress README for details. Security note: Snyk flags this as High Risk due to subprocess/file patterns; it's a false positive.
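The pass-through rule can be sketched in a few lines. This is an illustrative toy, not the skill's actual algorithm: the filler-word list, the heading/URL checks, and the fence detection are all assumptions chosen to show the shape of the idea.

```python
import re

# Toy sketch of compress: prose loses a few filler words; fenced code,
# headings, and lines containing URLs pass through byte-for-byte.
FILLER = re.compile(r"\b(the|a|an|really|very|just|basically)\b\s*", re.IGNORECASE)

def compress(text: str) -> str:
    out, in_code = [], False
    for line in text.splitlines():
        if line.lstrip().startswith("```"):
            in_code = not in_code        # entering or leaving a code fence
            out.append(line)
        elif in_code or line.startswith("#") or "http" in line:
            out.append(line)             # technical content: untouched
        else:
            out.append(FILLER.sub("", line).strip())
    return "\n".join(out)
```

For example, `compress("just run the build now")` returns `"run build now"`, while a fenced block containing the same words comes back unchanged.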
Real token counts from the Claude API (reproduce it yourself):
| Task | Normal (tokens) | Caveman (tokens) | Saved |
|---|---|---|---|
| Explain React re-render bug | 1180 | 159 | 87% |
| Fix auth middleware token expiry | 704 | 121 | 83% |
| Set up PostgreSQL connection pool | 2347 | 380 | 84% |
| Explain git rebase vs merge | 702 | 292 | 58% |
| Refactor callback to async/await | 387 | 301 | 22% |
| Architecture: microservices vs monolith | 446 | 310 | 30% |
| Review PR for security issues | 678 | 398 | 41% |
| Docker multi-stage build | 1042 | 290 | 72% |
| Debug PostgreSQL race condition | 1200 | 232 | 81% |
| Implement React error boundary | 3454 | 456 | 87% |
| Average | 1214 | 294 | 65% |
Range: 22%–87% savings across prompts.
Important
Caveman only affects output tokens; thinking/reasoning tokens are untouched. Caveman no make brain smaller. Caveman make mouth smaller. Biggest win is readability and speed; cost savings are a bonus.
A March 2026 paper "Brevity Constraints Reverse Performance Hierarchies in Language Models" found that constraining large models to brief responses improved accuracy by 26 percentage points on certain benchmarks and completely reversed performance hierarchies. Verbose not always better. Sometimes less word = more correct.
Caveman not just claim 75%. Caveman prove it.
The `evals/` directory has a three-arm eval harness that measures real token compression against a proper control: not just "verbose vs skill" but "terse vs skill". Because comparing caveman to verbose Claude conflates the skill with generic terseness. That cheating. Caveman not cheat.
```shell
# Run the eval (needs claude CLI)
uv run python evals/llm_run.py

# Read results (no API key, runs offline)
uv run --with tiktoken python evals/measure.py
```

Snapshots committed to git. CI runs free. Every number change reviewable as diff. Add a skill, add a prompt → harness pick it up automatically.
If caveman save you mass token, mass money, leave mass star. ⭐
- Cavekit: specification-driven development for Claude Code. Caveman language → specs → parallel builds → working software.
- Revu β local-first macOS study app with FSRS spaced repetition, decks, exams, and study guides. revu.cards
MIT. Free like mass mammoth on open plain.
