An autonomous AI researcher that runs on any computer. No GPU required.
Fork of karpathy/autoresearch. The original needs an H100. This runs on your laptop while you sleep.
val_bpb: 2.287 (baseline) → 2.226 (after autonomous tuning)
- A small local LLM (Qwen 2.5 0.5B via prima.cpp server) suggests hyperparameter changes
- `train.py` runs a 5-minute training experiment
- If the result improves, it's committed automatically
- Repeat — ~12 experiments per hour, ~100 while you sleep
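The loop is simple enough to sketch in a few lines. In this illustration every helper is a stand-in (the real `agent.py` talks to the local LLM, edits `train.py`, and runs real training):

```python
import random

random.seed(0)

# Stand-ins for the real steps: the actual agent asks the local LLM for an
# edit, applies it to train.py, and runs a 5-minute training job.
def suggest_change():
    return {"lr": random.choice([1e-3, 3e-4, 1e-4])}

def run_training(change):
    # Pretend result; the real agent parses val_bpb from train.py's output.
    return 2.3 - 0.1 * random.random()

def research_loop(best_bpb=2.287, rounds=10):
    for _ in range(rounds):
        change = suggest_change()
        bpb = run_training(change)
        if bpb < best_bpb:
            best_bpb = bpb  # the real agent commits the change here
        # otherwise the change is reverted
    return best_bpb

print(research_loop())
```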
Sample output:

```
step 00142 (100.0%) | loss: 2.226145 | epoch: 0 | remaining: 0s
---
val_bpb: 2.226000
training_seconds: 300.0
num_params_M: 0.8
```
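The agent can recover the metrics from this summary block with a few lines of parsing (a sketch assuming the `key: value` format shown above):

```python
def parse_summary(text):
    """Parse `key: value` summary lines into a dict of floats."""
    metrics = {}
    for line in text.splitlines():
        if ":" in line and not line.startswith("step"):
            key, _, value = line.partition(":")
            try:
                metrics[key.strip()] = float(value)
            except ValueError:
                pass  # skip non-numeric lines
    return metrics

summary = "val_bpb: 2.226000\ntraining_seconds: 300.0\nnum_params_M: 0.8"
print(parse_summary(summary)["val_bpb"])  # 2.226
```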
```bash
# 1. Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# 2. Clone and install
git clone https://github.com/bopalvelut-prog/autoresearch.git
cd autoresearch && uv sync

# 3. Download data (one-time)
uv run prepare.py

# 4. Run a single experiment
uv run train.py

# 5. Start prima.cpp server (recommended)
prima-cli -m ~/.cache/autoresearch/qwen2.5-0.5b-instruct-q4_k_m.gguf \
    --port 8080 -c 2048 --threads 4

# 6. Let the agent run overnight
python agent.py
```

Works on Linux, macOS, and Windows. Auto-detects CPU / Apple Silicon / NVIDIA GPU.
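The auto-detection presumably boils down to something like this (an illustrative sketch, not the repo's actual code; the `ImportError` fallback keeps it runnable without torch installed):

```python
def pick_device():
    """Best available torch device: CUDA, then Apple MPS, then CPU."""
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"

print(pick_device())
```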
Note: Ollama is avoided — it's bloated (~2GB) and requires root. prima.cpp is lightweight (~150MB) and builds from source.
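Talking to the server is plain HTTP. The sketch below assumes an OpenAI-compatible `/v1/chat/completions` endpoint, as llama.cpp's server exposes; check that your prima.cpp build does the same:

```python
import json
import urllib.request

def build_payload(prompt, max_tokens=256):
    """Request body for an OpenAI-style chat completion."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
        "max_tokens": max_tokens,
    }

def ask_llm(prompt, url="http://127.0.0.1:8080/v1/chat/completions"):
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```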
Four files:

| File | Purpose |
|---|---|
| `prepare.py` | Data download, tokenizer, evaluation. Don't touch. |
| `train.py` | GPT model + optimizer + training loop. The agent edits this. |
| `program.md` | Instructions for the agent. You edit this. |
| `agent.py` | Autonomous research loop with prima.cpp + JSON logging. |
All experiments use a fixed 5-minute time budget. The metric is val_bpb (validation bits per byte) — lower is better.
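For reference, converting the model's cross-entropy loss (natural-log units, per token) into bits per byte is just a change of base plus normalizing by bytes instead of tokens (the repo's exact bookkeeping may differ):

```python
import math

def bits_per_byte(total_loss_nats, total_bytes):
    """Summed cross-entropy over the validation split, in bits per byte."""
    return total_loss_nats / (math.log(2) * total_bytes)

# A model at 8 bits per byte has learned nothing about the data:
print(bits_per_byte(math.log(2) * 8 * 1000, 1000))  # 8.0
```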
Every experiment is logged to:
- `results.tsv` — flat TSV for quick viewing
- `results/run_*.json` — structured JSON per run
- `results/experiments.csv` — aggregate CSV for analysis
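Any of these can be inspected by hand; for example, picking the best run out of `results.tsv` (the column names here are hypothetical):

```python
import csv
import io

# Stand-in for open("results.tsv"); the real column names may differ.
tsv = "run\tval_bpb\nrun_001\t2.287\nrun_002\t2.226\n"

rows = list(csv.DictReader(io.StringIO(tsv), delimiter="\t"))
best = min(rows, key=lambda r: float(r["val_bpb"]))
print(best["run"], best["val_bpb"])  # run_002 2.226
```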
View your leaderboard:
```bash
uv run leaderboard.py --format md --top 10
uv run leaderboard.py --format json --export
```

The defaults are conservative (`DEPTH=2`, 0.8M params). For faster machines:
```python
# In train.py:
DEPTH = 4                 # More layers = better quality, slower
TOTAL_BATCH_SIZE = 2**15  # 32768 tokens
DEVICE_BATCH_SIZE = 8
WINDOW_PATTERN = "L"      # Full attention (faster on beefy CPUs)
```

For weaker hardware (phones, old laptops):
```python
DEPTH = 1
TOTAL_BATCH_SIZE = 2**12  # 4096 tokens
MAX_SEQ_LEN = 128         # In prepare.py
```

Community forks:

| Fork | Platform | Notes |
|---|---|---|
| miolini/autoresearch-macos | macOS | MPS optimized |
| jsegov/autoresearch-win-rtx | Windows | NVIDIA RTX |
MIT. Built on karpathy/autoresearch.