LLM101n: Let's build a Storyteller

LLM101n header image

> What I cannot create, I do not understand. — Richard Feynman

In this course, we build a Storyteller AI Large Language Model (LLM) from scratch — from a one-line bigram model all the way to a deployed, multimodal web app. Everything is implemented end-to-end in Python with minimal prerequisites. By the end you will have a deep, hands-on understanding of how modern LLMs work.

The training corpus throughout is TinyStories — a dataset of short children's stories — keeping experiments fast enough to run on a laptop while still producing meaningful results.


Syllabus

| #  | Chapter | Key concepts |
|----|---------|--------------|
| 01 | Bigram Language Model | language modeling, NLL loss, character-level tokenization |
| 02 | Micrograd | scalar autodiff, backpropagation from scratch |
| 03 | N-gram MLP | multi-layer perceptron, matmul, GELU |
| 04 | Attention | self-attention, softmax, positional encoding |
| 05 | Transformer | GPT-2 architecture, residual connections, LayerNorm |
| 06 | Tokenization | Byte Pair Encoding (BPE), minBPE |
| 07 | Optimization | weight initialization, AdamW, LR schedules |
| 08 | Need for Speed I: Device | CPU vs GPU, device-agnostic PyTorch |
| 09 | Need for Speed II: Precision | mixed precision, fp16, bf16, fp8 |
| 10 | Need for Speed III: Distributed | DDP, ZeRO, DeepSpeed |
| 11 | Datasets | data loading, synthetic data generation |
| 12 | Inference I: KV-Cache | key-value cache, autoregressive generation |
| 13 | Inference II: Quantization | INT8/INT4 quantization |
| 14 | Finetuning I: SFT | supervised finetuning, PEFT, LoRA, chat format |
| 15 | Finetuning II: RL | RLHF, PPO, DPO |
| 16 | Deployment | FastAPI server, streaming, web UI |
| 17 | Multimodal | VQVAE, diffusion transformer, image+text |
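The course opens with the bigram model of chapter 01. As a taste of where it starts, here is an illustrative toy sketch of a character-level bigram model with an NLL loss (the corpus string below is made up for the example; this is not the course's ch01 code):

```python
import math
from collections import Counter

# Toy character-level bigram model: count adjacent character pairs,
# then estimate P(next | current) by maximum likelihood.
text = "once upon a time there was a tiny story. the end."

counts = Counter(zip(text, text[1:]))   # bigram counts
totals = Counter(text[:-1])             # how often each char appears as "current"

def prob(cur: str, nxt: str) -> float:
    """Maximum-likelihood estimate of P(nxt | cur)."""
    return counts[(cur, nxt)] / totals[cur] if totals[cur] else 0.0

# Average negative log-likelihood of the training text under the model.
# Every observed bigram has nonzero probability, so the log is defined.
nll = -sum(math.log(prob(c, n)) for c, n in zip(text, text[1:])) / (len(text) - 1)
```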

Repository Layout

```text
LLM101n/
├── chNN.md          # Chapter narratives + embedded Python code (kept in sync via inject.py)
├── codes/
│   ├── inject.py    # Syncs named blocks from codes/chNN/main.py → chNN.md
│   ├── extract.py   # Legacy: extracted markdown → main.py (no longer the active workflow)
│   ├── chNN/
│   │   ├── main.py  # Runnable script — SOURCE OF TRUTH for code
│   │   └── run.log  # Expected output
│   └── data/        # Shared datasets, checkpoints, tokenizers
└── llm101n.jpg
```

`codes/chNN/main.py` is the source of truth. Edit the Python scripts directly; run `inject.py` to sync changes back into the markdown chapter files.


Getting Started

Prerequisites

  • Python 3.10+
  • A GPU is helpful but not required for the early chapters

Setup

```bash
git clone https://github.com/bagustris/LLM101n.git
cd LLM101n

# Create and activate the virtual environment
uv venv codes/.venv
source codes/.venv/bin/activate   # Windows: codes\.venv\Scripts\activate

uv pip install torch datasets transformers tqdm fastapi uvicorn
```

Running a chapter

```bash
source codes/.venv/bin/activate
cd codes/ch01
python main.py
```

Each chapter is self-contained. Chapter 01 downloads the TinyStories dataset on the first run and saves it to codes/data/ so subsequent chapters can reuse it without hitting the network again.

Editing code and syncing to markdown

`codes/chNN/main.py` is the source of truth. Edit it directly, then sync named blocks back into the chapter markdown:

```bash
cd codes
python inject.py            # sync all chapters
python inject.py ch05       # sync one chapter
python inject.py --dry-run  # preview diffs without writing
python inject.py --status   # show which blocks are marked
```

Note on block markers: in `main.py`, wrap editable sections with `# === block: <name> ===` / `# === /block: <name> ===`. In the markdown, wrap the matching code fence with `<!-- block: <name> -->` / `<!-- /block: <name> -->`. `inject.py` replaces only those fenced regions.
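To make the marker convention concrete, here is a minimal sketch of how such syncing could be implemented with regular expressions. This is an illustration of the mechanism, not the actual `inject.py`; the function name `sync_blocks` is made up:

```python
import re

# Matches "# === block: <name> ===" ... "# === /block: <name> ===" in a .py file.
PY_BLOCK = re.compile(
    r"# === block: (?P<name>\S+) ===\n(?P<body>.*?)# === /block: (?P=name) ===",
    re.DOTALL,
)
# Matches the corresponding HTML-comment-wrapped fence in the markdown.
MD_BLOCK_TMPL = r"(<!-- block: {name} -->\n```python\n).*?(\n```\n<!-- /block: {name} -->)"

def sync_blocks(py_src: str, md_src: str) -> str:
    """Replace each marked fenced region in md_src with the matching block from py_src."""
    for m in PY_BLOCK.finditer(py_src):
        name, body = m.group("name"), m.group("body").rstrip("\n")
        pat = re.compile(MD_BLOCK_TMPL.format(name=re.escape(name)), re.DOTALL)
        md_src = pat.sub(lambda mm: mm.group(1) + body + mm.group(2), md_src)
    return md_src
```

Because only the region between the comment markers is rewritten, the surrounding chapter prose in the markdown is left untouched.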


Shared Data (codes/data/)

| File / Directory | Description |
|------------------|-------------|
| tinystories_train.txt | 50 K training stories |
| tinystories_val.txt | 5 K validation stories |
| gpt_tinystories.pt | Pretrained GPT-2 checkpoint on TinyStories |
| tinystories_bpe_tokenizer.json | BPE tokenizer (128-token vocab) |
| lora_adapter/ | Saved LoRA adapter (rank=8, alpha=16) |
| vqvae_cifar10.pt | Pretrained VQVAE for CIFAR-10 (ch17) |
| cifar-10-batches-py/ | CIFAR-10 image dataset (ch17) |
| server.py | FastAPI streaming text-generation server (ch16) |
| frontend.html | Web UI (ch16) |
| Dockerfile | Container for the deployed server (ch16) |
| ds_config.json | DeepSpeed config for distributed training (ch10) |

Appendix — Topics to Explore Further

  • Programming languages: Assembly, C, Python internals
  • Data types: Integer, Float, String (ASCII, Unicode, UTF-8)
  • Tensors: shapes, views, strides, contiguous memory
  • Frameworks: PyTorch, JAX
  • Architectures: GPT-1/2/3/4, Llama (RoPE, RMSNorm, GQA), Mixture-of-Experts
  • Multimodal: Images, Audio, Video, VQVAE, VQGAN, Diffusion models
