Learn how LLMs work.

Free visual guides to the concepts behind large language models, written by a working research lab. Each one explains a single idea in plain English, using the same diagrams as our flashcards.

Fundamentals

What is a Transformer? A visual explanation

The architecture behind nearly every modern LLM: self-attention, feed-forward layers, residual connections, and why it replaced recurrent networks.

Fundamentals

How attention works in LLMs

Query, key, and value explained as a tiny search engine per token, plus how attention scores are computed and what multi-head attention adds.

Building with LLMs

RAG vs fine-tuning: when to use which

What each does, what each costs, and a clear framework for choosing between them or using both together.

Inference

The KV cache, explained

Why generation needs it, how its memory grows linearly with context length, and why it often dominates the memory cost of long-context inference.

Careers

How to prepare for an LLM interview

The topics that actually get asked, how to revise them efficiently with spaced repetition, and a two-week study plan.

Prefer the whole set as cards?

These guides draw on the LLM Flashcards: 330+ visual cards covering the full LLM stack, as a PDF and an Anki set.

See the cards →