Learn how LLMs work.
Free visual guides to the concepts behind large language models, written by a working research lab. Each one explains a single idea in plain English, with the same hand-drawn diagrams as our flashcard deck.
- Fundamentals: What is a Transformer? A visual explanation
  The architecture behind every modern LLM: self-attention, feed-forward layers, residual connections, and why it replaced recurrent networks.
- Fundamentals: How attention works in LLMs
  Query, key, and value explained as a tiny search engine per word, plus attention scores and what multi-head attention adds.
- Building with LLMs: RAG vs fine-tuning: when to use which
  What each does, what they cost, and a clear framework for choosing between them, or using both together.
- Inference: The KV cache, explained
  Why generation needs it, how its memory grows with context length, and why it dominates the cost of long-context inference.
- Careers: How to prepare for an LLM interview
  The topics that actually get asked, how to revise them efficiently with spaced repetition, and a two-week study plan.
Prefer the whole set as cards?
These guides draw on the LLM Flashcards: 180 hand-drawn cards covering the full LLM stack, as a PDF and an Anki deck.
See the deck →