How AI works

The mechanism under the hood — tokens, embeddings, transformers, context windows, agents. For the curious operator who wants to understand, not just use.

How large language models work

The mental model that fixes most prompting confusion — prediction, training, inference, why hallucinations happen, why prompt phrasing matters so much. For the operator who wants to understand the mechanism.

10 min read

What is a token in AI? The unit that controls cost and output

Tokens are how models read, generate, and bill. The mental model, why output costs more than input, why your AI bill is bigger than expected, and the 7 levers to cut cost without breaking quality.

7 min read

Context windows explained: what they limit, what they do not

How much text a model can consider at once, what counts against the window, why quality degrades long before you hit the limit (the "lost in the middle" effect), and how to budget context in production.

8 min read

Embeddings explained: how AI represents meaning as numbers

The mechanism behind semantic search, RAG, classification, and recommendations. What embeddings are, what they capture, what they miss (negation, exact match, fine-grained logic), and how to choose a model.

8 min read

Transformers and attention: the architecture under every modern AI model

The intuition behind attention (without the math), why transformers scaled when RNNs did not, encoder vs decoder, why context length is bounded, and what comes after pure attention (Mamba, MoE, hybrids).

9 min read

Fine-tuning vs RAG vs prompting: which one fits your problem

The three ways to make a model behave better for your case — cost, persistence, updateability, when to use each, and when to mix them. With the decision matrix and the math for "is fine-tuning worth it."

9 min read

How AI agents work (and where they break)

The minimum that makes something an agent (LLM + tools + loop). What agents are good for, the six predictable failure modes, the autonomy spectrum, multi-agent vs single-agent, and what to log in production.

10 min read

How to evaluate whether an LLM feature is working (without fooling yourself)

Why "looks good" is not evaluation. Building a small eval set (20 cases beats 200), the four grading methods (programmatic, reference, LLM-as-judge, human), what to measure, and how to spot production drift.

9 min read

When you are done here, go to

Use AI for work

Apply what you learned about how models work to specific tasks: email, meetings, research, weekly reports.