How AI works

The mechanism under the hood — tokens, embeddings, transformers, context windows, agents. For the curious operator who wants to understand, not just use.

Recommended reading order

If you are new to AI, read these in order. The first guide is the 30-minute starter path that links to every other piece. The rest go deeper on the questions everyone hits early: which model to use, what AI gets wrong, what is safe to share at work, and how to prompt.

01

How large language models work

The mental model that fixes most prompting confusion — prediction, training, inference, why hallucinations happen, why prompt phrasing matters so much. For the operator who wants to understand the mechanism.

10 min read

02

What is a token in AI? The unit that controls cost and output

Tokens are how models read, generate, and bill. The mental model, why output costs more than input, why your AI bill is bigger than expected, and the 7 levers to cut cost without breaking quality.

7 min read

03

Context windows explained: what they limit, what they do not

How much text a model can consider at once, what counts against the window, why quality degrades long before you hit the limit (the "lost in the middle" effect), and how to budget context in production.

8 min read

04

Embeddings explained: how AI represents meaning as numbers

The mechanism behind semantic search, RAG, classification, and recommendations. What embeddings are, what they capture, what they miss (negation, exact match, fine-grained logic), and how to choose a model.

8 min read

05

Transformers and attention: the architecture under every modern AI model

The intuition behind attention (without the math), why transformers scaled when RNNs did not, encoder vs decoder, why context length is bounded, and what comes after pure attention (Mamba, MoE, hybrids).

9 min read

06

Fine-tuning vs RAG vs prompting: which one fits your problem

The three ways to make a model behave better for your case — cost, persistence, updateability, when to use each, and when to mix them. With the decision matrix and the math for "is fine-tuning worth it."

9 min read

07

How AI agents work (and where they break)

The minimum that makes something an agent (LLM + tools + loop). What agents are good for, the six predictable failure modes, the autonomy spectrum, multi-agent vs single, and what to log in production.

10 min read

08

How to evaluate whether an LLM feature is working (without fooling yourself)

Why "looks good" is not evaluation. Building a small eval set (20 cases beats 200), the four grading methods (programmatic, reference, LLM-as-judge, human), what to measure, and how to spot production drift.

9 min read

When you are done here, go to

Prompt craft

The foundations covered here make that section come alive. It collects patterns for writing prompts that work, given what you now know about how the model behaves.