Training & adaptation

LoRA (Low-Rank Adaptation)

Also known as: low-rank adaptation, LoRA adapter. (Parameter-efficient fine-tuning, or PEFT, is the broader category of methods that LoRA belongs to.)

A parameter-efficient fine-tuning method that freezes the base model and trains small low-rank adapter matrices alongside it, cutting fine-tuning cost by 100x or more.

What it means

LoRA's insight is that the weight changes produced by fine-tuning are usually low-rank, meaning they can be approximated by the product of two much smaller matrices. So instead of updating a 70-billion-parameter base model, you freeze it and train a pair of thin adapter matrices for each targeted weight matrix (typically the attention projections) that nudge the model's behavior. Those adapters usually amount to well under 1% of the base model's parameters.

The savings are enormous. You can fine-tune a 70B model on a single 80GB GPU instead of a multi-node cluster. The resulting LoRA adapter is often under 100MB, so you can store hundreds of task-specific adapters and hot-swap them at inference time over the same base model. Companies that need many specialized variants (one per customer, one per language, one per domain) basically can't operate without LoRA.

LoRA works well enough that "full fine-tuning" is now a niche choice, mostly reserved for continued pre-training or for deliberately reshaping the model's core behavior. For everything else, such as instruction tuning, domain adaptation, style transfer, and character personas, LoRA or a close variant (QLoRA, DoRA) is the default.

Note: this is the model-tuning LoRA. The image-generation community uses "LoRA" for the same technique applied to Stable Diffusion or Flux to teach a style or character; same underlying math, different ecosystem.
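
To make the low-rank idea concrete, here is a minimal PyTorch sketch of a LoRA-wrapped linear layer. The class name, rank, and scaling values are illustrative assumptions, not any particular library's API; real implementations follow the same shape.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer and adds a trainable low-rank update B @ A."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the original weights
        out_f, in_f = base.out_features, base.in_features
        # Two thin matrices: A projects down to rank r, B projects back up.
        self.lora_A = nn.Parameter(torch.randn(rank, in_f) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_f, rank))  # zero init: starts as a no-op
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen base output plus the scaled low-rank correction.
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(nn.Linear(4096, 4096), rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable:,} of {total:,} ({100 * trainable / total:.2f}%)")
# Rank 8 on a 4096x4096 layer: ~65K trainable params against ~16.8M frozen (~0.39%).
```

The adapter can also be merged into the base weight after training (W + scale * B @ A), so serving adds no latency when only one adapter is needed.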

Example

An e-commerce platform trains 200 LoRA adapters, one per merchant, on top of a single 70B-parameter Llama base model. Each adapter teaches the model that merchant's product catalog and tone. At inference, they load the right adapter on the fly with no full model copy needed.
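
In code, this multi-adapter pattern is often implemented with the Hugging Face peft library; a rough sketch is below. The base model id, adapter directories, and merchant names are placeholders, and the exact loading flow will vary by serving stack.

```python
# Rough sketch of multi-adapter serving with transformers + peft.
# "base-model-id" and the adapter paths/names are placeholders, not real artifacts.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("base-model-id")  # shared, frozen base weights

# Load one adapter to create the PeftModel, then attach more by name.
model = PeftModel.from_pretrained(base, "adapters/merchant_001", adapter_name="merchant_001")
model.load_adapter("adapters/merchant_002", adapter_name="merchant_002")

# Per request: activate the adapter for the merchant making the call.
model.set_adapter("merchant_002")
```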

Why it matters

LoRA is why fine-tuning went from a frontier-lab activity to something a startup can do on a weekend. It collapsed the cost of model customization by two orders of magnitude and made multi-tenant model serving practical. Anyone shipping a custom-trained model in 2026 is almost certainly shipping a LoRA.
