Few-shot Learning
Teaching by example
Putting 2–5 examples (input → output pairs) into the prompt to teach the model your pattern — an 'in-context learning' technique that needs no training.
Two ways to prepare an LLM for a task: fine-tune (expensive, slow) or few-shot by stuffing a few examples in the prompt (instant, free). Modern LLMs are surprisingly good at few-shot — this is called in-context learning.
"Zero-shot" = no examples, just instruction. "One-shot" = 1 example. "Few-shot" = 2–5 examples. 3–5 is usually the sweet spot; more rarely helps.
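The three regimes differ only in how many worked examples precede the query. A minimal sketch — the sentiment task, instruction, and examples below are made up for illustration, and the actual model call is omitted:

```python
# Zero-, one-, and few-shot are the same prompt with 0, 1, or N worked examples.
instruction = "Label the review as positive or negative."
examples = [
    ("Great battery life, love it.", "positive"),
    ("Broke after two days.", "negative"),
    ("Fast shipping, works as advertised.", "positive"),
]
query = "The screen is dim and the speakers crackle."

def make_prompt(n_examples: int) -> str:
    """n_examples = 0 -> zero-shot, 1 -> one-shot, 2+ -> few-shot."""
    shots = [f"{text} → {label}" for text, label in examples[:n_examples]]
    return "\n".join([instruction, *shots, f"{query} →"])

zero_shot = make_prompt(0)   # instruction + query only
few_shot = make_prompt(3)    # instruction + 3 worked examples + query
```

The only difference between the regimes is the slice length; everything else in the prompt stays identical.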
Example format matters: keep examples consistent, short, and clearly demonstrating the target pattern. Even the order of examples can shift the output.
A new intern: you say "write reports in this format" and hand them 3 example reports. They produce the 4th correctly. No training, no class needed. Same with LLMs: show 3 examples, and the 4th comes out right.
Task: convert product titles to SEO titles.
Prompt:
```
iPhone 16 Pro Max 256GB Black → Apple iPhone 16 Pro Max 256GB Black Smartphone
Sony WH-1000XM5 Wireless Headphones → Sony WH-1000XM5 Wireless Noise-Cancelling Headphones
Logitech MX Master 3S Mouse → Logitech MX Master 3S Wireless Ergonomic Mouse

Samsung Galaxy Watch 7 → ?
```
Model: "Samsung Galaxy Watch 7 Smartwatch Health Tracker" — pattern captured. A one-line instruction ("write SEO titles") wouldn't have done it; the examples taught the "brand, model, descriptor" format.
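One way to assemble such a prompt programmatically, using the SEO-title pairs from the example above — only the string assembly is shown; sending it to a model is left to whatever client library you use:

```python
# Build a few-shot prompt from (input, output) pairs plus a new query.
examples = [
    ("iPhone 16 Pro Max 256GB Black",
     "Apple iPhone 16 Pro Max 256GB Black Smartphone"),
    ("Sony WH-1000XM5 Wireless Headphones",
     "Sony WH-1000XM5 Wireless Noise-Cancelling Headphones"),
    ("Logitech MX Master 3S Mouse",
     "Logitech MX Master 3S Wireless Ergonomic Mouse"),
]

def few_shot_prompt(pairs, query):
    """One 'input → output' line per example, then the query with a dangling arrow."""
    lines = [f"{src} → {dst}" for src, dst in pairs]
    lines.append(f"{query} →")
    return "\n".join(lines)

prompt = few_shot_prompt(examples, "Samsung Galaxy Watch 7")
```

The dangling `→` at the end is the whole trick: it invites the model to complete the pattern rather than chat about it.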
When to use it
- The format is critical and instruction alone isn't enough
- Classification tasks — labeled examples nail the pattern fast
- Style or tone transfer (formal → casual, long → short)
- Quick prototyping with little data — before reaching for fine-tuning
When to skip it
- If the task is genuinely simple — try zero-shot first
- If you need hundreds of examples to teach the pattern — that's fine-tuning territory
- If your context window is already tight — examples crowd out the user content
Example selection bias
If your 3 spam examples are all in English, the model may label Turkish spam as 'normal'. Diversify your examples and include edge cases.
Order matters
Some models weight the last example more heavily. Put the critical pattern last, or test different orderings.
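A cheap way to test for order sensitivity: run the same query against every permutation of the examples and compare outputs. A sketch — `call_model` is a hypothetical stand-in for your LLM client, so it is left commented out:

```python
import itertools

examples = [
    ("WIN A FREE PRIZE!!!", "spam"),
    ("Meeting moved to 3pm", "normal"),
    ("Your acc0unt is l0cked, click here", "spam"),
]

def build(pairs, query):
    return "\n".join([f"{x} → {y}" for x, y in pairs] + [f"{query} →"])

# For 3 examples there are 3! = 6 orderings to try.
prompts = [build(list(p), "new message") for p in itertools.permutations(examples)]

# outputs = [call_model(p) for p in prompts]   # hypothetical client call
# if len(set(outputs)) > 1:                    # different orders, different answers:
#     ...                                      # the model is order-sensitive here
```

If all orderings agree, ordering is a non-issue for this task; if not, pin the critical example last and re-test.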
Token budget
5 long examples can easily run to ~2,000 tokens; that adds cost and squeezes the context window. Pick short, dense examples.