Few-shot Learning
Teaching by example
Putting 2–5 examples (input → output pairs) into the prompt to teach the model your pattern — an 'in-context learning' technique that needs no training.
Two ways to prepare an LLM for a task: fine-tune (expensive, slow) or few-shot by stuffing a few examples in the prompt (instant, free). Modern LLMs are surprisingly good at few-shot — this is called in-context learning.
"Zero-shot" = no examples, just instruction. "One-shot" = 1 example. "Few-shot" = 2–5 examples. 3–5 is usually the sweet spot; more rarely helps.
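The three regimes differ only in how many worked examples precede the query. A minimal sketch — the sentiment task, instruction, and examples below are made up for illustration, and the actual model call is omitted:

```python
# Zero-, one-, and few-shot are the same prompt with 0, 1, or N worked examples.
instruction = "Label the review as positive or negative."
examples = [
    ("Great battery life, love it.", "positive"),
    ("Broke after two days.", "negative"),
    ("Fast shipping, works as advertised.", "positive"),
]
query = "The screen is dim and the speakers crackle."

def make_prompt(n_examples: int) -> str:
    """n_examples = 0 -> zero-shot, 1 -> one-shot, 2+ -> few-shot."""
    shots = [f"{text} → {label}" for text, label in examples[:n_examples]]
    return "\n".join([instruction, *shots, f"{query} →"])

zero_shot = make_prompt(0)   # instruction + query only
few_shot = make_prompt(3)    # instruction + 3 worked examples + query
```

The only difference between the regimes is the slice length; everything else in the prompt stays identical.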
Example format matters: keep examples consistent, short, and clearly demonstrating the target pattern. Even the order of examples can shift the output.
A new intern: you say "write reports in this format" and hand them 3 example reports. They produce the 4th correctly. No training, no class needed. Same with LLMs: show 3 examples, and the 4th comes out right.
Task: convert product titles to SEO titles.
Prompt:
```
iPhone 16 Pro Max 256GB Black → Apple iPhone 16 Pro Max 256GB Black Smartphone
Sony WH-1000XM5 Wireless Headphones → Sony WH-1000XM5 Wireless Noise-Cancelling Headphones
Logitech MX Master 3S Mouse → Logitech MX Master 3S Wireless Ergonomic Mouse

Samsung Galaxy Watch 7 → ?
```
Model: "Samsung Galaxy Watch 7 Smartwatch Health Tracker" — pattern captured. A one-line instruction ("write SEO titles") wouldn't have done it; the examples taught the "brand, model, descriptor" format.
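One way to assemble such a prompt programmatically, using the SEO-title pairs from the example above — only the string assembly is shown; sending it to a model is left to whatever client library you use:

```python
# Build a few-shot prompt from (input, output) pairs plus a new query.
examples = [
    ("iPhone 16 Pro Max 256GB Black",
     "Apple iPhone 16 Pro Max 256GB Black Smartphone"),
    ("Sony WH-1000XM5 Wireless Headphones",
     "Sony WH-1000XM5 Wireless Noise-Cancelling Headphones"),
    ("Logitech MX Master 3S Mouse",
     "Logitech MX Master 3S Wireless Ergonomic Mouse"),
]

def few_shot_prompt(pairs, query):
    """One 'input → output' line per example, then the query with a dangling arrow."""
    lines = [f"{src} → {dst}" for src, dst in pairs]
    lines.append(f"{query} →")
    return "\n".join(lines)

prompt = few_shot_prompt(examples, "Samsung Galaxy Watch 7")
```

The dangling `→` at the end is the whole trick: it invites the model to complete the pattern rather than chat about it.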
When to use it
- The format is critical and instruction alone isn't enough
- Classification tasks — labeled examples nail the pattern fast
- Style or tone transfer (formal → casual, long → short)
- Quick prototyping with little data — before reaching for fine-tuning
When to skip it
- If the task is genuinely simple — try zero-shot first
- If you need hundreds of examples to teach the pattern — that's fine-tuning territory
- If your context window is already tight — examples crowd out the user content
Example selection bias
If your 3 spam examples are all in English, the model may label Turkish spam as 'normal'. Diversify your examples and include edge cases.
Order matters
Some models weight the last example more heavily. Put the critical pattern last, or test different orderings.
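A cheap way to test for order sensitivity: run the same query against every permutation of the examples and compare outputs. A sketch — `call_model` is a hypothetical stand-in for your LLM client, so it is left commented out:

```python
import itertools

examples = [
    ("WIN A FREE PRIZE!!!", "spam"),
    ("Meeting moved to 3pm", "normal"),
    ("Your acc0unt is l0cked, click here", "spam"),
]

def build(pairs, query):
    return "\n".join([f"{x} → {y}" for x, y in pairs] + [f"{query} →"])

# For 3 examples there are 3! = 6 orderings to try.
prompts = [build(list(p), "new message") for p in itertools.permutations(examples)]

# outputs = [call_model(p) for p in prompts]   # hypothetical client call
# if len(set(outputs)) > 1:                    # different orders, different answers:
#     ...                                      # the model is order-sensitive here
```

If all orderings agree, ordering is a non-issue for this task; if not, pin the critical example last and re-test.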
Token budget
5 long examples can easily run to ~2,000 tokens; that adds cost and squeezes the context window. Pick short, dense examples.