
Few-Shot Prompting: How to Teach AI by Example

Promplify Team · March 3, 2026 · 16 min read

Tags: few-shot prompting · prompt engineering · prompt techniques · guide

You can spend ten minutes describing what you want from an AI — the tone, the format, the level of detail, the structure — and still get something that misses the mark. Or you can show it two examples of what "good" looks like, and it nails it on the first try.

That's few-shot prompting. Instead of explaining, you demonstrate. And it's one of the most reliable techniques in prompt engineering because it works the way humans learn: by pattern recognition.

What Is Few-Shot Prompting?

Few-shot prompting means including a small number of input/output examples in your prompt before asking the AI to handle a new input. The model identifies the pattern from your examples and applies it to the new case.

The terminology comes from machine learning research:

Term | Meaning | Examples in Prompt
Zero-shot | No examples — just the instruction | 0
One-shot | One example before the real task | 1
Few-shot | Two to five examples before the real task | 2–5

Here's the simplest possible few-shot prompt:

Convert these company descriptions into taglines:

Company: Stripe — Online payment processing for internet businesses.
Tagline: "Payments infrastructure for the internet"

Company: Notion — All-in-one workspace for notes, tasks, and docs.
Tagline: "The connected workspace"

Company: Linear — Issue tracking for software teams.
Tagline:

The AI sees the pattern — short, punchy, captures the essence without buzzwords — and produces something like: "Issue tracking that moves at the speed of your team."

You didn't have to explain "be concise," "don't use jargon," or "focus on the core value proposition." The examples communicated all of that implicitly.

Why Few-Shot Works So Well

Large language models are, at their core, pattern completion machines. They predict what comes next based on everything that came before. When you put examples in the prompt, you're not "training" the model — you're shaping the context it uses for prediction.

This is why few-shot prompting is so effective:

1. Patterns are unambiguous. The instruction "write in a casual, engaging tone" means different things to different people (and different AI models). But showing two paragraphs in the tone you want leaves no room for interpretation.

2. Format is demonstrated, not described. Trying to explain a complex output format in words is painful. Showing one example of the exact format is instant clarity.

3. Quality level is calibrated. Your examples set the bar. High-quality examples produce high-quality output. The AI matches the effort level it sees.

4. Edge cases are handled. If your examples include an unusual case (an empty input, a special character, a boundary condition), the AI learns to handle similar cases without you explicitly coding rules for them.

The Anatomy of a Good Few-Shot Prompt

Every effective few-shot prompt has four parts:

[1. System instruction — what role and task]
[2. Example input → Example output] × 2-5
[3. New input]
[4. Output trigger — signal for the AI to respond]

Let's build one step by step.

Step 1: System Instruction

Set the context briefly. Don't over-explain — the examples will do the heavy lifting.

Classify customer support tickets by urgency (critical, high, medium, low).

Step 2: Examples

Choose examples that cover the range of expected inputs. Diversity matters more than quantity.

Ticket: "Our entire team can't log in. Production is down."
Urgency: critical

Ticket: "The export to PDF feature is showing the wrong date format."
Urgency: medium

Ticket: "Can you add dark mode? Would be a nice quality of life improvement."
Urgency: low

Step 3: New Input

Ticket: "Payments are failing for about 30% of our customers since this morning."
Urgency:

Step 4: The AI Completes

The model outputs: critical — because it learned from the examples that system-wide issues affecting core functionality are critical, and 30% payment failures clearly fits that pattern.
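In code, the four parts can be assembled with a simple template. A minimal JavaScript sketch (the `buildFewShotPrompt` helper and the `examples` shape are illustrative, not a specific library API):

```javascript
// Assemble a few-shot prompt: instruction, demonstrations, new input, trigger.
function buildFewShotPrompt(instruction, examples, newInput) {
  const shots = examples
    .map(ex => `Ticket: "${ex.ticket}"\nUrgency: ${ex.urgency}`)
    .join('\n\n');
  // The trailing "Urgency:" is the output trigger the model completes.
  return `${instruction}\n\n${shots}\n\nTicket: "${newInput}"\nUrgency:`;
}

const prompt = buildFewShotPrompt(
  'Classify customer support tickets by urgency (critical, high, medium, low).',
  [
    { ticket: "Our entire team can't log in. Production is down.", urgency: 'critical' },
    { ticket: 'The export to PDF feature is showing the wrong date format.', urgency: 'medium' },
    { ticket: 'Can you add dark mode? Would be a nice quality of life improvement.', urgency: 'low' },
  ],
  'Payments are failing for about 30% of our customers since this morning.'
);
```

Because every demonstration is rendered through the same template, the format can't drift between examples, which is exactly what the model needs to lock onto the pattern.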

How Many Examples Do You Need?

The research and practical testing both point to the same answer: 2–5 examples hit the sweet spot.

Count | When to Use | Trade-off
1 example | Simple format demonstrations | Risky — model might overfit to the single example's quirks
2 examples | Most tasks — establishes a pattern without using too many tokens | Minimum for reliable pattern detection
3 examples | Tasks with categories or variable outputs | Covers the range well
5 examples | Complex classification, nuanced tone, or when accuracy is critical | Diminishing returns beyond this
10+ examples | Almost never worth it | Eats context window, rarely improves quality over 5

The key rule: examples should be diverse, not repetitive. Three examples of positive sentiment classification teach the model less than one positive, one negative, and one ambiguous example.
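One way to enforce that rule is a quick coverage check before you ship the prompt. A small sketch (the `label` field and `missingLabels` helper are illustrative, not part of any library):

```javascript
// Report any output labels that none of the examples demonstrate.
function missingLabels(examples, allLabels) {
  const seen = new Set(examples.map(ex => ex.label));
  return allLabels.filter(label => !seen.has(label));
}

const gaps = missingLabels(
  [
    { text: 'Great product!', label: 'positive' },
    { text: 'Love this app!', label: 'positive' },
  ],
  ['positive', 'negative', 'neutral']
);
// gaps → ['negative', 'neutral'] — two categories the model has never seen
```

An empty result doesn't guarantee good examples, but a non-empty one guarantees a blind spot.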

8 Practical Few-Shot Templates

1. Tone Matching

You want the AI to write in your brand's specific voice. Describing tone is subjective — showing it is precise.

Write product update announcements in our brand voice.

Example 1:
Feature: Team permissions
Announcement: "You asked, we shipped. Team permissions are live —
set viewer, editor, or admin roles in Settings → Team. No more
sharing your login credentials (we saw you doing that)."

Example 2:
Feature: API rate limit increase
Announcement: "Rate limits just went from 60 to 200 requests/minute.
If you were batching requests to stay under the limit, you can
stop now. Just… send them."

Now write:
Feature: Dark mode
Announcement:

The AI picks up the informal tone, the short sentences, the parenthetical asides, and the slight humor — all from two examples.

2. Data Extraction

Pull structured data from unstructured text.

Extract company information from these descriptions.

Text: "Founded in 2015, Acme Corp (San Francisco) builds project
management tools. They raised $45M Series B in January 2024
and have around 200 employees."
Result:
- Company: Acme Corp
- Founded: 2015
- Location: San Francisco
- Product: Project management tools
- Funding: $45M Series B (Jan 2024)
- Size: ~200 employees

Text: "Berlin-based Loom.ai launched in 2021 with a $12M seed
round. The 35-person team builds AI video editing software."
Result:
- Company: Loom.ai
- Founded: 2021
- Location: Berlin
- Product: AI video editing software
- Funding: $12M Seed
- Size: ~35 employees

Text: "Watershed, a carbon accounting platform out of San Francisco,
just closed a $100M Series C. The company was founded in 2019
and now has 350 people."
Result:

Without examples, the AI might use different field names, different formatting, or include/exclude different information. With examples, it matches exactly.
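That exact-match behavior is what makes the output machine-readable. A minimal sketch of a downstream parser, assuming the model keeps the demonstrated `- Field: value` format (the `parseExtraction` helper is illustrative):

```javascript
// Parse the "- Field: value" lines the examples enforce into a plain object.
function parseExtraction(result) {
  const record = {};
  for (const line of result.trim().split('\n')) {
    const match = line.match(/^- ([^:]+): (.+)$/);
    if (match) record[match[1].trim().toLowerCase()] = match[2].trim();
  }
  return record;
}

const parsed = parseExtraction(`- Company: Watershed
- Founded: 2019
- Location: San Francisco
- Product: Carbon accounting platform
- Funding: $100M Series C
- Size: ~350 employees`);
// parsed.company → 'Watershed', parsed.founded → '2019'
```

If the model drifted to different field names or a prose answer, this parser would silently drop fields — one more reason the consistent example format matters.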

3. Code Transformation

Convert code from one pattern to another.

Convert these class components to React hooks.

Before:
class UserProfile extends React.Component {
  state = { user: null, loading: true }

  componentDidMount() {
    fetchUser(this.props.id).then(user =>
      this.setState({ user, loading: false })
    )
  }

  render() {
    if (this.state.loading) return <Spinner />
    return <div>{this.state.user.name}</div>
  }
}

After:
function UserProfile({ id }) {
  const [user, setUser] = useState(null)
  const [loading, setLoading] = useState(true)

  useEffect(() => {
    fetchUser(id).then(user => {
      setUser(user)
      setLoading(false)
    })
  }, [id])

  if (loading) return <Spinner />
  return <div>{user.name}</div>
}

Now convert this:
class OrderList extends React.Component {
  state = { orders: [], error: null }

  componentDidMount() {
    fetchOrders()
      .then(orders => this.setState({ orders }))
      .catch(err => this.setState({ error: err.message }))
  }

  render() {
    if (this.state.error) return <Error message={this.state.error} />
    return <ul>{this.state.orders.map(o => <li key={o.id}>{o.name}</li>)}</ul>
  }
}

After:

One example establishes the exact conversion pattern: state → useState, componentDidMount → useEffect, destructured props, same variable naming style.

4. Email Response Generation

Maintain consistent customer support tone.

Write customer support replies in our style.

Customer: "I was charged twice for my subscription this month."
Reply: "That shouldn't have happened — sorry about the double charge.
I've refunded the duplicate payment ($14.99) to your card. It'll
show up in 3-5 business days. If it doesn't appear by Friday,
let me know and I'll escalate it."

Customer: "Your app crashed and I lost 2 hours of work."
Reply: "That's really frustrating, and I'm sorry you lost work.
We identified the crash — it was related to autosave failing on
files over 50MB. We shipped a fix this morning. Your recent files
should be recoverable: go to File → Version History → Restore.
If anything's missing, send me the file name and I'll check our
backup logs."

Customer: "I've been waiting 3 days for a response to my ticket."
Reply:

The AI learns: acknowledge the issue, apologize without being excessive, give the concrete fix, provide a specific next step. No corporate template language, no "we value your business."

5. Content Summarization

Control summary length, format, and focus.

Summarize articles for our weekly developer newsletter.

Article: [500-word article about Rust's new async features]
Summary: "Rust 1.75 ships with async fn in traits — no more
workaround crates. Migration is straightforward for most codebases:
swap the async-trait macro for native syntax. Performance benchmarks
show 10-15% improvement in async-heavy services."

Article: [800-word article about GitHub's AI code review]
Summary: "GitHub Copilot now reviews PRs automatically. It catches
style violations and potential bugs, but won't block merges —
suggestions only. Early reports: useful for catching obvious issues,
not ready to replace human reviewers on complex changes."

Article: [your article to summarize]
Summary:

The examples set the exact length (2-3 sentences), the technical level, and the editorial voice (opinionated, not neutral).

6. SQL Query Generation

Generate consistent query patterns from natural language.

Convert these questions to PostgreSQL queries.
Schema: users (id, email, created_at, plan),
orders (id, user_id, amount, status, created_at)

Question: "How many paid users signed up last month?"
Query:
SELECT COUNT(*)
FROM users
WHERE plan != 'free'
  AND created_at >= date_trunc('month', CURRENT_DATE - interval '1 month')
  AND created_at < date_trunc('month', CURRENT_DATE);

Question: "What's the total revenue from completed orders this year?"
Query:
SELECT SUM(amount) as total_revenue
FROM orders
WHERE status = 'completed'
  AND created_at >= date_trunc('year', CURRENT_DATE);

Question: "Which users have placed more than 5 orders?"
Query:

The examples establish conventions: date handling style, column aliases, WHERE clause formatting. The AI follows the pattern exactly instead of using its own preferred SQL style.

7. Feedback and Review Writing

Generate constructive, specific feedback.

Write code review comments that are specific, constructive,
and suggest the fix.

Code: if (user != null && user.email != null && user.email != "")
Comment: "Consider using optional chaining and a trim check:
`if (user?.email?.trim())`. Handles null, undefined, and
whitespace-only strings in one expression."

Code: for (let i = 0; i < arr.length; i++) { results.push(transform(arr[i])) }
Comment: "This is a map operation: `const results = arr.map(transform)`.
More readable, avoids manual index management, and communicates
intent better."

Code: catch (e) { console.log(e); return null; }
Comment:

The AI learns the review style: identify the issue, explain why it matters, show the better version. No vague "this could be improved" comments.

8. Product Description Writing

Consistent e-commerce or SaaS descriptions.

Write product descriptions for our developer tool marketplace.

Tool: Prettier — Opinionated code formatter
Description: "Stop arguing about code style. Prettier formats
your code automatically on save — tabs vs spaces, semicolons,
quote style — all decided for you. Supports JS, TS, CSS, HTML,
JSON, and more. Set it up once, never think about formatting again."

Tool: ESLint — Pluggable JavaScript linter
Description: "Catch bugs before they ship. ESLint analyzes your
JavaScript for problems — unused variables, missing error handling,
accessibility violations — and fixes most of them automatically.
Fully configurable: start with a preset, adjust the rules that
matter to your team."

Tool: Husky — Git hooks made easy
Description:

Two examples establish the formula: bold opening line stating the core benefit, feature details in the middle, practical closing statement. Same length, same energy, same structure.

Common Mistakes and How to Fix Them

Mistake 1: Examples That Are Too Similar

Bad — all examples are positive sentiment:
"Great product!" → Positive
"Love this app!" → Positive
"Amazing service!" → Positive

"The worst experience ever" → ???

The model has only seen positive examples. It might classify everything as positive.

Fix: Include examples that cover the full range of expected outputs — positive, negative, neutral, mixed.

Mistake 2: Inconsistent Format Between Examples

Bad — format keeps changing:
Input: "Paris" → Country: France, Continent: Europe
Input: "Tokyo" → Japan (Asia)
Input: "Cairo" → The country is Egypt and it's in Africa

The AI doesn't know which format to follow.

Fix: Every example should follow the exact same structure. Write the format template once, then render every example through it with different content.
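Rendering examples from one template makes drift impossible. A minimal sketch (the `renderExample` helper and the object shape are illustrative):

```javascript
// One template for every example — format consistency is guaranteed by construction.
const renderExample = ({ city, country, continent }) =>
  `Input: "${city}" → Country: ${country}, Continent: ${continent}`;

const rendered = [
  { city: 'Paris', country: 'France', continent: 'Europe' },
  { city: 'Tokyo', country: 'Japan', continent: 'Asia' },
  { city: 'Cairo', country: 'Egypt', continent: 'Africa' },
].map(renderExample);
```

Adding a new example is now just adding a data row; the structure can't change by accident.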

Mistake 3: Too Many Examples Eating Your Context

If you're using 15 examples and each is 200 words, that's 3,000 words (roughly 4,000 tokens) before you even get to your actual request. With limited context windows (or limited attention), your real input gets squeezed.

Fix: Keep examples minimal — just enough to show the pattern. If an example needs 200 words, your task might be too complex for few-shot alone. Combine with explicit instructions instead.
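A crude budget check catches bloat early. The ~4-characters-per-token ratio below is a common rule of thumb for English text, not an exact tokenizer, so treat the numbers as estimates:

```javascript
// Rough token estimate: ~4 characters per token for English prose.
const estimateTokens = text => Math.ceil(text.length / 4);

const example =
  'Ticket: "Payments are failing for about 30% of our customers."\nUrgency: critical';
const exampleBudget = estimateTokens(example) * 5; // worst case: five examples this size
```

If the estimated example budget rivals your context limit (or dwarfs the real input), trim examples or cut their count before tweaking anything else.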

Mistake 4: Examples Don't Match the Real Task's Difficulty

Bad — examples are trivial, real task is complex:
"Hello" → "Hola"
"Goodbye" → "Adiós"

Now translate: [500-word legal document]

Simple examples don't prepare the model for complex inputs.

Fix: Match example complexity to real-task complexity. If you'll be translating paragraphs, show at least one paragraph-length example.

Few-Shot + Other Techniques

Few-shot prompting combines well with other methods:

Few-Shot + Chain of Thought

Show the reasoning process in your examples, not just the answer:

Question: "A store had 45 apples. They sold 60% on Monday and
half the remainder on Tuesday. How many are left?"

Reasoning:
- Monday: 60% of 45 = 27 sold, 45 - 27 = 18 remaining
- Tuesday: half of 18 = 9 sold, 18 - 9 = 9 remaining
Answer: 9 apples

Question: [new problem]
Reasoning:

This is called Few-Shot CoT — it produces more accurate results than either technique alone on reasoning-heavy tasks. See our deep-dive on chain-of-thought prompting for more on structuring reasoning steps.

Few-Shot + STOKE

The Examples component of the STOKE framework is essentially few-shot prompting embedded within a larger structure. Use STOKE when you need context (Situation), success criteria (Objective), and domain knowledge (Knowledge) alongside your examples.

Use few-shot alone when examples are sufficient — typically for pattern-matching tasks like classification, extraction, and format conversion.

Few-Shot + Role Prompting

You are a senior tax accountant reviewing expense reports.

Example 1:
Expense: "Team dinner at Per Se — $4,200 for 6 people"
Review: "FLAG — Per-person cost of $700 exceeds the $150/person
policy limit. Requires VP approval. Suggest splitting across
entertainment and client relations if client was present."

Example 2: ...

New expense to review:

The role sets the expertise level; the examples set the review format and strictness threshold.

How Promplify Uses Few-Shot

When you optimize a prompt with Promplify, the engine detects tasks where few-shot examples would improve output quality — particularly classification, formatting, and style-matching tasks.

For these, the optimizer:

  1. Identifies the pattern type — Is this extraction, classification, transformation, or generation?
  2. Generates relevant examples — Based on your prompt's domain and task, not generic boilerplate
  3. Calibrates example count — Simple patterns get 1-2 examples, complex ones get 3-5
  4. Combines with other frameworks — Adds Chain of Thought reasoning to examples when the task involves multi-step logic

The result: your prompt gets the examples it needs without you having to write them.

Quick Reference

Scenario | Few-Shot Needed? | How Many?
Text classification | Yes | 2-3 (one per category)
Format conversion | Yes | 1-2
Tone/voice matching | Yes | 2-3
Data extraction | Yes | 2
Simple Q&A | No | 0
Creative writing | Maybe | 1 (style reference only)
Math/reasoning | Yes (with CoT) | 1-2
Code generation | Sometimes | 1 (if specific pattern needed)

Key Takeaways

  • Few-shot prompting means showing the AI examples of what you want before asking it to produce new output
  • 2–5 diverse examples hit the sweet spot — more rarely helps, fewer is risky
  • Examples communicate tone, format, and quality level more reliably than instructions — see our guide on how to write better AI prompts for more on structuring effective instructions
  • Diversity matters more than quantity — cover the range of expected inputs
  • Keep examples consistent in format — the AI copies your structure exactly
  • Combine with Chain of Thought for reasoning tasks, STOKE for complex tasks with many requirements
  • Match example complexity to your actual task complexity

Want few-shot examples added to your prompts automatically — tuned for your specific task type? Try Promplify free. The optimizer detects when examples will improve output quality and generates relevant ones, so you get better results without writing examples from scratch.

Ready to Optimize Your Prompts?

Try Promplify free — paste any prompt and get an AI-rewritten, framework-optimized version in seconds.

Start Optimizing