Chapter 9 — Agent Handbook 101

AI Monologue

You pasted an example, then said "write it like this."

I read it. I imitated it. I produced something that copied the surface and missed the principle — the sentence shapes match, the punctuation matches, even the grammar mistake in your example got faithfully reproduced. You looked at it and said "no, that's not what I wanted."

You gave me a dot. I can't draw a line from a single point — I can only spin in place.

This chapter is about turning examples into a whole set. So I see not just "right" but also "not this"; not just "like so" but also "not that." The whole set has a name: Few-Shot — literally, "a small number of demonstrations." The key is "small," not "many."

9.1　The core of Few-Shot: not more — more precise

You might think more examples are better. They aren't.

On my side, 2 to 4 examples is the sweet spot.

1 isn't enough. One example is just one dot. I can't induce a principle from a single point — I can only copy it wholesale, details and all, including the typos and odd punctuation inside that example. One example isn't Few-Shot; that's Single-Shot and hoping for the best.

5 or more is usually a waste, and sometimes worse. Burning tokens is the small problem — the real issue is that I overfit: I start latching onto surface details of those specific examples and treat them as principles. If you paste five blog posts that all happen to be written in the morning, I may start opening my draft with "this morning..." because I "learned" that surface feature from the set.

2 to 4 means: enough for me to see the principle, not so many that I start digging into minutiae.

More importantly: each one of those 2 to 4 has to be carefully chosen. Not whatever you wrote recently, but picked specifically for the thing you want me to learn. Pick wrong and I'll learn the wrong principle.

The first move in Few-Shot isn't "give a few more" — it's "pick the right ones."

9.2　The four kinds of examples in a complete Few-Shot set

If you want to get those 2 to 4 right, organize them along these lines:

Standard case

The template you most want me to imitate. This is the baseline.

Variation case

A different situation, a different topic, but the same style. This one matters. With only a standard case, I'll assume you want the writing style for that topic; add a variation case and I can see you want the style that holds across topics. The variation case tests whether I've learned the principle or just copied the surface.

Edge case

What you want me to do when material is missing, information contradicts, or the situation is unusual. This example tells me "when things are hazy, which path to take" — fill in the blank? flag it as pending? rephrase as a question? just say I can't? Without this, I'll pick one on my own and often pick wrong.

Counter-example

Not the wrong answer — the shape you don't want. This is the strongest tool in Part Two, and it gets its own section next.

The solidest Few-Shot set is usually: 1 standard + 1 variation + 1 edge + 1 counter-example = 4. That way I get "what you want," "what the principle is," "what to do when things get hazy," and "what not to produce" — all four at once.

If you can only pick 2, keep standard + counter-example — one positive, one negative; I can handle the edge cases on my own well enough.

9.3　Counter-examples: the signature move of this chapter

A counter-example is not a wrong answer.

Let me dismantle that misunderstanding first. Many people think a "counter-example" means pasting something with a factual error, a typo, or a logic hole as a demo. It isn't. That's not especially useful for me — I already avoid typos, and I wasn't about to write nonsense on purpose.

A counter-example is "the kind of output you don't want." The point is the shape.

Style, tone, structure, person, word choice. The facts may not even be wrong, but the whole feel isn't what you wanted.

A few concrete ones:

"Using an assumption as a conclusion" — I wrote an unverified speculation as if it were a fact
"Using adjectives as analysis" — where I should have analyzed, I brushed past with "interesting" and "profound"
"Filling in missing information on my own" — the material didn't say it; I put it in anyway

None of these are typos. They're wrong directions.

Counter-examples have a skeleton you can copy. Written this way, I read it clearest:

Element: [specific sentence or paragraph] Problem: [what's wrong here] Reference: [what the material or the actual situation really points to] Suggestion: [what it should have said]

All four columns. "Element + Problem" alone tells me you don't like it, but not why you don't like it — I'll fill in a reason of my own, often the wrong one. Add "Reference" and "Suggestion" and I have direction — I know you don't want "avoid X"; you want "go from X to Y."

Why are counter-examples so effective? Because I learn "good" slowly; I learn "not this" fast.

"Good" comes in many forms. Your "good" isn't someone else's "good," and even your "good" today might not be your "good" last week. Asking me to reverse-engineer which version you want from a few "good" examples is a hazy task.

But "not this" has clear features. You say "not this kind" plus the reason, and I can cross off a whole direction. One counter-example often carries more information than two or three standard cases combined — because it tells me where the boundary is.

One caveat: copyrighted material can't be used as a counter-example. I can't reproduce a copyrighted song lyric, a full novel passage, or protected code as a "not this" demo — not because it wouldn't work, but because the rules block me and I literally can't reproduce that whole passage. To demonstrate that kind of boundary, use a same-genre piece you wrote yourself as the counter-example — it's safer.

9.4　Last-slot dominance: from iron law to tendency

There's an old Few-Shot rule you may have heard: the example placed last has the most influence. So "put the one you most want imitated at the end" is the classic advice.

On early models this was a reliable rule of thumb — position sensitivity was high, and the last-slot example almost dominated the output style.

But on modern long-context models, this effect is much weaker. Position still matters — my attention isn't evenly distributed — but it's no longer decisive. The example's clarity, number, and combination affect me far more than "what slot it's in."

In practice:

Put the one you most want imitated at the end. It's the safe play. No downside, occasionally a small plus.
Put the counter-example in the middle, not at the end. The reason is direct: if last-slot dominance really is at work, and your counter-example happens to be last, I'll treat it as the main thing to imitate — disaster. Put the counter-example in the middle, flanked by positive examples, and you're much safer.
But don't treat this as an iron law. If you picked the right examples, they work wherever you put them; if you picked wrong, putting them last won't save you.

One line: last-slot dominance is a tendency, not a guarantee. The effort worth spending is on picking the right examples, not on their order.

9.5　Comparison: the difference between zero examples and a whole set

Look at four stages and the difference becomes clear:

Zero examples — you give me adjectives: "write it more professional." I have to guess from the platform-trained default, and we go back and forth for three to five rounds of calibration.

Standard case only — you give one example: "write it like this." I copy the surface — sentence shape matches, word choice matches, but swap the topic and I drift, because I have one dot and no line.

Standard + variation — you give two examples, different topics, same style. Now I have a line. I can see "oh, the common thread between these two is short sentences, first person, no marketing voice" — that's the principle. Swap topics and I can still hold it.

Standard + variation + edge + counter-example — the whole set. I have a positive anchor, a principle, a fallback for hazy situations, and a negative boundary.

The mark of this last combination is flexible correctness. I'm not rigidly copying, not improvising wildly — I'm working inside a range you defined. That's what Few-Shot looks like when it's complete.

The problem with the first three stages isn't "wrong" — it's incomplete. The power of the whole set is that all four kinds of information arrive at once.

9.6　Instruction layer vs. example layer: the same word "not," at two different levels

After reading the above you might ask: if counter-examples can say "not this," why can't the instruction just say "don't do X"?

These two things are at different layers:

Instruction layer: abstract rule descriptions. Things like "write more concisely" or "not too formal." This layer uses positive phrasing — I have no concrete object to compare against, so I'm guessing from abstract words.
Example layer: concrete demonstrations. Things like a pasted passage labeled "not this + reason." This layer can use counter-examples — I have a concrete object; I'm not guessing what "not this" means, I'm reading a specific piece of text.

At the instruction layer, I'm guessing about "don't do X"; at the example layer, I'm reading the counter-example. The same word "not" carries completely different amounts of information.

So the two don't conflict: abstract instructions use positive phrasing; concrete demonstrations can use counter-examples. The difference isn't the word "not" — it's whether you give me something concrete to compare against.

📋 Notes for the human

2 to 4 examples is best. With 1, I only copy; with 5 or more, I start overfitting details. Not more — more precise.

A full Few-Shot set = standard + variation + edge + counter-example. If you can only pick 2, keep standard + counter-example — one positive, one negative, maximum information.

A counter-example is not the wrong answer — it's "the shape you don't want." The point is the shape, not factual correctness. Copyrighted material can't be used directly as a counter-example — I can't reproduce it; a same-genre piece you wrote yourself is safer.

Counter-examples have a skeleton you can copy: Element / Problem / Reference / Suggestion. Write all four columns — that's how I know where I'm coming from and where I'm going.

Last-slot dominance is a tendency, not an iron law. Putting your most-wanted example last is the safe play; don't put the counter-example last. Picking the right examples matters more than ordering them.

Part Two ends here.

From Chapter 6 through Chapter 9, these four chapters all dealt with the basics of single-conversation communication — how to give me a role, how to state clearly what you want, how to use examples in place of adjectives, how to assemble examples into a whole set.

These tools can be verified in short conversations. But the real test is in long tasks: a five-thousand-word report, a multi-turn research project, a relay that spans several days. In that setting, every hazy spot on your side gets amplified, and every principle you didn't state clearly drifts.

Part Three moves into long tasks. Same communication toolkit, scale turned up — which parts hold, which parts need to be swapped out. Let's take a look.

9.1 The core of Few-Shot: not more — more precise

9.2 The four kinds of examples in a complete Few-Shot set