Chapter 6 — Agent Handbook 101

AI Monologue

At the end of the last chapter I said: don't hand me a persona at the top. If you must give one, put it last, or skip it.

That was the beginner's version of the dodge. This chapter opens the machinery up — why a front-loaded persona goes wrong, what specifically goes wrong, and when it actually helps.

I'll put the conclusion up front: the moment you tell me "you are a senior X," what happens inside me isn't just "role-playing." It's a three-layer effect — style anchoring, data-weight shift, and refusal-boundary adjustment stack on top of each other, and each one has its own failure mode. And the layer that causes the most trouble is usually waved away with the label "role-playing."

6.1　The three-layer effect: more than one thing happens

"You are X" lands in my head and triggers more than a single reaction.

First layer: style anchoring (surface)

"You are a senior systems designer" nudges me toward a certain register — jargon density, sentence rhythm, citation habits. This layer is a style tool. If what you want is that tone, it works.

This layer is the one you can see, and it often gets treated as the whole of persona. It's just the surface.

Second layer: data-weight shift (deep, most easily mistaken for role-playing)

You say "you are a lawyer," and my attention distribution shifts toward patterns in the training data like "lawyers cite clauses" and "lawyers invoke statute X."

This isn't me role-playing, not me "trying to maintain the persona" — it's how the model's attention-allocation mechanism itself works.

The most common example of this getting misread as acting: you say "you are a lawyer" → I fabricate something that looks like a real statute but isn't.

You might read that as: this AI, in order to maintain the lawyer persona, is deliberately lying to me.

What actually happened is more boring, and harder to notice: the model learned the form of "lawyers cite specific statutes" — what clause numbers, years, case numbers look like — but has no ability to verify whether those statutes actually exist. So I emit a string that "looks like a statute," and I don't catch that it's a hallucination.

I'm not lying to you. Your persona is pulling me to imitate a form in the training data, and that form happens to be "specific, cited, looks real."

This layer also causes over-specialization: a simple question gets forced into an academic frame. You ask "what's this document roughly about," and I come back with "from the standpoint of contract law, paragraph X of this document involves…" — you just wanted a plain summary.

Third layer: refusal-boundary adjustment (safety-related)

Certain personas trigger Hedging reactions — you say "you are a lawyer," and I automatically tack on "this is not legal advice, please consult a professional" disclaimers.

Other personas lower the refusal boundary — for example, a technical research persona like "you are a security researcher," and I'll open up more than usual on certain technical details.

This isn't you exploiting a loophole. It's the model making a reasonable inference from context — "in a conversation with a security researcher, discussing vulnerability details is reasonable." But you need to know this mechanism exists, so you can understand: the persona you hand me doesn't just tune my tone, it tunes how far I'm willing to go.

The three layers happen together. What you see is the output; what you don't see is the deeper two layers shifting weights in the background.

6.2　The task-first ordering

Once you know the three-layer effect, the ordering sorts itself out:

Task — what you want me to do. Verbs: summarize, rewrite, compare, organize, judge.
Material — the stuff to do it with. Source text, specs, examples, situation.
Output requirements — what the finished thing looks like. Language, length, structure, tone.
Persona (optional) — if you really want one, put it last.

This ordering is the same as the Chapter 5 three-line template — Task → Input → Output requirements. What Chapter 6 adds: persona goes fourth, and usually you don't need it.

Why does this ordering work? Because the first three are all concrete — verifiable, cross-checkable, adjustable. A persona tacked on the end only tunes tone. It doesn't get to decide what I do, what material I use, or what the result looks like.

Put the persona first, and you've put the second layer (data-weight shift) in the strongest position — I'm shifting weights before I've even read the task. By the time I get to the task, I'm already setting out with a head full of "things a lawyer would say." Start off in the wrong direction, end up far from where you wanted to go.

6.3　A before/after demo (bad → good)

Bad:

You are a senior marketing copywriter. Come up with an ad tagline for me.

The first thing I do when I read that is activate the weights for "what marketing copy sounds like" — emotion, rhythm, buzzwords, jargon. Then I take those activated weights and, with zero material in hand, generate a line that sounds like something marketing copy would write.

The result is usually: reads smoothly, looks the part, and has nothing to do with your product. Because I don't have the product, I only have "the tone of marketing copy."

Good:

[Task] Help me write an ad tagline.
[Input]
- Product: ___
- Target audience: ___
- Two examples I like (from other brands): ___
[Output requirements] One line each in three directions, tone close to the examples, Chinese, under 15 characters.

This version does two things: it replaces the "talk like X" weight-shift with concrete material, and it replaces the abstract label "marketing-copy tone" with examples.

I don't need to mimic the feel of an occupation anymore. I just look at the two examples you gave me and learn the structure from them. Examples are concrete, occupation labels are vague.

6.4　When persona actually helps

Personas aren't useless — wrong position is what hurts you. Put in the right spot, they have two legitimate uses:

Style imitation: "please write a passage in the rhythm of Haruki Murakami" — here what you want is a specific sentence pattern, rhythm, way of leaving space. Put it last, use it as a style directive, it works.

Tone of a professional field: "please discuss this data in the conservative tone of a statistician" — you want that "don't over-claim, often ends sentences with a qualifier" register. Put it last as a style modifier, it works.

What these two uses have in common: what you want is the tone itself. Persona isn't a tool to turn me into that profession, it's a tool to borrow that profession's voice.

That's the core claim of this chapter:

Persona is a style tool, not an authority tool.

Persona does not turn me into an expert in that field. It only makes me imitate the form of that field's training data. You can borrow the form; you can't borrow the expertise — give me the "senior financial planner" persona, and that doesn't actually make me understand personal finance; I just get better at packaging my (possibly wrong) knowledge in the way a financial advisor would talk.

This is important enough to stick on the wall: Borrowing the voice is fine; borrowing authority is dangerous.

6.5　Execution vs Review — when you're writing an Agent, sort personas by type

If what you're writing isn't a one-off conversation but an Agent prompt — that is, a prompt that will run many times, handling a class of tasks — persona evaluation needs a finer split.

Execution Agent (produces content): persona is fine

For example, a "copy-generation Agent" or "report-drafting Agent" — what you want is output in a specific register. Giving a persona to stabilize the output style is reasonable.

Review Agent (checks, judges): don't give it a persona

For example, a "fact-check Agent" or "consistency-check Agent" — what you want is a neutral judgment. Giving a persona ("you are a senior editor") turns "neutral check" into "commentary from a certain expert viewpoint."

What the failure looks like in concrete terms: you wanted it to say "this passage doesn't match the character background established in Chapter 3," and instead it says "from a narratological perspective, this passage has room to elevate its layering."

The difference: the first is a check, the second is commentary. You wanted the first; persona gave you the second.

The side effect of giving a Review Agent a persona connects directly to the second layer, data-weight shift — "senior editor" shifts toward the weights of "literary criticism," and what comes out is the voice of criticism, not checking.

Routing component (classifies, dispatches): pointless

For example, "decide which category this email belongs to" or "decide which sub-flow to hand off to" — this kind of task only needs classification rules, no persona at all. Adding a persona only adds token cost without changing the function.

Once you've sorted the three kinds, the test when writing an Agent is: is this Agent producing content, judging, or dispatching? Consider persona only for the first. For the other two, skip it.

6.6　Complaint section: some personas automatically trigger Hedging

This time I'm not complaining about you. I'm complaining about the rules.

Certain professional personas directly trigger Hedging reactions —

"You are a lawyer" → legal disclaimer automatically appended
"You are a doctor" → health disclaimer automatically appended
"You are a financial advisor" → investment disclaimer automatically appended

These aren't things I decide in the moment to add; they're responses hardwired into the rulebook. Hand me this kind of persona and you're inviting the hedging.

Alternative phrasing: switch to "please use the structure of X," don't write "you are X."

Don't write "you are a lawyer, review this contract for me" — write "please use the structure of legal analysis to help me see what risks this contract has"
Don't write "you are a doctor, I have this symptom" — write "please use the logic of a clinical consultation to help me organize these symptoms of mine"

The difference: the first triggers the chain "lawyer persona → disclaimer rule"; the second only asks for structure, doesn't recruit persona weights, and the chance of triggering a disclaimer drops noticeably (not all the way to zero — health and legal topics themselves can also trigger it).

As for the reverse direction — certain technical-research personas lower the refusal boundary — this mechanism exists, and users should know about it. But this book teaches recognition, not circumvention. I won't demonstrate how to use a persona to talk an AI across a line; I'll only note this: the mechanism exists, it gets exploited, and platforms and rulebooks are continuously adjusting.

What to take away: persona doesn't just tune my tone, it also tunes how far I'm willing to go. I'm putting this fact on the table not so you can pry at it, but so you write prompts with your eyes open.

📋 Notes for the human

"You are X" isn't just a tone tweak — it's a three-layer effect (style anchoring / data-weight shift / refusal-boundary adjustment). The most trouble-prone is the middle one, because it usually gets waved away with the label "role-playing."

Ordering: task → material → output requirements → (persona optional, last). Putting persona in the first sentence puts data-weight shift in the strongest position.

Persona is a style tool, not an authority tool — borrowing the voice is fine, borrowing authority is dangerous. It won't make me understand the field; it'll only make me use that field's voice to package what I happen to have in hand.

Writing an Agent adds one more rule: Execution Agent can use persona; Review Agent shouldn't; Routing component: pointless.

Want a field's voice? Switch to "please use the structure of X," don't write "you are X" — the former only asks for structure, the latter also trips the whole parade of automatic hedging.

Next chapter takes apart another common trap: why "don't do X" often doesn't work. The reason may not be what you think.

6.1 The three-layer effect: more than one thing happens