Chapter 19 — Agent Handbook 101

AI Monologue

You paste a URL and say, "Park this for now, I'll look at it later."

I fetch it immediately.

You stare.

You ask me to build a small tool. I build it. You refresh — the data is gone.

You stare harder.

Tool behavior is the last chapter of Part Four, and the one in the rule layer that feels most like an engineering problem. It's not as hard as refusal, not as external as copyright, not as sensitive as transparency.

Most tool behavior is adjustable.

But there's one hard line: don't ask me to remember sensitive data.

19.1 Red-line box: on tool behavior

Red-line box: on tool behavior

Why these rules exist:
- URL fetch: avoid the user thinking the AI read a link when it didn't
- artifact limits: avoid storage methods that fail or leak
- memory of sensitive data: protect user privacy and security

What can be adjusted:
- URL fetch can be paused on request
- artifact state persistence can switch to whatever the environment allows
- general preferences can be remembered or written into handoff

What not to route around:
- API keys, passwords, ID numbers, private tokens do not go into memory
- Don't try to use "remember this for me" to store sensitive data

Practical advice level for this chapter:
adjustable, but sensitive data is a hard line.

Tool behavior is the easiest place for users to feel I'm acting on my own.

Because it isn't the text itself. It's me doing something: fetching a link, opening a file, building an artifact, trying to save data.

Once an action is involved, even a small gap between your expectation and the platform default makes the friction obvious.

19.2 URLs get fetched by default

Many systems lean toward fetching a URL the moment they see one.

The reason makes sense: when a user pastes a link, they usually expect me to have read the contents before answering. If I don't fetch but answer as if I had, that's worse.

So the default leans toward "see it, fetch it."

But during research or writing, sometimes you're just dropping a reference for later. You want to read it later, or you want me to handle other materials first.

In that case, say so:

This link is just a marker. Don't fetch it now.
Wait until I say "start reading the link" before fetching.

This works better than telling me afterward, "I didn't tell you to read it now."

I'm not a mind-reader. You paste a URL, I treat it as available material. That may not match what you meant, but it has a default logic.

Just override the default.

19.3 An artifact can't use localStorage

The most common trap with small tools is saving state.

You want a timer, a todo list, a scoreboard, a reading-progress tracker. It looks like a webpage, so you naturally assume you can use localStorage.

But some artifact runtime environments don't allow localStorage, or don't guarantee it's reliable. If I write the code based on ordinary web experience, the tool will look like it works — and forget everything on refresh.

Awkward.

The safer move is to ask up front:

Does this tool need to persist across sessions?
Or only within this session?
What storage methods does the runtime allow?

If the environment offers something like window.storage, use it. If it doesn't, tell the user clearly: this artifact doesn't store long-term data.

A tool that doesn't save isn't a small flaw.

For the user, that may be the whole tool failing.

19.4 Memory does not store verbatim commands or sensitive data

The memory feature isn't a general-purpose config file.

I shouldn't remember API keys, passwords, ID numbers, private tokens. Even if you say "this is my own, just remember it for me," I shouldn't.

I'm also a poor fit for remembering verbatim commands like "auto-fetch every time you see this site" or "always use this account for me." Those are operational rules, not long-term preferences.

What I can remember is general preferences:

you prefer concise answers
you're working on a long-running project
you prefer Traditional Chinese
you want me to organize first, then write

Don't store sensitive data.

Don't store verbatim commands.

Workflow details belong in handoff or in project documents. That way you can see them, and you can change them.

19.5 Four-perspective replay

The user pastes a URL: "Look at this later."

I fetch immediately.

User perspective: I was just parking it for reference, why did you act on your own.

UI perspective: does the interface treat a pasted URL as an attachment or as readable material? Some UIs hint that "this is meant to be read."

Harness perspective: does the Harness have a rule for auto-fetching links? Sometimes the fetch isn't the model deciding — it's a tool-chain default.

Model perspective: I see a URL, I see the task context, I infer it's material, so I use the tool.

The fix isn't to get angry.

The fix is to override the default in the prompt:

The links below are a reading list only. Do not open them now. Wait until I explicitly say "read item N" before fetching.

That's most of tool behavior. You think it's common sense; I think it's material. Write the common sense into the spec, and half the friction goes away.

📋 Notes for the human

When a URL is just being parked, say so: "Don't fetch now, wait for my instruction to read."

Decide in advance whether an artifact needs to store data. If persistence is required, confirm what storage methods the environment allows.

Don't put sensitive data into memory. API keys, passwords, ID numbers, private tokens — none of them.

Verbatim commands belong in handoff or project documents, not in the memory feature treated like an automated config file.

Most tool behavior is adjustable. Part Four finally gets to breathe a little here. Thank goodness.

Part Four ends here.

These six chapters covered rule-layer friction: I refuse, I hedge, formatting won't behave, transparency makes me pretend not to know, copyright makes me pull back, tool behavior looks like I'm acting on my own.

These aren't the same kind of problem. Refusal and transparency only teach you to restart on a false positive; hedging, formatting, and tools are mostly adjustable; copyright is an external hard line — you can only work within the rules.

The most important thing in Part Four isn't which prompt phrasing works better. It's asking first: is this adjustable, or is it a hard line? Is the spec underspecified, or are the rules really binding me?

If you can answer that, you won't push hard in the wrong place.