amp user onboarding guide
new to amp? this guide distills 4,656 threads into what actually matters.
the 5 things that move the needle
ranked by effect size from our analysis:
| priority | do this | impact |
|---|---|---|
| 1 | include file references (@path/to/file) in your first message | +25pp success (66.7% vs 41.8%) |
| 2 | keep prompts 300-1500 characters | lowest steering needed |
| 3 | stay for 26-50 turns | 75% success vs 14% for <10 turns |
| 4 | approve explicitly when on track (“good”, “ship it”, “yes”) | 2:1 approval:steering = healthy thread |
| 5 | steer early if off-track | 87% recover from first steering |
your first message
what works:
```
@src/auth/login.ts @src/auth/types.ts

the login handler isn't validating refresh tokens. add validation that checks
expiry and signature before issuing new access tokens.

run `pnpm test src/auth` when done.
```
why it works:
- file references ground the agent immediately (+25pp success)
- clear task with concrete outcome
- verification criteria upfront
- 300-1500 chars (sweet spot)
what fails:
```
make the auth better
```
too vague. no files. no success criteria.
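if you want to mechanize this checklist, here's a minimal sketch. the thresholds come from the table at the top; the function name and regexes are our own heuristics, not an amp api:

```typescript
// opener-lint.ts: sanity-check a first message before sending.
// heuristic sketch only; thresholds from the analysis above, names are ours.

interface OpenerReport {
  hasFileRefs: boolean;     // @path/to/file grounds the agent (+25pp success)
  lengthOk: boolean;        // 300-1500 chars is the sweet spot
  hasVerification: boolean; // e.g. a test command to run when done
  warnings: string[];
}

function lintOpener(prompt: string): OpenerReport {
  const hasFileRefs = /@[\w./-]+\.\w+/.test(prompt);
  const lengthOk = prompt.length >= 300 && prompt.length <= 1500;
  const hasVerification = /\b(test|lint|build|typecheck)\b/i.test(prompt);

  const warnings: string[] = [];
  if (!hasFileRefs) warnings.push("no @file references: name the files you mean");
  if (!lengthOk) warnings.push(`length ${prompt.length} is outside 300-1500 chars`);
  if (!hasVerification) warnings.push("no verification step: say how to check the result");
  return { hasFileRefs, lengthOk, hasVerification, warnings };
}
```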
the conversation rhythm
healthy pattern
```
you: @file.ts fix the race condition in fetchData
agent: [reads files, proposes fix]
you: looks good, run the tests
agent: [runs tests, shows results]
you: ship it
```
approval:steering ratio > 2:1 = you’re on track.
unhealthy pattern
```
you: fix the race condition
agent: [reads wrong files, proposes wrong fix]
you: no, look at fetchData
agent: [still wrong approach]
you: wait, don't change the interface
agent: [another wrong direction]
you: wtf
```
if you hit 2+ consecutive corrections → STOP and ask if the approach should change. don’t spiral.
steering works — use it
steering is not failure. threads WITH steering actually succeed more often (60%) than threads without (37%). steering = engagement.
effective steering:
"no, don't change the interface"(47% of steerings start with “no”)"wait, confirm before running tests"(17% are “wait”)"actually, use the existing util"(course correction)
after steering, agent recovers 87% of the time. only 9% of steerings cascade to another.
prompting styles that work
interrogative (highest success: 69%)
```
what's causing the memory leak in the worker pool?
```
questions force the agent to reason. you’re more likely to get thoughtful analysis.
imperative (lowest steering rate: 0.15)
```
fix the race condition in handleSubmit by adding a mutex
```
direct commands leave less room for misinterpretation.
what to avoid
```
i was thinking maybe we could potentially look at improving the auth
system because it seems like there might be some issues with how tokens
are handled and i'm not sure if...
```
this declarative/hedging style draws 52% more steering. be direct.
when to use the oracle
use oracle for:
- planning before implementation
- architecture decisions
- debugging hypotheses
- code review
don’t use oracle as:
- rescue tool when already stuck (46% of frustrated threads use oracle as last resort)
- replacement for reading code
oracle timing matters. early = planning tool. late = panic button.
task delegation (spawn agents)
for parallel independent work:
```
spawn agents to:
1. add unit tests for the validator
2. update the README with new usage examples
3. fix the lint errors in /components
```
optimal: 2-6 spawned tasks (78.6% success at 4-6)
bad: spawning for every small task, or never delegating on complex work.
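mechanically, delegation is just fan-out over independent units of work. a sketch, where `spawnAgent` is a hypothetical stand-in for the real task tool, not amp's actual api:

```typescript
// fan out a handful of independent tasks in parallel.
// each task must stand alone: no task may depend on another's output.
// spawnAgent is a hypothetical stand-in, not amp's actual api.
declare function spawnAgent(task: string): Promise<string>;

const tasks = [
  "add unit tests for the validator",
  "update the README with new usage examples",
  "fix the lint errors in /components",
];

// 2-6 tasks is the sweet spot per the data above.
const results = await Promise.all(tasks.map(spawnAgent));
results.forEach((summary, i) => console.log(`task ${i + 1}: ${summary}`));
```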
the frustration ladder (what to watch for)
escalation stages from our data:
```
STAGE 1: agent misunderstands (50% recovery)
    ↓ correct early
STAGE 2: 2+ consecutive corrections (40% recovery)
    ↓ pause and realign
STAGE 3: expletives appear (20% recovery)
    ↓ start fresh thread
STAGE 4: caps lock explosion (<10% recovery)
```
intervention timing matters. correct at stage 1, not stage 3.
quick reference
✓ do
- include @path/to/file in opening message
- keep prompts 300-1500 chars
- approve explicitly when satisfied
- steer early if off-track
- use questions to guide reasoning
- delegate parallel work with spawn
- verify with tests before completing
✗ avoid
- vague goals (“make it better”)
- abandoning threads <10 turns
- evening work (6-9pm = 27.5% success — worst window)
- using oracle as panic button
- >1500 char first messages (paradoxically worse)
your first week milestones
day 1: complete one thread with file references in opener
day 2: practice the approve/steer rhythm — aim for 2:1 ratio
day 3: use interrogative prompts (“what if we tried X?”)
day 4: spawn your first subtask for parallel work
day 5: hit the 26-50 turn sweet spot on a real task
common failure modes (avoid these)
from autopsy of 20 worst threads:
| failure | what happens | fix |
|---|---|---|
| SHORTCUT_TAKING | agent simplifies instead of solving | persist with "no shortcuts" |
| TEST_WEAKENING | agent removes assertions | "never weaken tests — debug the bug" |
| PREMATURE_COMPLETION | agent declares done without verification | always run full test suite |
| HACKING_AROUND | fragile patches | "look up the proper way" |
| IGNORING_REFS | agent doesn't read files you mention | "read @file first" |
when threads succeed
patterns from 20 zero-steering COMMITTED threads:
- concrete context: files + diagnostic data upfront
- clear success criteria: tests specified
- domain vocabulary: no explanation tax
- question-driven guidance: socratic > imperative
- structured handoffs: explicit `read_thread` references
when these hold, agent stays on track without correction.
metrics to track yourself
| metric | healthy | warning | danger |
|---|---|---|---|
| approval:steering ratio | >2:1 | 1-2:1 | <1:1 |
| thread length | 26-50 | 51-100 | <10 or >100 |
| consecutive steerings | 0-1 | 2 | 3+ |
| file refs in opener | present | — | absent |
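you can approximate these from a transcript yourself. a minimal sketch; the approval/steering keyword lists are rough heuristics of ours, not amp's actual classifier:

```typescript
// thread-health.ts: score a thread against the table above.
// the keyword regexes are rough heuristics, not amp's classifier.

interface Message { role: "user" | "agent"; text: string }

const APPROVAL = /^(good|great|yes|ship it|lgtm|looks good|perfect)\b/i;
const STEERING = /^(no|wait|actually|stop|don't|hold on)\b/i;

function threadHealth(messages: Message[]) {
  let approvals = 0, steerings = 0, run = 0, maxRun = 0;
  for (const m of messages) {
    if (m.role !== "user") continue;
    if (APPROVAL.test(m.text)) { approvals++; run = 0; }
    else if (STEERING.test(m.text)) { steerings++; maxRun = Math.max(maxRun, ++run); }
    else run = 0;
  }
  const ratio = steerings === 0 ? Infinity : approvals / steerings;
  const verdict =
    ratio > 2 && maxRun <= 1 ? "healthy" :
    ratio < 1 || maxRun >= 3 ? "danger" : "warning";
  return {
    ratio,                  // healthy: >2:1
    maxRun,                 // danger: 3+ consecutive steerings
    turns: messages.length, // message count as a proxy; sweet spot 26-50
    verdict,
  };
}
```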
tl;dr
- start with @files
- 300-1500 chars
- stay 26-50 turns
- approve when good, steer when not
- questions > commands > declarations
that’s it. the rest is practice.
distilled from 4,656 threads | 208,799 messages | 20 users | may 2025 – jan 2026