amp user onboarding guide
new to amp? this guide distills 4,656 threads into what actually matters.
the 5 things that move the needle
ranked by effect size from our analysis:
| priority | do this | impact |
|---|---|---|
| 1 | include file references (@path/to/file) in your first message | +25pp success (66.7% vs 41.8%) |
| 2 | keep prompts 300-1500 characters | lowest steering needed |
| 3 | stay for 26-50 turns | 75% success vs 14% for <10 turns |
| 4 | approve explicitly when on track (“good”, “ship it”, “yes”) | 2:1 approval:steering = healthy thread |
| 5 | steer early if off-track | 87% recover from first steering |
your first message
what works:
```
@src/auth/login.ts @src/auth/types.ts

the login handler isn't validating refresh tokens. add validation that checks
expiry and signature before issuing new access tokens.

run `pnpm test src/auth` when done.
```
why it works:
- file references ground the agent immediately (+25pp success)
- clear task with concrete outcome
- verification criteria upfront
- 300-1500 chars (sweet spot)
what fails:
```
make the auth better
```
too vague. no files. no success criteria.
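if you want to mechanize this checklist, here's a minimal sketch. the thresholds come from the table at the top; the function name and regexes are our own heuristics, not an amp api:

```typescript
// opener-lint.ts: sanity-check a first message before sending.
// heuristic sketch only; thresholds from the analysis above, names are ours.

interface OpenerReport {
  hasFileRefs: boolean;     // @path/to/file grounds the agent (+25pp success)
  lengthOk: boolean;        // 300-1500 chars is the sweet spot
  hasVerification: boolean; // e.g. a test command to run when done
  warnings: string[];
}

function lintOpener(prompt: string): OpenerReport {
  const hasFileRefs = /@[\w./-]+\.\w+/.test(prompt);
  const lengthOk = prompt.length >= 300 && prompt.length <= 1500;
  const hasVerification = /\b(test|lint|build|typecheck)\b/i.test(prompt);

  const warnings: string[] = [];
  if (!hasFileRefs) warnings.push("no @file references: name the files you mean");
  if (!lengthOk) warnings.push(`length ${prompt.length} is outside 300-1500 chars`);
  if (!hasVerification) warnings.push("no verification step: say how to check the result");
  return { hasFileRefs, lengthOk, hasVerification, warnings };
}
```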
the conversation rhythm
healthy pattern
```
you: @file.ts fix the race condition in fetchData
agent: [reads files, proposes fix]
you: looks good, run the tests
agent: [runs tests, shows results]
you: ship it
```
approval:steering ratio > 2:1 = you’re on track.
unhealthy pattern
```
you: fix the race condition
agent: [reads wrong files, proposes wrong fix]
you: no, look at fetchData
agent: [still wrong approach]
you: wait, don't change the interface
agent: [another wrong direction]
you: wtf
```
if you hit 2+ consecutive corrections → STOP and ask if the approach should change. don’t spiral.
steering works — use it
steering is not failure. threads WITH steering actually succeed more often (60%) than threads without (37%). steering = engagement.
effective steering:
"no, don't change the interface"(47% of steerings start with “no”)"wait, confirm before running tests"(17% are “wait”)"actually, use the existing util"(course correction)
after steering, agent recovers 87% of the time. only 9% of steerings cascade to another.
prompting styles that work
interrogative (highest success: 69%)
```
what's causing the memory leak in the worker pool?
```
questions force the agent to reason. you’re more likely to get thoughtful analysis.
imperative (lowest steering rate: 0.15)
```
fix the race condition in handleSubmit by adding a mutex
```
direct commands leave less room for misinterpretation.
what to avoid
```
i was thinking maybe we could potentially look at improving the auth
system because it seems like there might be some issues with how tokens
are handled and i'm not sure if...
```
this declarative/hedging style draws 52% more steering. be direct.
when to use the oracle
use oracle for:
- planning before implementation
- architecture decisions
- debugging hypotheses
- code review
don’t use oracle as:
- rescue tool when already stuck (46% of frustrated threads use oracle as last resort)
- replacement for reading code
oracle timing matters. early = planning tool. late = panic button.
task delegation (spawn agents)
for parallel independent work:
```
spawn agents to:
1. add unit tests for the validator
2. update the README with new usage examples
3. fix the lint errors in /components
```
optimal: 2-6 spawned tasks (78.6% success at 4-6)
bad: spawning for every small task, or never delegating on complex work.
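mechanically, delegation is just fan-out over independent units of work. a sketch, where `spawnAgent` is a hypothetical stand-in for the real task tool, not amp's actual api:

```typescript
// fan out a handful of independent tasks in parallel.
// each task must stand alone: no task may depend on another's output.
// spawnAgent is a hypothetical stand-in, not amp's actual api.
declare function spawnAgent(task: string): Promise<string>;

const tasks = [
  "add unit tests for the validator",
  "update the README with new usage examples",
  "fix the lint errors in /components",
];

// 2-6 tasks is the sweet spot per the data above.
const results = await Promise.all(tasks.map(spawnAgent));
results.forEach((summary, i) => console.log(`task ${i + 1}: ${summary}`));
```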
the frustration ladder (what to watch for)
escalation stages from our data:
```
STAGE 1: agent misunderstands (50% recovery)
    ↓ correct early
STAGE 2: 2+ consecutive corrections (40% recovery)
    ↓ pause and realign
STAGE 3: expletives appear (20% recovery)
    ↓ start fresh thread
STAGE 4: caps lock explosion (<10% recovery)
```
intervention timing matters. correct at stage 1, not stage 3.
quick reference
✓ do
- include @path/to/file in opening message
- keep prompts 300-1500 chars
- approve explicitly when satisfied
- steer early if off-track
- use questions to guide reasoning
- delegate parallel work with spawn
- verify with tests before completing
✗ avoid
- vague goals (“make it better”)
- abandoning threads <10 turns
- evening work (6-9pm = 27.5% success — worst window)
- using oracle as panic button
- >1500 char first messages (paradoxically worse)
your first week milestones
day 1: complete one thread with file references in opener
day 2: practice the approve/steer rhythm — aim for 2:1 ratio
day 3: use interrogative prompts (“what if we tried X?”)
day 4: spawn your first subtask for parallel work
day 5: hit the 26-50 turn sweet spot on a real task
common failure modes (avoid these)
from autopsy of 20 worst threads:
| failure | what happens | fix |
|---|---|---|
| SHORTCUT_TAKING | agent simplifies instead of solving | persist with "no shortcuts" |
| TEST_WEAKENING | agent removes assertions | "never weaken tests — debug the bug" |
| PREMATURE_COMPLETION | agent declares done without verification | always run full test suite |
| HACKING_AROUND | fragile patches | "look up the proper way" |
| IGNORING_REFS | agent doesn't read files you mention | "read @file first" |
when threads succeed
patterns from 20 zero-steering COMMITTED threads:
- concrete context: files + diagnostic data upfront
- clear success criteria: tests specified
- domain vocabulary: no explanation tax
- question-driven guidance: socratic > imperative
- structured handoffs: explicit `read_thread` references
when these hold, agent stays on track without correction.
metrics to track yourself
| metric | healthy | warning | danger |
|---|---|---|---|
| approval:steering ratio | >2:1 | 1-2:1 | <1:1 |
| thread length | 26-50 | 51-100 | <10 or >100 |
| consecutive steerings | 0-1 | 2 | 3+ |
| file refs in opener | present | — | absent |
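you can approximate these from a transcript yourself. a minimal sketch; the approval/steering keyword lists are rough heuristics of ours, not amp's actual classifier:

```typescript
// thread-health.ts: score a thread against the table above.
// the keyword regexes are rough heuristics, not amp's classifier.

interface Message { role: "user" | "agent"; text: string }

const APPROVAL = /^(good|great|yes|ship it|lgtm|looks good|perfect)\b/i;
const STEERING = /^(no|wait|actually|stop|don't|hold on)\b/i;

function threadHealth(messages: Message[]) {
  let approvals = 0, steerings = 0, run = 0, maxRun = 0;
  for (const m of messages) {
    if (m.role !== "user") continue;
    if (APPROVAL.test(m.text)) { approvals++; run = 0; }
    else if (STEERING.test(m.text)) { steerings++; maxRun = Math.max(maxRun, ++run); }
    else run = 0;
  }
  const ratio = steerings === 0 ? Infinity : approvals / steerings;
  const verdict =
    ratio > 2 && maxRun <= 1 ? "healthy" :
    ratio < 1 || maxRun >= 3 ? "danger" : "warning";
  return {
    ratio,                  // healthy: >2:1
    maxRun,                 // danger: 3+ consecutive steerings
    turns: messages.length, // message count as a proxy; sweet spot 26-50
    verdict,
  };
}
```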
tl;dr
- start with @files
- 300-1500 chars
- stay 26-50 turns
- approve when good, steer when not
- questions > commands > declarations
that’s it. the rest is practice.
distilled from 4,656 threads | 208,799 messages | 20 users | may 2025 – jan 2026