
training curriculum


amp training curriculum: 4-week onboarding program

evidence-based curriculum distilled from 4,656 threads | 208,799 messages | 20 users


program overview

| week | focus | key metric target | learning outcome |
|------|-------|-------------------|-------------------|
| 1 | context quality | +25pp success via file refs | learner writes grounded first messages |
| 2 | conversation rhythm | 2:1 approval:steering ratio | learner maintains healthy thread flow |
| 3 | advanced tools | verification gates in every impl thread | learner uses oracle, spawn, verification |
| 4 | persistence & recovery | 26-50 turn threads without abandonment | learner handles complexity without quitting |

week 1: context quality

learning objectives

day 1: file references

the data:

exercise: rewrite these bad openers:

❌ "make the auth better"

→ rewrite with file references, success criteria

❌ "there's a bug in the api"

→ rewrite with specific file, symptom, expected behavior

checkpoint: complete one real thread with @file in opener


day 2: first-message calibration

the data:

exercise: write opening messages for these tasks, hitting 300-1500 chars:

  1. fix a flaky test
  2. add a new api endpoint
  3. refactor a component for accessibility

pattern to learn:

@src/auth/login.ts @src/auth/types.ts

the login handler isn't validating refresh tokens. add validation that 
checks expiry and signature before issuing new access tokens.

run `pnpm test src/auth` when done.
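the opener pattern above is easy to lint mechanically. a minimal sketch (a hypothetical helper, not part of any amp tooling) that checks an opener for an @file reference and the 300-1500 char calibration target:

```python
import re

def lint_opener(opener: str) -> list[str]:
    """return problems with a thread opener (empty list = good to send)."""
    problems = []
    # at least one @path/to/file.ext reference, e.g. @src/auth/login.ts
    if not re.search(r"@[\w./-]+\.\w+", opener):
        problems.append("no @file reference")
    # day 2 calibration target: 300-1500 characters
    if not 300 <= len(opener) <= 1500:
        problems.append(f"length {len(opener)} outside 300-1500 chars")
    return problems
```

`lint_opener("make the auth better")` flags both problems; a well-formed opener returns `[]`.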

day 3: opener style—interrogative vs imperative

the data:

| style | success rate | steering rate |
|-------|--------------|---------------|
| interrogative ("what…?") | 69.3% | moderate |
| imperative ("fix X") | 57% | 0.15 (lowest) |
| declarative ("i think we need…") | 50% | 0.23 (highest) |

exercise: convert these declaratives to interrogative OR imperative:

❌ "i was thinking maybe we could potentially look at improving the 
auth system because it seems like there might be some issues"
✓ "what's causing the token refresh failures in @src/auth/refresh.ts?"
✓ "fix the race condition in handleSubmit by adding a mutex"

rule: questions for exploration, commands for known fixes.
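the rule can be turned into a quick self-check before sending. a rough heuristic sketch (the question-word and command-verb lists are illustrative assumptions, not derived from the dataset):

```python
def opener_style(opener: str) -> str:
    """rough heuristic: classify an opener as interrogative, imperative, or declarative."""
    words = opener.strip().lower().split()
    if not words:
        return "declarative"
    question_words = {"what", "why", "how", "where", "when", "which", "who"}
    command_verbs = {"fix", "add", "refactor", "remove", "update", "implement", "rename", "write"}
    if opener.rstrip().endswith("?") or words[0] in question_words:
        return "interrogative"
    if words[0] in command_verbs:
        return "imperative"
    return "declarative"
```

per the table, aim for interrogative when exploring and imperative for known fixes; declarative openers had the lowest success rate.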


day 4: thread continuity with read_thread

the data: 8/10 golden threads started with explicit parent reference.

pattern:

Continuing work from thread T-019b83ca...
@pkg/simd/simd_bench_test.go @pkg/simd/dispatch_arm64.go

- I just completed SVE implementations
- Committed and pushed

exercise: practice handoff. start a thread, pause deliberately, resume with proper context.


day 5: week 1 assessment

complete a real task thread demonstrating:

success criteria: thread reaches RESOLVED/COMMITTED status


week 2: conversation rhythm

learning objectives

day 1: approval vocabulary

the data:

approval vocabulary (keep it brief):

exercise: practice rapid approval. every time agent does something correct, acknowledge with one word.


day 2: steering patterns

the data: 46.7% of steerings start with “no”

| pattern | when to use |
|---------|-------------|
| "no, …" | flat rejection—wrong direction |
| "wait, …" | interrupt before agent commits |
| "don't …" | explicit prohibition |
| "actually, …" | course correction |

note: steering is NOT micro-management. 87% of steerings lead to recovery.

exercise: review a past thread. identify where you steered. was it necessary? could earlier context have prevented it?


day 3: the wait interrupt

the data: concise_commander uses "wait" 20% of the time—catches the agent before a wrong path solidifies.

when to wait:

example:

agent: "Now let's run the tests to see if this fixes..."
you: "wait, confirm before running tests"

exercise: practice one thread with deliberate wait interrupts before agent actions.


day 4: steering doom loops

the data: 30% of corrections require another correction

danger signals:

intervention: after 2 consecutive steerings, STOP. ask:

“are we approaching this wrong? should we step back and reconsider?”

exercise: practice the intervention. deliberately enter a steering loop and practice the recovery phrase.


day 5: week 2 assessment

complete a thread demonstrating:

success criteria: no consecutive steering events, thread RESOLVED/COMMITTED


week 3: advanced tools

learning objectives

day 1: oracle timing

the data:

| oracle timing | frustration rate |
|---------------|------------------|
| early (≤33%) | 1.4% |
| mid (33-66%) | 0.7% |
| late (>66%) | 0% |

anti-pattern: 46% of FRUSTRATED threads use oracle as rescue tool

proper usage:

exercise: use oracle to plan an implementation before writing any code.


day 2: spawn / task delegation

the data:

when to spawn:

spawn agents to:
1. add unit tests for the validator
2. update the README with new usage examples
3. fix the lint errors in /components

when NOT to spawn:

exercise: identify a task with 3+ independent sub-tasks. practice spawning.


day 3: verification gates

the data:

| metric | with verification | without |
|--------|-------------------|---------|
| success rate | 78.2% | 61.3% |
| committed rate | 25.4% | 18.1% |

verification checklist for implementation threads:

pattern:

you: "run `pnpm test src/auth` before committing"
agent: [runs tests]
you: "tests pass, ship it"

exercise: complete an implementation thread with at least 2 verification gates.


day 4: skill usage (underutilized)

the data: dig skill: 1 invocation across 4,656 threads (severely underutilized)

available skills to learn:

exercise: invoke the dig skill on a real bug. compare to your usual debug approach.


day 5: week 3 assessment

complete a thread demonstrating:

success criteria: thread COMMITTED with explicit verification


week 4: persistence & recovery

learning objectives

day 1: thread length sweet spot

the data:

| turn range | success rate |
|------------|--------------|
| <10 turns | 14% |
| 10-25 | 42% |
| 26-50 | 75% |
| 51-100 | 65% |
| >100 | 55% |

rule: don’t abandon before 26 turns unless task is complete. commit to the work.

exercise: practice staying with a thread past the “this is annoying” threshold.


day 2: agent anti-patterns recognition

recognize and counter these:

| anti-pattern | signal | counter |
|--------------|--------|---------|
| SIMPLIFICATION_ESCAPE | agent removes complexity instead of solving | "no shortcuts—debug the actual issue" |
| TEST_WEAKENING | agent removes failing assertion | "never weaken tests—debug the bug" |
| PREMATURE_COMPLETION | agent declares done without tests | "run full test suite first" |
| HACKING_AROUND | fragile patches | "look up the proper way" |

exercise: review a past thread. identify any anti-patterns you let slide.


day 3: frustration ladder awareness

escalation stages:

STAGE 1: agent misunderstands → correct early (50% recovery)
STAGE 2: 2+ consecutive corrections → pause and realign (40% recovery)
STAGE 3: expletives appear → start fresh thread (20% recovery)
STAGE 4: caps lock explosion → thread is lost (<10% recovery)

intervention timing matters. correct at stage 1, not stage 3.

exercise: in your next thread, if frustration begins, consciously identify the stage and intervene appropriately.


day 4: power user synthesis

behaviors from top 3 users (82%, 67%, 60.5% resolution):

| behavior | implementation |
|----------|----------------|
| @file references | always in opener |
| domain vocabulary | speak at expert level, don't over-explain |
| consistent approval | every successful step acknowledged |
| question-driven | socratic guidance keeps agent reasoning visible |
| persistence | don't quit when it gets hard |

anti-behaviors:

exercise: complete a complex task (>50 turns) maintaining all power user behaviors.


day 5: graduation assessment

complete a challenging thread demonstrating:

graduation criteria: COMMITTED status with clean conversation dynamics


appendix: quick reference cards

opener template

@path/to/file1.ts @path/to/file2.ts

[clear task description, 300-1500 chars]
[success criteria / verification command]

approval vocabulary

yes | lgtm | ship it | go on | good | commit

steering vocabulary

no, ... | wait, ... | don't ... | actually, ...

healthy ratios

verification gates

anti-pattern counters

| pattern | counter phrase |
|---------|----------------|
| shortcuts | "no shortcuts—solve it properly" |
| test weakening | "bug is in prod code, not test" |
| premature done | "run tests first" |
| hacking around | "read the docs" |

metrics for self-assessment

| metric | healthy | warning | danger |
|--------|---------|---------|--------|
| approval:steering ratio | >2:1 | 1-2:1 | <1:1 |
| thread length | 26-50 | 51-100 | <10 or >100 |
| consecutive steerings | 0-1 | 2 | 3+ |
| file refs in opener | present | | absent |
| verification before ship | yes | | no |
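the thresholds above can be applied programmatically for post-thread review. a sketch (the ThreadStats shape is hypothetical; thresholds come from the table, with 10-25 turns treated as warning since the table leaves that range unassigned):

```python
from dataclasses import dataclass

@dataclass
class ThreadStats:
    approvals: int
    steerings: int
    turns: int
    max_consecutive_steerings: int
    file_refs_in_opener: bool
    verified_before_ship: bool

def assess(s: ThreadStats) -> dict[str, str]:
    """map each self-assessment metric to healthy / warning / danger."""
    ratio = s.approvals / s.steerings if s.steerings else float("inf")
    if 26 <= s.turns <= 50:
        length = "healthy"
    elif 51 <= s.turns <= 100:
        length = "warning"
    elif s.turns < 10 or s.turns > 100:
        length = "danger"
    else:
        length = "warning"  # 10-25 turns: below the sweet spot, unassigned in the table
    return {
        "approval:steering ratio": "healthy" if ratio > 2
            else ("warning" if ratio >= 1 else "danger"),
        "thread length": length,
        "consecutive steerings": "healthy" if s.max_consecutive_steerings <= 1
            else ("warning" if s.max_consecutive_steerings == 2 else "danger"),
        "file refs in opener": "healthy" if s.file_refs_in_opener else "danger",
        "verification before ship": "healthy" if s.verified_before_ship else "danger",
    }
```

running `assess` on your last few threads gives an honest read on which habits from weeks 1-4 have actually stuck.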

curriculum developed from empirical analysis | jack_winkleshine