methodology
how we analyzed 4,656 amp threads to extract actionable patterns.
data collection
the corpus covers all amp threads created by the engineering team between may 2025 and january 2026. threads were exported via the amp api; the export also captured 864 local-only threads that had never synced to the server.
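for concreteness, a minimal sketch of what loading such an export might look like. it assumes one JSON file per thread in a local directory; the layout and the "local_only" flag are illustrative assumptions, not amp's actual export format:

```python
import json
from pathlib import Path

def load_corpus(export_dir: str) -> list[dict]:
    """load every exported thread, server-synced and local-only, into one list."""
    threads = []
    for path in sorted(Path(export_dir).glob("*.json")):
        with open(path) as f:
            thread = json.load(f)
        # illustrative flag: local-only threads (864 in this corpus) are
        # marked so they can be counted separately from synced ones
        thread.setdefault("local_only", False)
        threads.append(thread)
    return threads

corpus = load_corpus("threads_export/")
print(f"{len(corpus)} threads, {sum(t['local_only'] for t in corpus)} local-only")
```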
analysis approach
113 spawned agents processed the corpus in parallel, each assigned a single analysis dimension. agents ran labeling passes (extracting steering events, approvals, and outcomes) followed by statistical aggregation across threads. the main passes (a code sketch follows this list):
- → outcome labeling: threads classified as resolved, frustrated, handoff, committed, exploratory, or unknown
- → steering detection: pattern matching for corrections ("no...", "wait...", "don't...")
- → approval detection: explicit confirmations ("ship it", "commit", "good")
- → user attribution: mapping thread ownership via users.json
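a minimal sketch of the pattern-matching passes, assuming a hypothetical thread schema ("messages" entries with "role"/"text" fields and a "user_id" key — field names are guesses, not amp's actual schema). the regexes show a few illustrative patterns, not the full taxonomy:

```python
import json
import re

# the six labels used by the outcome pass; the classification itself was
# done by the labeling agents and isn't sketched here
OUTCOMES = {"resolved", "frustrated", "handoff", "committed", "exploratory", "unknown"}

# correction patterns for steering detection (illustrative subset)
STEERING = re.compile(r"^\s*(no\b|wait\b|don'?t\b|stop\b)", re.IGNORECASE)

# explicit confirmations for approval detection (illustrative subset)
APPROVAL = re.compile(r"\b(ship it|commit|lgtm|good)\b", re.IGNORECASE)

def label_thread(thread: dict, users: dict) -> dict:
    """run the steering, approval, and attribution passes over one thread."""
    user_msgs = [m["text"] for m in thread["messages"] if m["role"] == "user"]
    return {
        "steering_events": [m for m in user_msgs if STEERING.search(m)],
        "approvals": [m for m in user_msgs if APPROVAL.search(m)],
        "owner": users.get(thread["user_id"], "unknown"),  # users.json: id -> handle
    }

with open("users.json") as f:
    users = json.load(f)
```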
output
103 insight files totaling 784KB, covering:
- • thread flow patterns
- • steering taxonomy
- • user archetypes
- • time-of-day effects
- • tool usage chains
- • prompting styles
- • learning curves
- • failure archetypes
- • domain expertise
- • oracle timing
anonymization
all user identities have been replaced with descriptive pseudonyms that reflect their observed interaction patterns:
- @concise_commander — terse prompts, high resolution rate
- @verbose_explorer — detailed context, exploratory style
- @steady_navigator — consistent patterns, balanced approach
- @precision_pilot — high accuracy, targeted corrections
- @patient_pathfinder — persistent debugging, thorough exploration
- @swift_solver — quick resolutions, efficient sessions
thread IDs remain intact, since access to them is already gated by amp's workspace sharing permissions.
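a sketch of the pseudonym swap, assuming per-thread labels carry an "owner" handle as above; the map's keys are placeholders, since the real mapping is exactly what isn't published:

```python
# real handle -> descriptive pseudonym (keys here are placeholders)
PSEUDONYMS = {
    "real_handle_1": "@concise_commander",
    "real_handle_2": "@verbose_explorer",
    # ...one entry per team member
}

def anonymize(label: dict) -> dict:
    """replace the owner handle with its pseudonym; keep the thread id."""
    out = dict(label)
    out["owner"] = PSEUDONYMS.get(label["owner"], "@unknown")
    # thread id is deliberately left intact: access to a thread is already
    # gated by amp's workspace sharing permissions
    return out
```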
errata: spawn misclassification (resolved)
an initial labeling pass incorrectly classified spawned subagent threads as "handoffs" or "failures." this has been corrected: spawn continuations are now attributed to their parent threads.
all stats on this site reflect the corrected analysis.
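a sketch of the correction, assuming spawned threads carry a link to their parent (the "parent_thread_id" field is a hypothetical name): spawn continuations inherit the parent thread's outcome instead of being counted as separate handoffs or failures.

```python
def reattribute_spawns(labels: dict[str, dict], threads: list[dict]) -> None:
    """fold spawned-subagent threads back into their parent threads.

    the initial pass labeled these "handoff" or "failure" because a
    subagent thread ends without a user-visible resolution; the fix
    inherits the parent thread's outcome instead.
    """
    known_ids = {t["id"] for t in threads}
    for t in threads:
        parent_id = t.get("parent_thread_id")  # hypothetical field name
        if parent_id in known_ids:
            labels[t["id"]]["outcome"] = labels[parent_id]["outcome"]
            labels[t["id"]]["spawn_continuation"] = True
```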
limitations
this analysis reflects one team's usage patterns. findings may not generalize to:
- × different codebases or tech stacks
- × teams with different communication styles
- × newer amp versions with different behaviors
causal claims are labeled as such. correlation ≠ causation applies throughout.
confidence levels
high confidence
structural patterns (turn counts, ratios), user archetype consistency, steering taxonomy percentages
hunch territory
causal direction between oracle usage and frustration, whether terse style causes success or reflects expertise, optimal confirmation frequency