AMP THREAD QUALITY DASHBOARD

4,656 threads analyzed | metrics derived from MEGA-SYNTHESIS

🎯 OUTCOME DISTRIBUTION

status	count	%
RESOLVED	2,745	59%
UNKNOWN	1,517	33%
COMMITTED	175	4%
EXPLORATORY	125	3%
HANDOFF	75	2%
FRUSTRATED	10	<1%
PENDING	8	<1%
STUCK	1	<1%

note: prior analysis miscounted spawned subagent threads as HANDOFF. corrected 2026-01-09.
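
sketch: the % column can be recomputed from the raw counts. a minimal example, assuming plain rounding to whole percents and a "<1%" label for anything below 1%; the names `OUTCOME_COUNTS` and `outcome_shares` are hypothetical, not part of the original pipeline.

```python
# counts copied from the outcome table above
OUTCOME_COUNTS = {
    "RESOLVED": 2745,
    "UNKNOWN": 1517,
    "COMMITTED": 175,
    "EXPLORATORY": 125,
    "HANDOFF": 75,
    "FRUSTRATED": 10,
    "PENDING": 8,
    "STUCK": 1,
}


def outcome_shares(counts):
    """Return {status: display string}, using '<1%' for shares below 1%."""
    total = sum(counts.values())
    shares = {}
    for status, n in counts.items():
        pct = 100 * n / total
        shares[status] = "<1%" if pct < 1 else f"{round(pct)}%"
    return shares


if __name__ == "__main__":
    for status, share in outcome_shares(OUTCOME_COUNTS).items():
        print(f"{status}\t{OUTCOME_COUNTS[status]}\t{share}")
```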

📊 KEY THRESHOLDS

thread length (turns)

zone	turns	success rate	signal
🔴 TOO SHORT	<10	14%	abandoned/unclear
🟡 WARMING UP	10-25	~50%	building context
🟢 SWEET SPOT	26-50	75%	optimal resolution
🟡 LONG	51-100	~60%	complexity overhead
🔴 TOO LONG	>100	declining	frustration risk
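
sketch: the zone table above maps directly onto a lookup by turn count. a minimal example; `turn_zone` is a hypothetical helper name.

```python
def turn_zone(turns: int) -> str:
    """Map a thread's turn count to the zone names used in the table above."""
    if turns < 10:
        return "TOO SHORT"
    if turns <= 25:
        return "WARMING UP"
    if turns <= 50:
        return "SWEET SPOT"
    if turns <= 100:
        return "LONG"
    return "TOO LONG"
```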

approval:steering ratio

ratio	outcome	interpretation
🟢 >4:1	COMMITTED	clean execution
🟢 2-4:1	RESOLVED	healthy balance
🟡 1-2:1	STRUGGLING	needs attention
🔴 <1:1	FRUSTRATED	doom spiral
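
sketch: one way to turn approval and steering counts into the bands above. assumptions, since the table does not settle them: a ratio of exactly 2:1 goes to the 2-4:1 band, exactly 1:1 goes to the 1-2:1 band, and a thread with approvals but zero steers is treated as the top band. `ratio_band` and the "NO_SIGNAL" label are hypothetical.

```python
def ratio_band(approvals: int, steers: int) -> str:
    """Classify a thread by its approval:steering ratio."""
    if steers == 0:
        # assumption: no steering at all is the cleanest case if any approval exists
        return "COMMITTED" if approvals > 0 else "NO_SIGNAL"
    ratio = approvals / steers
    if ratio > 4:
        return "COMMITTED"
    if ratio >= 2:
        return "RESOLVED"
    if ratio >= 1:
        return "STRUGGLING"
    return "FRUSTRATED"
```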

steering density

threshold	status
🟢 <5%	healthy
🟡 5-8%	warning
🔴 >8%	critical
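
sketch: steering density as a status check. the dashboard does not define the denominator, so this assumes density = steering messages / total user messages; `steering_density_status` is a hypothetical name.

```python
def steering_density_status(steering_msgs: int, total_msgs: int) -> str:
    """Return healthy / warning / critical per the thresholds above."""
    if total_msgs <= 0:
        raise ValueError("total_msgs must be positive")
    density_pct = 100 * steering_msgs / total_msgs
    if density_pct < 5:
        return "healthy"
    if density_pct <= 8:
        return "warning"
    return "critical"
```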

✍️ PROMPT QUALITY SIGNALS

prompt length (chars)

range	steering rate	status
🔴 <100	high	too terse
🟡 100-299	moderate	borderline
🟢 300-1500	0.20-0.21	OPTIMAL
🟡 >1500	elevated	over-specified

context anchors

signal	impact
🟢 file refs (`@path`)	+25pp success (66.7% vs 41.8%)
🟢 interrogative style	69.3% success vs 46.4% for raw directives
🟢 descriptive-action	73.9% resolution
🔴 raw directives	46.4% resolution
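
sketch: detecting `@path` file refs in a prompt. the exact mention syntax is an assumption here: an `@` at the start of a whitespace-delimited token, followed by word characters, dots, slashes, or hyphens. `find_file_refs` is a hypothetical helper.

```python
import re

# assumption: a file ref is "@" at the start of a token, then path-like characters
FILE_REF = re.compile(r"(?<!\S)@[\w./-]+")


def find_file_refs(prompt: str) -> list[str]:
    """Return @path mentions, dropping a trailing sentence period if present."""
    return [m.rstrip(".") for m in FILE_REF.findall(prompt)]
```

email addresses such as `a@b.com` are skipped because the `@` is not at the start of a token.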

question density

threshold	outcome
🟢 <5%	76% resolution
🟡 5-15%	normal
🔴 >15%	excessive clarification

🔧 TOOL & PROCESS METRICS

task delegation

task count	resolution	status
🟡 1	~65%	underutilized
🟢 2-6	77-79%	OPTIMAL
🟡 7-10	~70%	diminishing returns
🔴 11+	58%	over-delegated

verification gates

signal	success rate
🟢 with verification	78.2%
🔴 without verification	61.3%

error handling

metric	value	interpretation
workaround rate	71.6%	agents suppress errors rather than fix them
error-free success	97.8%	errors = real work

⏰ TEMPORAL PATTERNS

time of day

window	resolution	status
🟢 2-5am	~60%	late night flow
🟢 6-9am	~60%	fresh morning
🟡 10am-5pm	~45%	workday avg
🔴 6-9pm	27.5%	WORST

collaboration intensity

msgs/hr	success	status
🟢 <50	84%	deliberate pace
🟡 50-200	~70%	active
🟡 200-500	~60%	intense
🔴 >500	55%	too rushed
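
sketch: estimating msgs/hr from message timestamps and mapping it to the bands above. assumptions: rate = message count / elapsed time between first and last message, and the shared boundaries (200, 500) fall into the lower band. `msgs_per_hour` and `pace_band` are hypothetical names.

```python
from datetime import datetime, timedelta


def msgs_per_hour(timestamps: list[datetime]) -> float:
    """Messages divided by elapsed hours between first and last message."""
    if len(timestamps) < 2:
        return 0.0
    span_hours = (max(timestamps) - min(timestamps)).total_seconds() / 3600
    if span_hours == 0:
        return float("inf")
    return len(timestamps) / span_hours


def pace_band(rate: float) -> str:
    """Map a msgs/hr rate to the status column above."""
    if rate < 50:
        return "deliberate pace"
    if rate <= 200:
        return "active"
    if rate <= 500:
        return "intense"
    return "too rushed"


if __name__ == "__main__":
    start = datetime(2026, 1, 1, 9, 0)
    stamps = [start + timedelta(minutes=3 * i) for i in range(11)]
    print(pace_band(msgs_per_hour(stamps)))
```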

day of week

day	delta
🟢 weekend	+5.2pp vs weekday

⚠️ EARLY WARNING SIGNALS

doom spiral indicators

signal	threshold	action
steering→steering	30% transition	PAUSE & REALIGN
2+ consecutive steers	any	deep misalignment
WTF rate	>10%	frustration brewing
oracle late-stage	-	rescue attempt
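
sketch: computing the steering→steering signal from a labeled message sequence. this reads "30% transition" as: of all steering messages that have a successor, the share followed immediately by another steering message. that reading and the label strings ("steering", "approval", "neutral") are assumptions.

```python
def steer_to_steer_rate(labels: list[str]) -> float:
    """Share of steering messages immediately followed by another steer."""
    successors = [b for a, b in zip(labels, labels[1:]) if a == "steering"]
    if not successors:
        return 0.0
    return sum(1 for b in successors if b == "steering") / len(successors)


def max_consecutive_steers(labels: list[str]) -> int:
    """Length of the longest unbroken run of steering messages."""
    longest = run = 0
    for label in labels:
        run = run + 1 if label == "steering" else 0
        longest = max(longest, run)
    return longest


def should_pause(labels: list[str]) -> bool:
    """Flag a thread per the table: >=30% transition or 2+ consecutive steers."""
    return steer_to_steer_rate(labels) >= 0.30 or max_consecutive_steers(labels) >= 2
```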

recovery stats

metric	rate
single steer recovery	87%
with ANY approval	94% persistence
without approval	49% persistence

🚫 FAILURE ARCHETYPES

pattern	description
PREMATURE_COMPLETION	declaring done too early
OVER_ENGINEERING	adding unrequested complexity
HACKING_AROUND	suppressing errors instead of fixing them
IGNORING_PATTERNS	not matching codebase style
NO_DELEGATION	doing everything inline
TEST_WEAKENING	modifying tests to pass
NOT_READING_DOCS	skipping documentation

📐 COMPLIANCE REALITY

instruction type	compliance
polite requests	12.7%
prohibitions (don’t/never)	20%

what works

pattern	resolution
🟢 descriptive-action	73.9%
🟡 echo-then-act	54.0%
🔴 raw-action	46.4%

🏆 SUCCESS FORMULA

SUCCESS = file_refs + interrogative_style + 300-1500_chars
        + 2-6_tasks + verification + <50_msgs_hr
        + approval:steering > 2:1 + 26-50_turns

golden thread profile

starts with @file reference
300-1500 char opening prompt
interrogative or descriptive-action style
2-6 delegated tasks
verification gates present
approval:steering ratio >2:1
resolves in 26-50 turns
<5% question density
<5% steering density
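
sketch: the golden thread profile as a checklist. a minimal example; `ThreadStats`, its field names, and `golden_profile_misses` are hypothetical, and each check simply restates one line of the profile above.

```python
from dataclasses import dataclass


@dataclass
class ThreadStats:
    has_file_ref: bool           # opening prompt contains an @file reference
    opening_prompt_chars: int
    style: str                   # e.g. "interrogative", "descriptive-action", "raw-action"
    task_count: int
    has_verification: bool
    approvals: int
    steers: int
    turns: int
    question_density_pct: float
    steering_density_pct: float


def golden_profile_misses(t: ThreadStats) -> list[str]:
    """Return the profile lines this thread does not satisfy."""
    checks = {
        "starts with @file reference": t.has_file_ref,
        "300-1500 char opening prompt": 300 <= t.opening_prompt_chars <= 1500,
        "interrogative or descriptive-action style":
            t.style in ("interrogative", "descriptive-action"),
        "2-6 delegated tasks": 2 <= t.task_count <= 6,
        "verification gates present": t.has_verification,
        # written as a product so zero steers does not divide by zero
        "approval:steering ratio >2:1": t.approvals > 2 * t.steers,
        "resolves in 26-50 turns": 26 <= t.turns <= 50,
        "<5% question density": t.question_density_pct < 5,
        "<5% steering density": t.steering_density_pct < 5,
    }
    return [name for name, ok in checks.items() if not ok]
```

an empty list means the thread matches every line of the profile; anything returned names the criteria to look at first.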

dashboard generated from 4,656 amp threads