recovery patterns: steering → resolution

analysis of 552 threads that received STEERING corrections but ended RESOLVED.

headline finding

62% of steered threads end RESOLVED. steering is not a death sentence; it is often a productive course correction.

outcome	count	%
RESOLVED	552	62.2%
UNKNOWN	166	18.7%
COMMITTED	81	9.1%
HANDOFF	72	8.1%
FRUSTRATED	14	1.6%
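
the listed rows sum to 885 and the percentages add up to 99.7%, so the denominator is presumably the full set of steered threads, including a few outcomes not shown. the sketch below is one way to check which totals are consistent with the reported percentages. `consistent_totals` is a hypothetical helper, not part of the original analysis, and the search window of 20 is an arbitrary assumption.

```python
# (count, reported percentage) pairs copied from the outcome table
OUTCOMES = {
    "RESOLVED": (552, 62.2),
    "UNKNOWN": (166, 18.7),
    "COMMITTED": (81, 9.1),
    "HANDOFF": (72, 8.1),
    "FRUSTRATED": (14, 1.6),
}


def consistent_totals(outcomes, lo, hi):
    """Return every total N in [lo, hi] that reproduces all reported percentages."""
    return [
        n
        for n in range(lo, hi + 1)
        if all(round(100 * count / n, 1) == pct for count, pct in outcomes.values())
    ]


# sum of the rows that actually appear in the table
listed = sum(count for count, _ in OUTCOMES.values())

# assumption: the true total is at most 20 threads above the listed rows
candidates = consistent_totals(OUTCOMES, listed, listed + 20)
print(listed, candidates)
```

under these assumptions only totals of 887 or 888 reproduce every percentage, which would leave 2 or 3 steered threads in outcomes the table does not list.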

what enables recovery

1. runway after correction

most recovered threads have significant runway AFTER the last steering event:

turns after last steering	threads
30+	311 (57%)
16-30	125 (23%)
6-15	91 (17%)
0-5	16 (3%)

insight: recovery usually comes with iteration time. ~80% of the recovered threads bucketed above (436 of 543) had 16+ turns after the last correction.
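
a minimal sketch of how the runway buckets could be computed, assuming each thread is available as an ordered list of per-turn labels in which steering turns are marked `STEERING`. the label format and the function names are assumptions for illustration. the table's `16-30` and `30+` buckets both mention 30; here 30 is assigned to `16-30`.

```python
from collections import Counter


def turns_after_last_steering(labels):
    """Number of turns after the last STEERING turn, or None if there is none."""
    for offset, label in enumerate(reversed(labels)):
        if label == "STEERING":
            return offset
    return None


def runway_bucket(turns):
    """Map a turn count onto the table's buckets (30 goes to 16-30 by assumption)."""
    if turns <= 5:
        return "0-5"
    if turns <= 15:
        return "6-15"
    if turns <= 30:
        return "16-30"
    return "30+"


def runway_histogram(threads):
    """Count threads per runway bucket, skipping threads with no STEERING turn."""
    runs = (turns_after_last_steering(thread) for thread in threads)
    return Counter(runway_bucket(r) for r in runs if r is not None)


# toy thread: one steering turn followed by 7 more turns
toy = ["OTHER", "STEERING"] + ["OTHER"] * 7
toy_hist = runway_histogram([toy])

# share of bucketed threads with 16+ turns, using the counts from the table
table = {"30+": 311, "16-30": 125, "6-15": 91, "0-5": 16}
share_16_plus = (table["30+"] + table["16-30"]) / sum(table.values())
print(toy_hist, round(100 * share_16_plus))
```

the last two lines reproduce the ~80% figure from the table counts: 436 of the 543 bucketed threads.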

2. steering → approval transition

temporal analysis of user message sequences in recovered threads:

transition	count
APPROVAL → APPROVAL	435
STEERING → APPROVAL	360
APPROVAL → STEERING	348
STEERING → STEERING	228

key pattern: the STEERING → APPROVAL transition occurs 360 times: the user corrects, the agent adjusts, the user confirms. the 1.6:1 ratio (360:228) of STEERING → APPROVAL to STEERING → STEERING suggests agents typically respond well to correction.
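
the transition counts can be thought of as bigram counts over each thread's sequence of labeled user messages. a minimal sketch, assuming each thread is a list of labels and that messages with any other label are dropped before pairing (the original analysis may have handled them differently). `count_transitions` and the `NEUTRAL` label are hypothetical.

```python
from collections import Counter


def count_transitions(threads, keep=("APPROVAL", "STEERING")):
    """Count consecutive (previous, next) label pairs within each thread."""
    counts = Counter()
    for labels in threads:
        # assumption: only APPROVAL and STEERING messages take part in transitions
        seq = [label for label in labels if label in keep]
        counts.update(zip(seq, seq[1:]))
    return counts


# toy thread; NEUTRAL is a made-up label that gets filtered out
toy = [["STEERING", "NEUTRAL", "APPROVAL", "APPROVAL", "STEERING", "STEERING"]]
toy_counts = count_transitions(toy)

# counts reported in the table above
reported = {
    ("APPROVAL", "APPROVAL"): 435,
    ("STEERING", "APPROVAL"): 360,
    ("APPROVAL", "STEERING"): 348,
    ("STEERING", "STEERING"): 228,
}
to_approval = reported[("STEERING", "APPROVAL")]
to_steering = reported[("STEERING", "STEERING")]
ratio = to_approval / to_steering
p_approval_after_steering = to_approval / (to_approval + to_steering)
print(round(ratio, 1), round(p_approval_after_steering, 2))
```

read as a conditional share, the same numbers say that about 61% of steering messages that are followed by another labeled message are followed by an approval.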

3. approval density in recovered threads

among recovered threads:

approval:steering ratio	threads
no approvals	178
balanced (1-2x)	156
high (2x+)	142
medium (0.5-1x)	49
low (< 0.5x)	27

178 threads (32%) recovered without any explicit approval, which suggests implicit progress: the agent simply fixed the issue without receiving an explicit “good job”.
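
a minimal sketch of the bucketing, assuming per-thread counts of approval and steering messages are available. the table does not say which bucket the boundary values 0.5x, 1x and 2x fall into, so the half-open ranges below are an assumption, and `ratio_bucket` is a hypothetical name.

```python
def ratio_bucket(approvals, steerings):
    """Bucket a thread by its approval:steering ratio.

    Every thread in this analysis was steered at least once, so steerings >= 1.
    Boundary handling (0.5x, 1x, 2x go to the higher bucket) is an assumption.
    """
    if steerings < 1:
        raise ValueError("expected a thread with at least one steering message")
    if approvals == 0:
        return "no approvals"
    ratio = approvals / steerings
    if ratio < 0.5:
        return "low (< 0.5x)"
    if ratio < 1:
        return "medium (0.5-1x)"
    if ratio < 2:
        return "balanced (1-2x)"
    return "high (2x+)"


# counts from the table above; the buckets cover all 552 recovered threads
table = {
    "no approvals": 178,
    "balanced (1-2x)": 156,
    "high (2x+)": 142,
    "medium (0.5-1x)": 49,
    "low (< 0.5x)": 27,
}
no_approval_share = table["no approvals"] / sum(table.values())
print(ratio_bucket(3, 2), round(100 * no_approval_share))
```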

4. steering type matters

in recovered threads:

steering type	count
other_correction	382
wait/pause	160
questioning	113
prohibition (don’t)	87
emphatic_no (no no no)	81
nope	38
wtf	32
stop	21

in frustrated threads (14 total, 24 steering msgs):

steering type	count
wtf	8 (33%)
other_correction	8 (33%)
emphatic_no	3

contrast: wtf makes up only 3.5% of steering messages in resolved threads (32 of 914) but 33% in frustrated threads (8 of 24). escalated emotional language correlates with non-recovery, though the frustrated sample is small.
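
the shares behind this contrast can be recomputed from the two tables. a minimal sketch, assuming the resolved denominator is the sum of the listed type rows (914, which reproduces the 3.5% figure) and the frustrated denominator is the stated 24 steering messages. the frustrated table lists only 19 of those 24, so types it does not show are left out.

```python
# steering type counts in recovered threads, from the first table
RESOLVED = {
    "other_correction": 382,
    "wait/pause": 160,
    "questioning": 113,
    "prohibition": 87,
    "emphatic_no": 81,
    "nope": 38,
    "wtf": 32,
    "stop": 21,
}

# steering type counts in frustrated threads; only three types are listed
FRUSTRATED = {"wtf": 8, "other_correction": 8, "emphatic_no": 3}
FRUSTRATED_TOTAL = 24  # stated total, larger than the 19 listed above


def shares(counts, total):
    """Percentage share of each type, rounded to one decimal place."""
    return {kind: round(100 * n / total, 1) for kind, n in counts.items()}


resolved_total = sum(RESOLVED.values())
resolved_shares = shares(RESOLVED, resolved_total)
frustrated_shares = shares(FRUSTRATED, FRUSTRATED_TOTAL)

# compare only the types that appear in both tables
for kind in FRUSTRATED:
    print(kind, resolved_shares[kind], frustrated_shares[kind])
```

on these numbers the wtf share differs by roughly an order of magnitude, while emphatic_no differs far less (8.9% vs 12.5%), so the contrast is driven mainly by wtf.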

recovery mechanics (from message samples)

common patterns in successful corrections:

specific redirection: “No no no, just use the keyVector directly” → gives concrete alternative
pause + clarify: “Wait, why only primary key?” → stops action, asks for explanation
debug methodology: “Nope. Debug it methodically. Printlns” → redirects the approach, not the goal
scope constraint: “No comparisons. The rest, do it” → removes part of scope, keeps core
reference grounding: “No, look at the existing code in X” → points to authoritative source

what distinguishes frustrated from recovered

factor	RESOLVED (n=552)	FRUSTRATED (n=14)
avg steering count	1.71	1.71
wtf rate	3.5%	33%
avg turns	higher	similar

average steering count is identical, but emotional intensity differs sharply.

implications

correction ≠ failure: majority of steered threads succeed
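runway matters: plan for iteration after correction; ~80% of recoveries had 16+ turns after the last correction
emotional escalation is a warning sign: wtf makes up 33% of steering in frustrated threads vs 3.5% in resolved ones, though only 14 threads ended FRUSTRATED
specific > general: in the sampled messages, successful corrections tend to give a concrete alternative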
the steering→approval cycle is healthy: normal productive pattern, not pathological