persistence vs abandonment analysis

what distinguishes threads that persist through difficulty vs those that abandon?

headline findings

the strongest predictor of persistence is approval frequency, not steering avoidance.

approval pattern	threads	persist rate	avg turns
many (6+)	59	96.6%	231
moderate (3-5)	289	95.5%	128
few (1-2)	1,103	93.7%	69
none	3,205	49.4%	26

threads with ANY approval signal persist ~94% of the time. threads with zero approvals—the user never said “ok”, “yes”, “proceed”, “good”—persist only 49%.
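
the ~94% figure is a thread-weighted average over the three buckets that have at least one approval. a minimal sketch of that arithmetic, using only the counts and rates from the table above (the variable names are illustrative, not from the original analysis):

```python
# (threads, persist rate %) for each bucket with at least one approval,
# copied from the approval-pattern table above.
buckets = {
    "many (6+)": (59, 96.6),
    "moderate (3-5)": (289, 95.5),
    "few (1-2)": (1103, 93.7),
}

total_threads = sum(n for n, _ in buckets.values())

# Weight each bucket's persist rate by its thread count.
pooled_rate = sum(n * rate for n, rate in buckets.values()) / total_threads

print(f"{total_threads} threads with any approval, pooled persist rate {pooled_rate:.1f}%")
```

this pools 1,451 threads at roughly 94.2%, against 49.4% for the 3,205 threads with no approval.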

the recovery ratio

when threads DO have steering (corrections), the ratio of approvals to steers predicts outcome:

recovery pattern	threads	persist rate	description
strong_recovery	224	94.6%	approvals ≥2x steers
recovered	243	84.4%	approvals ≥ steers
partial_recovery	111	78.4%	some approval, less than steers
no_recovery	310	64.8%	steered but no approval after
no_steering	3,768	59.6%	never steered

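key insight: steering with recovery (approval follows correction) has HIGHER persistence than never steering at all (78.4-94.6% vs 59.6%). even steered threads with no recovery persist slightly more often (64.8%). the correction itself isn’t the problem; what separates outcomes is how much approval follows it.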
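
the recovery buckets in the table can be sketched as a small classifier. this is an illustration under assumptions, not the original labeling code: `recovery_pattern` is a hypothetical name, and `approvals` is assumed to count only approvals that come after the first steer, since the table describes no_recovery as "steered but no approval after".

```python
def recovery_pattern(steers: int, approvals: int) -> str:
    """Map per-thread counts to the recovery buckets described in the table.

    Assumption: `approvals` counts approvals that occur after the first steer.
    """
    if steers == 0:
        return "no_steering"          # never steered
    if approvals == 0:
        return "no_recovery"          # steered but no approval after
    if approvals >= 2 * steers:
        return "strong_recovery"      # approvals >= 2x steers
    if approvals >= steers:
        return "recovered"            # approvals >= steers
    return "partial_recovery"         # some approval, less than steers


for steers, approvals in [(0, 5), (2, 0), (2, 4), (2, 3), (3, 1)]:
    print(steers, approvals, recovery_pattern(steers, approvals))
```

the checks run from the most specific bucket to the least, so a thread with approvals ≥ 2x steers is never labeled merely "recovered".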

length as persistence signal

longer threads persist more, but causation is tricky—maybe they’re long BECAUSE they persisted.

length	persisted	abandoned	unclear	persist rate
60+ turns	1,130	14	106	90.4%
31-60	619	4	93	86.5%
16-30	484	1	151	76.1%
6-15	533	2	513	50.9%
1-5	183	2	821	18.2%

the shortest threads (1-5 turns) are mostly UNCLEAR outcome (821 of 1,006), and the 6-15 bucket is about half UNCLEAR (513 of 1,048). these are likely exploratory questions where persistence isn’t the right frame.
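
the persist rates in the length table are consistent with persisted / (persisted + abandoned + unclear). a short sketch that recomputes them from the counts, and also shows how much of each bucket is UNCLEAR (the formula is inferred from the table, and the names are illustrative):

```python
# (persisted, abandoned, unclear) per length bucket, from the table above.
length_table = {
    "60+": (1130, 14, 106),
    "31-60": (619, 4, 93),
    "16-30": (484, 1, 151),
    "6-15": (533, 2, 513),
    "1-5": (183, 2, 821),
}


def rates(persisted: int, abandoned: int, unclear: int) -> tuple:
    """Return (persist %, unclear %) over all threads in the bucket."""
    total = persisted + abandoned + unclear
    return 100 * persisted / total, 100 * unclear / total


summary = {bucket: rates(*counts) for bucket, counts in length_table.items()}

for bucket, (persist_pct, unclear_pct) in summary.items():
    print(f"{bucket:>6} turns: persist {persist_pct:.1f}%, unclear {unclear_pct:.1f}%")
```

because UNCLEAR sits in the denominator, the low persist rate for short threads mostly reflects unclassified outcomes rather than abandonment.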

steering timing matters

when does first steering occur? outcomes differ:

first steer timing	RESOLVED	COMMITTED	HANDOFF	FRUSTRATED
early (1-5 turns)	76	11	6	0
mid (6-15 turns)	82	13	11	0
late (16-30 turns)	100	19	17	0
very late (30+)	285	34	35	11

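frustration clusters in very late steering: all 11 FRUSTRATED threads in this table had their first steer after turn 30, though that is still only 11 of the 365 very-late threads shown (about 3%). early steering doesn’t predict abandonment; it’s a course-correction that often leads to resolution.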
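
to put the FRUSTRATED counts in proportion, the share per timing bucket can be computed from the table. a minimal sketch; the shares are relative to the four outcomes shown, since the table has no column for other outcomes:

```python
# Outcome counts by timing of the first steer, from the table above.
timing_table = {
    "early (1-5 turns)": {"RESOLVED": 76, "COMMITTED": 11, "HANDOFF": 6, "FRUSTRATED": 0},
    "mid (6-15 turns)": {"RESOLVED": 82, "COMMITTED": 13, "HANDOFF": 11, "FRUSTRATED": 0},
    "late (16-30 turns)": {"RESOLVED": 100, "COMMITTED": 19, "HANDOFF": 17, "FRUSTRATED": 0},
    "very late (30+)": {"RESOLVED": 285, "COMMITTED": 34, "HANDOFF": 35, "FRUSTRATED": 11},
}

# FRUSTRATED as a percentage of the outcomes listed in each row.
frustrated_share = {
    timing: 100 * row["FRUSTRATED"] / sum(row.values())
    for timing, row in timing_table.items()
}

for timing, share in frustrated_share.items():
    print(f"{timing}: {share:.1f}% frustrated")
```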

user traits and persistence

user	threads	persist rate	avg turns	steers %	marathon %
@swift_solver	36	97.2%	46	44%	36%
@precision_pilot	90	87.8%	73	30%	63%
@concise_commander	1,219	85.3%	87	44%	69%
@verbose_explorer	875	—	39	17%	21%
@steady_navigator	1,171	68.7%	37	9%	23%
@patient_pathfinder	150	54.7%	20	16%	6%

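high-persistence users (@swift_solver, @concise_commander, @precision_pilot) share traits:

high marathon rate (60+ turn threads): willingness to push through (63-69% for @precision_pilot and @concise_commander; @swift_solver is lower at 36%, on only 36 threads)
higher steering rate: more active correction = more engagement (30-44%, vs 9-16% for the shorter-thread users below)
longer avg threads: don’t quit early (73-87 turns for @precision_pilot and @concise_commander, 46 for @swift_solver)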

shorter-thread users (@steady_navigator, @patient_pathfinder):

shorter threads on average
lower steering engagement
possible explanation: different task types, delegation preferences, or lower tolerance for agent mistakes

NOTE: @verbose_explorer was previously listed here but that classification was based on corrupted spawn data. with corrected stats (83% resolution, 4.2% handoff), @verbose_explorer’s persistence profile is unclear and needs reanalysis.

engagement patterns by length

length	engagement type	RESOLVED	COMMITTED	UNKNOWN
long (30+)	both steer+approve	363	71	58
long (30+)	approve only	420	92	—
long (30+)	steer only	147	—	60
long (30+)	no engagement	451	13	81
short (<10)	no engagement	149	12	1,013

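in long threads: threads with approvals commit most often. approve-only has the highest committed count (92, against 420 RESOLVED), with steer+approve close behind (71 against 363). passive long threads (no signals) still resolve (451) but rarely commit (13), maybe because the user isn’t confirming work is done.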

in short threads: no-engagement is overwhelmingly UNKNOWN. short threads without user feedback simply don’t have enough signal to classify.

marathon thread (60+) outcomes

outcome	count	avg steers	avg approvals	approve/steer ratio
RESOLVED	889	0.88	1.67	1.91
COMMITTED	103	0.94	2.90	3.08
HANDOFF	155	0.59	1.26	2.12
FRUSTRATED	9	2.11	1.11	0.53
UNKNOWN	108	1.71	0.81	0.48

frustrated marathon threads have about 2.4x the steering rate of resolved ones (2.11 vs 0.88) and roughly a quarter of the approve/steer ratio (0.53 vs 1.91). the pattern: repeated correction without acknowledgment of progress.

the frustrated 14

examining threads that ended in FRUSTRATED state (six are listed here):

thread	user	turns	steers	approvals	title snippet
T-019b2dd2…	@verbose_explorer	160	1	1	scoped context isolation vs oracle
T-fa176ce5…	@concise_commander	133	2	0	debug TestService registration error
T-05aa706d…	@steady_navigator	127	3	1	resolve deploy_cli module import error
T-019b03ba…	@concise_commander	124	2	2	fix this
T-019b9a94…	@precision_pilot	113	1	0	fix concurrent append race conditions
T-ab2f1833…	@concise_commander	109	4	3	storage_optimizer trim race condition

pattern: LONG threads (80-160 turns) on DIFFICULT debugging tasks. frustration comes at the end of marathon sessions on stubborn bugs, not from initial task misalignment.

persistence predictors (ranked)

approval frequency — ANY approval signal predicts ~94% persistence
recovery ratio — approval/steer ratio >1.0 predicts success after correction
thread length — longer threads persist more (selection bias: they’re long because they persisted)
user marathon rate — users who regularly run 60+ turn threads persist more
steering WITH recovery — steering followed by approval = healthy engagement

anti-patterns

steering without recovery: correction with no subsequent approval (64.8% persist vs 94.6% with strong recovery)
no approval: zero approvals, with or without steering (49.4% persist)
late frustration: first steering at 30+ turns correlates with FRUSTRATED outcome
low approve/steer ratio in marathons: a ratio around 0.5 or below in 60+ turn threads signals trouble (FRUSTRATED 0.53, UNKNOWN 0.48)

recommendations

prompt for explicit approval checkpoints — don’t assume silence is consent
track approval/steer ratio — if ratio falls below 1.0, consider user friction intervention
watch marathon threads — threads >100 turns with no recent approval are at risk
early steering is GOOD — don’t treat corrections as failures; they predict engagement
user-specific thresholds — @concise_commander persists through heavy steering; others may need lighter touch
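
the first three recommendations could be combined into a simple per-thread check. this is only a sketch under assumptions: `ThreadStats`, `risk_flags`, and the flag names are hypothetical, and the 20-turn window for "recent approval" is a placeholder, since the analysis does not define "recent".

```python
from dataclasses import dataclass
from typing import List, Optional

RATIO_FLOOR = 1.0      # approve/steer ratio below this suggests friction
MARATHON_TURNS = 100   # "threads >100 turns" from the recommendations
RECENT_WINDOW = 20     # hypothetical: turns that still count as "recent"


@dataclass
class ThreadStats:
    turns: int
    steers: int
    approvals: int
    turns_since_last_approval: Optional[int]  # None if never approved


def risk_flags(t: ThreadStats) -> List[str]:
    """Return the risk signals that apply to one thread."""
    flags = []
    if t.approvals == 0:
        flags.append("no_approval")  # silence is not treated as consent
    if t.steers > 0 and t.approvals / t.steers < RATIO_FLOOR:
        flags.append("low_approve_steer_ratio")
    stale = (
        t.turns_since_last_approval is None
        or t.turns_since_last_approval > RECENT_WINDOW
    )
    if t.turns > MARATHON_TURNS and stale:
        flags.append("stale_marathon")
    return flags


# Counts taken from one row of the frustrated-threads table (133 turns, 2 steers, 0 approvals).
print(risk_flags(ThreadStats(turns=133, steers=2, approvals=0, turns_since_last_approval=None)))
```

the thresholds are module-level constants so that they can be tuned per user, in line with the last recommendation.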