first message patterns

analysis of 4,281 threads with first user messages.

length vs outcome

length category	n	avg turns	avg steering	success rate
terse (<50 chars)	199	52.0	0.49	60.8%
brief (50-150)	612	47.5	0.42	62.6%
moderate (150-500)	1,303	39.6	0.24	54.7%
detailed (500-1500)	1,106	37.6	0.21	42.8%
extensive (1500+)	1,061	71.8	0.55	64.6%

observations

U-shaped success curve: terse, brief, and extensive messages outperform the middle of the range. success bottoms out at detailed (42.8%), not moderate (54.7%).

brief messages (62.6% success): likely simple tasks. “fix this typo” needs no elaboration.
detailed messages (42.8% success, LOWEST): possibly over-specified but under-contextualized. they also draw the least steering (0.21), so these threads may be failing without the user course-correcting.
extensive messages (64.6% success): front-loaded context pays off despite longer threads (71.8 avg turns).

steering peaks at the length extremes: terse (0.49) and extensive (0.55) messages see at least twice the steering of moderate (0.24) and detailed (0.21) ones. likely reasons: terse messages are underspecified, extensive ones describe complex tasks.
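a minimal sketch of the bucketing behind the length table. the thresholds come from the table; which bucket the boundary values (50, 150, 500, 1500) fall in is an assumption here (lower bound inclusive), and the function name is hypothetical.

```python
def length_category(first_message: str) -> str:
    """Bucket a first message by character count.

    Thresholds follow the length table. Assumption: each bucket includes
    its lower bound and excludes its upper bound.
    """
    n = len(first_message)
    if n < 50:
        return "terse"
    if n < 150:
        return "brief"
    if n < 500:
        return "moderate"
    if n < 1500:
        return "detailed"
    return "extensive"


print(length_category("fix this typo"))  # terse
```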

specificity markers

marker	n	avg turns	avg steering	success rate
with file mentions (@)	2,349	56.6	0.39	66.7%
no file mentions	1,932	39.2	0.29	41.8%
continuations	1,239	62.9	0.47	57.2%
fresh starts	3,042	43.0	0.29	54.8%

key finding: FILE REFERENCES = +25 PERCENTAGE POINTS SUCCESS

threads starting with file references (@path/to/file.ts) have 66.7% success vs 41.8% without. this 24.9-point gap is the largest among the first-message features examined here.
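the gap can be stated two ways, and the difference matters when quoting it. the sketch below works the arithmetic from the table values; the function names are hypothetical.

```python
def percentage_point_gap(rate_a: float, rate_b: float) -> float:
    """Absolute difference between two rates, in percentage points."""
    return (rate_a - rate_b) * 100


def relative_lift(rate_a: float, rate_b: float) -> float:
    """How much larger rate_a is than rate_b, as a percent of rate_b."""
    return (rate_a / rate_b - 1) * 100


def pooled_rate(groups: list[tuple[int, float]]) -> float:
    """Thread-weighted success rate across (n, rate) groups."""
    total = sum(n for n, _ in groups)
    return sum(n * rate for n, rate in groups) / total


# (n, success rate) rows from the specificity table
with_files = (2349, 0.667)
without_files = (1932, 0.418)

print(round(percentage_point_gap(with_files[1], without_files[1]), 1))  # 24.9
print(round(relative_lift(with_files[1], without_files[1]), 1))         # 59.6
print(round(pooled_rate([with_files, without_files]) * 100, 1))         # 55.5
```

so the gap is about 25 percentage points, or roughly a 60% relative lift, against a pooled success rate of about 55.5% across all 4,281 threads.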

code blocks don’t help much (52.8% vs 56.5%) — possibly because pasting code without file context is less actionable than referencing files directly.
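a rough sketch of how the two markers might be detected in a first message. the report does not say how mentions or code blocks were identified, so the regex (an @ followed by a path ending in a file extension) and both function names are assumptions.

```python
import re

# Assumption: a file mention is "@" followed by a path with an extension,
# e.g. @path/to/file.ts. The lookbehind skips email-style "user@host.com".
FILE_MENTION = re.compile(r"(?<!\w)@[\w./-]+\.\w+")

# Built by repetition so this example does not contain a literal fence.
FENCE = "`" * 3


def has_file_mention(first_message: str) -> bool:
    """True if the message contains at least one @path/with.ext mention."""
    return FILE_MENTION.search(first_message) is not None


def has_code_block(first_message: str) -> bool:
    """True if the message contains an opening and closing code fence."""
    return first_message.count(FENCE) >= 2


print(has_file_mention("fix the redirect in @src/app/main.ts"))  # True
print(has_file_mention("ping @alice about this"))                # False
```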

opening style

style	n	avg turns	avg steering	success rate
question (“how”, “what”, “why”)	169	32.8	0.27	62.1%
imperative (“fix”, “add”, “create”)	912	37.3	0.15	58.9%
continuation	1,502	53.8	0.40	49.3%
declarative (“i want”, “i need”)	54	53.4	0.52	53.7%
other	1,644	52.1	0.41	58.6%

observations

questions have highest success (62.1%) and low steering (0.27). exploratory threads may have clearer success criteria (“did it answer the question?”).
imperatives have lowest steering (0.15) — direct commands leave less room for misinterpretation.
continuations underperform (49.3%) despite explicit context passing. possible factors: inherited complexity, context loss between threads, tasks that were already struggling.
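a minimal sketch of an opening-style classifier. the question, imperative, and declarative keywords are the examples listed in the table (the real lists are presumably longer). how continuations were detected is not stated, so CONTINUATION_MARKERS is purely hypothetical, as is the function name.

```python
QUESTION_WORDS = ("how", "what", "why")          # from the table
IMPERATIVE_VERBS = ("fix", "add", "create")      # from the table
DECLARATIVE_PREFIXES = ("i want", "i need")      # from the table
# Hypothetical: the report does not say how continuations are marked.
CONTINUATION_MARKERS = ("continuing", "continued from")


def opening_style(first_message: str) -> str:
    """Classify a first message by how it opens."""
    text = first_message.strip().lower()
    if not text:
        return "other"
    if text.startswith(CONTINUATION_MARKERS):
        return "continuation"
    if text.startswith(DECLARATIVE_PREFIXES):
        return "declarative"
    # Compare only the first word, ignoring trailing punctuation.
    first_word = text.split(maxsplit=1)[0].strip("?,.:!")
    if first_word in QUESTION_WORDS:
        return "question"
    if first_word in IMPERATIVE_VERBS:
        return "imperative"
    return "other"


print(opening_style("Fix the login redirect"))      # imperative
print(opening_style("How does the cache expire?"))  # question
```

with keyword lists this short most messages fall into "other", which is consistent with "other" being the largest row in the table (1,644 threads).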

per-user patterns

user	threads	avg first msg len	avg turns	avg steering	success rate
@concise_commander	1,218	1,274	86.6	0.81	71.8%
@steady_navigator	1,171	1,255	36.5	0.10	67.0%
@verbose_explorer	875	1,519	39.1	0.28	43.2%
@precision_pilot	90	4,280	72.9	0.41	82.2%
@swift_solver	36	1,447	45.5	0.69	88.9%
@patient_pathfinder	150	608	20.3	0.20	54.0%
@feature_lead	146	1,246	20.7	0.08	26.0%

archetypes

@precision_pilot: marathon front-loader. avg 4,280 char first messages → 82.2% success over 90 threads. suggests extensive context works if you commit.

@steady_navigator: efficient imperative user. first messages average 1,255 chars (the detailed bucket), minimal steering (0.10), 67.0% success. threads end quickly (36.5 turns).

@concise_commander: high-steering marathoner. long threads (86.6 turns), high steering (0.81), but still 71.8% success. steers toward goal rather than abandoning.

@verbose_explorer: context front-loader. 932 char avg messages with extensive spawn orchestration. resolution rate corrected to 83% after fixing spawn misclassification (was 43.2%; the table above still shows the uncorrected figure). handoff rate only 4.2%.

@feature_lead: lowest success (26.0%) despite low steering. short threads (20.7 turns) suggest early abandonment rather than resolution.

recommendations

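include file references: threads with @mentions succeed about 25 percentage points more often (66.7% vs 41.8%)
brief OR extensive, not in between: detailed messages (500-1500 chars) do worst at 42.8%. if the task is complex enough to explain, explain it fully
imperative > declarative: “fix X” outperforms “i want X fixed” (58.9% vs 53.7%, with only 54 declarative threads)
questions are underrated: highest success of any opening style (62.1%), possibly because exploratory threads have clearer success criteria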

caveats

completion_status heuristics may misclassify some threads
success = RESOLVED + COMMITTED, which conflates “answered” with “deployed”
user sample sizes vary significantly (36 vs 1,218 threads)
“extensive” messages may include automated context injection, inflating length
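to put the sample-size caveat in numbers, here is a sketch that attaches a 95% Wilson score interval to a per-user success rate. the report does not use this method; it is swapped in here only as one common way to show how much less a 36-thread rate pins down than a 1,218-thread one. inputs are the rounded rates from the per-user table, and the function name is hypothetical.

```python
import math


def wilson_interval(success_rate: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z=1.96 gives about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    z2 = z * z
    denom = 1 + z2 / n
    center = (success_rate + z2 / (2 * n)) / denom
    half_width = (
        z * math.sqrt(success_rate * (1 - success_rate) / n + z2 / (4 * n * n)) / denom
    )
    return center - half_width, center + half_width


# (threads, success rate) from the per-user table
for user, n, rate in [("@swift_solver", 36, 0.889), ("@concise_commander", 1218, 0.718)]:
    low, high = wilson_interval(rate, n)
    print(f"{user}: {rate:.1%} (95% interval {low:.1%} to {high:.1%}, n={n})")
```

under these assumptions the 36-thread interval spans roughly 75% to 96%, while the 1,218-thread interval spans roughly 69% to 74%, so the ranking of those two users by success rate is not well established.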