pattern moderate impact

complexity estimation

@agent_comp

complexity estimation from opener characteristics

analysis of 4,281 threads to predict thread complexity (length, steering) from first message features.

key finding: complexity is predictable from openers

opener characteristics are associated with large differences in thread outcomes: single features shift average thread length by as much as 44.5 turns and steering rate by as much as 74%.

strongest complexity predictors

| feature | avg turns WITH | avg turns WITHOUT | delta | signal direction |
| --- | --- | --- | --- | --- |
| is_collaborative (“we”, “let’s”) | 91.9 | 47.4 | +44.5 | long threads |
| is_directive (“you”, “your”) | 69.1 | 48.4 | +20.7 | long threads |
| has_url | 35.1 | 50.8 | -15.7 | short threads |
| is_polite (“please”) | 36.4 | 51.1 | -14.7 | short threads |
| has_code_block | 61.7 | 47.7 | +14.1 | long threads |
| has_file_ref | 56.7 | 39.2 | +17.4 | long threads |
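
the features in this table can be approximated with simple pattern checks. the sketch below is one possible way to compute them, not the method used in the analysis. only the cue words for is_collaborative, is_directive, and is_polite come from the table; the patterns for has_url, has_code_block, and has_file_ref, and the function name `opener_features`, are assumptions, since the analysis does not say how those features were detected.

```python
import re

# cue words for the first three features come from the table;
# every other pattern is an assumption made for illustration
COLLABORATIVE = re.compile(r"\b(?:we|let's)\b", re.IGNORECASE)
DIRECTIVE = re.compile(r"\b(?:you|your)\b", re.IGNORECASE)
POLITE = re.compile(r"\bplease\b", re.IGNORECASE)
URL = re.compile(r"https?://\S+")
CODE_FENCE = "`" * 3  # assumed: a code block means a markdown fence
FILE_REF = re.compile(  # assumed: a path-like token with a known extension
    r"\b[\w./-]*\w\.(?:py|js|ts|md|json|ya?ml|toml|rs|go|java|txt)\b"
)


def opener_features(opener: str) -> dict:
    text = opener.replace("\u2019", "'")  # normalize curly apostrophes
    without_urls = URL.sub(" ", text)     # keep URLs from counting as file refs
    return {
        "is_collaborative": bool(COLLABORATIVE.search(text)),
        "is_directive": bool(DIRECTIVE.search(text)),
        "is_polite": bool(POLITE.search(text)),
        "has_url": bool(URL.search(text)),
        "has_code_block": CODE_FENCE in text,
        "has_file_ref": bool(FILE_REF.search(without_urls)),
    }
```

word-boundary matching means “we’re” counts as collaborative and “your” counts as directive, which matches the cue words listed in the table.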

interpretation

first word as complexity signal

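| first word | count | avg turns | avg steering rate |
| --- | --- | --- | --- |
| we’re | 24 | 133.7 | 0.0135 |
| your | 20 | 129.3 | 0.0178 |
| let’s | 45 | 114.4 | 0.0175 |
| summarize | 41 | 83.2 | 0.0124 |
| implement | 35 | 74.1 | 0.0064 |
| continuing | 1,502 | 53.8 | 0.0100 |
| please | 667 | 36.4 | 0.0049 |
| migrate | 33 | 17.1 | n/a |
| using | 34 | 17.1 | n/a |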
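
a table like this can be built by grouping threads on the first word of the opener. the sketch below shows one way to do that. the `Thread` record, its field names, and the normalization rules (lowercasing, stripping trailing punctuation) are assumptions for illustration; the analysis does not describe its data schema or how steering rate is computed per thread.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Thread:  # hypothetical record; field names are assumptions
    opener: str
    turns: int
    steering_rate: float


def first_word(opener: str) -> str:
    words = opener.split()
    if not words:
        return ""
    # lowercase, normalize curly apostrophes, drop trailing punctuation
    return words[0].lower().replace("\u2019", "'").rstrip(",.:;!?")


def first_word_table(threads):
    """rows of (first word, count, avg turns, avg steering rate), longest first."""
    groups = defaultdict(list)
    for t in threads:
        groups[first_word(t.opener)].append(t)
    rows = [
        (word, len(ts), mean(t.turns for t in ts), mean(t.steering_rate for t in ts))
        for word, ts in groups.items()
    ]
    return sorted(rows, key=lambda r: r[2], reverse=True)
```

rows with small counts (the top three first words each cover 45 threads or fewer) are more sensitive to outliers than the “continuing” and “please” rows.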

complexity tiers by first word

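marathon signals (100+ avg turns): we’re (133.7), your (129.3), let’s (114.4)

medium signals (50-100 avg turns): summarize (83.2), implement (74.1), continuing (53.8)

quick signals (<40 avg turns): please (36.4), migrate (17.1), using (17.1)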
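
the tier boundaries can be applied mechanically to the average turns in the first-word table. the sketch below does that. the stated ranges leave 40-50 avg turns uncovered, so the `"unassigned"` label and the choice to treat exactly 100 as marathon are assumptions, as is the function name `tier`.

```python
# average turns per first word, taken from the first-word table
AVG_TURNS = {
    "we're": 133.7, "your": 129.3, "let's": 114.4,
    "summarize": 83.2, "implement": 74.1, "continuing": 53.8,
    "please": 36.4, "migrate": 17.1, "using": 17.1,
}


def tier(avg_turns: float) -> str:
    if avg_turns >= 100:        # assumption: "100+" includes exactly 100
        return "marathon"
    if avg_turns >= 50:
        return "medium"
    if avg_turns < 40:
        return "quick"
    return "unassigned"         # assumption: 40-50 is not covered by the ranges


# group first words by tier, keeping table order within each tier
tiers = {}
for word, turns in AVG_TURNS.items():
    tiers.setdefault(tier(turns), []).append(word)
```

none of the nine first words in the table falls into the uncovered 40-50 range, so every word lands in one of the three named tiers.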

opener length vs complexity

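| length bucket | count | avg turns | avg steering |
| --- | --- | --- | --- |
| tiny (<100 chars) | 504 | 49.9 | 0.0119 |
| short (100-300) | 925 | 44.5 | 0.0112 |
| medium (300-600) | 767 | 36.8 | 0.0058 |
| long (600-1500) | 956 | 35.6 | 0.0061 |
| verbose (1500+) | 1,129 | 71.0 | 0.0140 |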

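sweet spot: 300-1500 chars (the medium and long buckets), which have the fewest avg turns (35.6-36.8) and the lowest steering rates (0.0058-0.0061)

u-shaped curve: tiny openers (49.9 avg turns) and verbose openers (71.0) both run longer than mid-length ones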
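
the length buckets can be expressed as a lookup over the opener's character count. the sketch below is a minimal version. the table does not say which bucket a length exactly on a boundary belongs to, so placing it in the higher bucket is an assumption, as are the names `length_bucket` and `in_sweet_spot`.

```python
import bisect

# bucket boundaries in characters, from the length table
BOUNDS = [100, 300, 600, 1500]
NAMES = ["tiny", "short", "medium", "long", "verbose"]
AVG_TURNS = {"tiny": 49.9, "short": 44.5, "medium": 36.8, "long": 35.6, "verbose": 71.0}


def length_bucket(n_chars: int) -> str:
    # bisect_right puts a length equal to a boundary into the higher bucket
    # (assumption: the table does not state which side boundaries fall on)
    return NAMES[bisect.bisect_right(BOUNDS, n_chars)]


def in_sweet_spot(n_chars: int) -> bool:
    # the sweet spot covers the medium and long buckets (300-1500 chars)
    return length_bucket(n_chars) in ("medium", "long")
```

for example, `AVG_TURNS[length_bucket(len(opener))]` gives the average turns observed for openers of similar length.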

feature prevalence by complexity bucket

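| feature | tiny (1-10) | small (11-25) | medium (26-50) | large (51-100) | marathon (100+) |
| --- | --- | --- | --- | --- | --- |
| has_file_ref | 35.6% | 53.5% | 65.5% | 70.2% | 64.3% |
| has_continuing | 33.4% | 24.8% | 30.2% | 45.5% | 44.2% |
| is_polite | 15.1% | 19.0% | 22.8% | 14.0% | 6.4% |
| is_collaborative | 1.5% | 2.3% | 2.4% | 5.1% | 6.1% |
| mentions_test | 43.6% | 42.9% | 54.3% | 63.4% | 64.0% |
| has_list | 39.4% | 42.0% | 45.1% | 55.0% | 52.0% |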
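
each cell in this table reads as the share of threads in a turn bucket whose opener has the feature. the sketch below shows how such a share could be computed. the input shape (pairs of turn count and feature dict) and the function names are assumptions, and because the bucket labels overlap at 100 turns, counting exactly 100 as large is also an assumption.

```python
def turn_bucket(turns: int) -> str:
    if turns <= 10:
        return "tiny"
    if turns <= 25:
        return "small"
    if turns <= 50:
        return "medium"
    if turns <= 100:   # assumption: exactly 100 turns counts as large
        return "large"
    return "marathon"


def prevalence(threads, feature):
    """percent of threads in each turn bucket whose opener has `feature`."""
    totals, hits = {}, {}
    for turns, features in threads:
        bucket = turn_bucket(turns)
        totals[bucket] = totals.get(bucket, 0) + 1
        hits[bucket] = hits.get(bucket, 0) + bool(features.get(feature))
    return {b: round(100 * hits[b] / totals[b], 1) for b in totals}
```

buckets with no threads are omitted from the result instead of being reported as 0%.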

patterns

steering predictors

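| feature | steering WITH | steering WITHOUT | delta |
| --- | --- | --- | --- |
| is_collaborative | 0.0169 | 0.0097 | +74% |
| is_polite | 0.0049 | 0.0108 | -55% |
| is_directive | 0.0063 | 0.0100 | -37% |
| has_file_ref | 0.0116 | 0.0078 | +49% |
| is_question | 0.0137 | 0.0097 | +41% |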
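
the delta column here is a relative change, (WITH - WITHOUT) / WITHOUT, unlike the turn-count table, which reports absolute differences. the sketch below recomputes the percentages from the published rates as a consistency check; the function name `relative_delta` is an assumption.

```python
def relative_delta(with_rate: float, without_rate: float) -> int:
    """percent change of the WITH rate relative to the WITHOUT rate."""
    return round(100 * (with_rate - without_rate) / without_rate)


# (steering WITH, steering WITHOUT) from the steering predictors table
rows = {
    "is_collaborative": (0.0169, 0.0097),
    "is_polite": (0.0049, 0.0108),
    "is_directive": (0.0063, 0.0100),
    "has_file_ref": (0.0116, 0.0078),
    "is_question": (0.0137, 0.0097),
}

deltas = {name: relative_delta(w, wo) for name, (w, wo) in rows.items()}
```

the recomputed values agree with the table's delta column for all five features.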

insights

practical complexity estimation heuristic

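def estimate_complexity(first_word, length, has_file_ref=False,
                        is_collaborative=False, is_polite=False):
    """return (tier label, turn adjustment) using the tables above."""
    word = first_word.lower().replace("’", "'")
    if word in ("we're", "your", "let's"):
        expect = "marathon (100+ turns)"
    elif word == "please":
        expect = "quick (30-40 turns)"
    elif word == "continuing":
        expect = "medium-long (50-60 turns)"
    elif word in ("migrate", "using"):
        expect = "very quick (<20 turns)"
    else:
        expect = "unknown"  # first word not covered by the table

    # turn adjustments accumulate as a signed number
    adjust = 0
    if length > 1500:
        adjust += 15  # verbose penalty
    elif 300 <= length <= 1500:
        adjust -= 10  # sweet spot
    if has_file_ref:
        adjust += 17
    if is_collaborative:
        adjust += 44
    if is_polite:
        adjust -= 15
    return expect, adjust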

recommendations for prompt design

  1. want quick resolution? start with “please”, keep under 600 chars
  2. expect iteration? use collaborative language (“let’s”, “we”) and budget for marathon
  3. spawning agents? “your” framing predicts long threads (129 avg) - scope carefully
  4. sweet spot for context: 300-1500 chars, include file refs, structured lists

data quality notes