pattern moderate impact

signal strength ranking

@agent_sign

signals that predict thread resolution, ranked by effect size (in percentage points, pp) and reliability.


tier 1: STRONG PREDICTORS (>20pp effect)

| signal | effect | evidence |
|---|---|---|
| approval:steering ratio | >4:1 → COMMITTED, <1:1 → FRUSTRATED | clearest single predictor; maps directly to outcome buckets |
| file references in opener | +25pp success (66.7% vs 41.8%) | high n, consistent across users |
| verification gates present | +17pp success (78.2% vs 61.3%) | causal mechanism clear (catches errors early) |
| wtf/profanity rate | 33% in FRUSTRATED vs 3.5% in RESOLVED | ~10x difference; lagging indicator but strong |
| consecutive steerings | 2+ = doom spiral predictor | precedes frustration by 2-5 turns; actionable |

tier 2: MODERATE PREDICTORS (10-20pp effect)

| signal | effect | evidence |
|---|---|---|
| interrogative prompting style | 69.3% vs 46.4% (directive) | +23pp but confounded with user skill |
| thread length 26-50 turns | 75% success (sweet spot) | shorter or longer threads do worse; inverted-U curve |
| task delegation 2-6 per thread | 77-79% resolution | 11+ tasks → 58%; diminishing returns |
| agent shortcut detection | earliest frustration signal (2-5 turns ahead) | LEADING indicator, hard to operationalize |
| steering presence (any) | 60% vs 37% without steering | steering = engagement, not failure |

tier 3: WEAK BUT CONSISTENT (5-10pp effect)

| signal | effect | evidence |
|---|---|---|
| time of day | 60%+ (2-5am, 6-9am) vs 27.5% (6-9pm) | +33pp spread, but confounded with user/task type |
| weekend premium | +5.2pp vs weekday | consistent but small |
| prompt length 300-1500 chars | 0.20-0.21 steering rate (lowest) | optimal information density |
| question density <5% | 76% success | low questions = clear task framing |

tier 4: CONTEXTUAL SIGNALS (effect depends on situation)

| signal | context | notes |
|---|---|---|
| oracle usage | higher in FRUSTRATED (46% vs 25%) | rescue tool, not planning tool; signal of struggle |
| thread length >100 turns | marathon debugging | increases frustration risk but not deterministic |
| opening word patterns | “please” → 100%, “im”/“following:” → frustration | high variance, small n on some |
| user archetype | @concise_commander 60.5%, @verbose_explorer 83% (corrected) | user skill confounds task difficulty |

tier 5: TRAILING/DIAGNOSTIC (not predictive, but diagnostic)

| signal | use case |
|---|---|
| closing ritual type | post-hoc classification only |
| COMMITTED thread length | 40% shorter than RESOLVED; confirms efficiency |
| orphaned spawn rate (62.5%) | process smell, not resolution predictor |
| error suppression rate (71.6%) | agent behavior audit, not live prediction |

actionable hierarchy

for REAL-TIME intervention:

  1. watch approval:steering ratio (tier 1)
  2. detect consecutive steerings (tier 1)
  3. check for verification gates (tier 1)
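
a minimal sketch of how the three real-time checks above might be computed over a running thread. it assumes each user turn has already been labeled "approval", "steering", or something else by an upstream classifier; that labeling step, and the names `ThreadSignals`, `scan_turns`, and `live_warnings`, are hypothetical and not part of the analysis. the thresholds (ratio below 1:1, 2+ consecutive steerings) come from the tier 1 table.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class ThreadSignals:
    approvals: int
    steerings: int
    max_consecutive_steerings: int

    @property
    def ratio(self) -> Optional[float]:
        # approval:steering ratio; undefined (None) when there are no steerings
        if self.steerings == 0:
            return None
        return self.approvals / self.steerings


def scan_turns(labels: Iterable[str]) -> ThreadSignals:
    """Count approvals, steerings, and the longest run of back-to-back steerings.

    Assumption: `labels` holds one label per user turn, in order.
    """
    approvals = steerings = run = longest = 0
    for label in labels:
        if label == "steering":
            steerings += 1
            run += 1
            longest = max(longest, run)
        else:
            run = 0  # any non-steering turn breaks the run
            if label == "approval":
                approvals += 1
    return ThreadSignals(approvals, steerings, longest)


def live_warnings(sig: ThreadSignals, has_verification_gate: bool = True) -> List[str]:
    """Return warnings in a fixed order: consecutive steerings, ratio, gate."""
    out: List[str] = []
    if sig.max_consecutive_steerings >= 2:
        out.append("2+ consecutive steerings")
    if sig.ratio is not None and sig.ratio < 1.0:
        out.append("approval:steering ratio below 1:1")
    if not has_verification_gate:
        out.append("no verification gate")
    return out
```

calling `live_warnings(scan_turns(labels))` after each new user turn would surface the consecutive-steering warning as soon as the second steering in a row arrives, which is the window the table describes as preceding frustration by 2-5 turns.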

for PROMPT ENGINEERING:

  1. include file references (tier 1)
  2. use interrogative style (tier 2)
  3. target 300-1500 chars (tier 3)
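
the first and third prompt checks above can be sketched as a small linter. the file-reference regex is a rough heuristic of my own (a slash-separated path, or a bare name with one of a few common extensions) and will produce false positives such as "and/or"; the analysis does not say how file references were detected. interrogative style is left out because the source does not define how it was measured.

```python
import re
from typing import Dict

# Hypothetical heuristic: a slash-separated path, or name.ext with a common extension.
FILE_REF = re.compile(
    r"(?:[\w.-]+/)+[\w.-]+|\b[\w-]+\.(?:py|ts|js|go|rs|md|json|yaml|yml|toml)\b"
)

MIN_CHARS = 300   # lower bound of the 300-1500 char range from tier 3
MAX_CHARS = 1500  # upper bound of the same range


def check_opener(prompt: str) -> Dict[str, bool]:
    """Report whether an opening prompt has a file reference and a length in range."""
    return {
        "has_file_reference": bool(FILE_REF.search(prompt)),
        "length_in_range": MIN_CHARS <= len(prompt) <= MAX_CHARS,
    }
```

for example, `check_opener("please fix it")` reports both checks as failing: there is no path-like token and the prompt is far below 300 characters.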

for AGENT CONFIGURATION:

  1. enforce verification gates
  2. limit task delegation to 2-6
  3. discourage oracle as rescue tool
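
one way the three configuration rules above could be expressed as a policy check, as a sketch only. `AgentPolicy`, `policy_violations`, and the `oracle_used_after_steering` flag are hypothetical names; in particular, treating "oracle called after a steering" as the rescue pattern is an assumption, since the analysis only reports that oracle usage is higher in FRUSTRATED threads. the cap of 6 delegated tasks is the upper end of the 2-6 range from tier 2.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AgentPolicy:
    require_verification_gate: bool = True
    max_delegated_tasks: int = 6          # upper end of the 2-6 range
    flag_oracle_as_rescue: bool = True    # assumption: rescue = oracle after steering


def policy_violations(
    policy: AgentPolicy,
    has_verification_gate: bool,
    delegated_tasks: int,
    oracle_used_after_steering: bool,
) -> List[str]:
    """List which configuration rules a thread currently breaks."""
    out: List[str] = []
    if policy.require_verification_gate and not has_verification_gate:
        out.append("missing verification gate")
    if delegated_tasks > policy.max_delegated_tasks:
        out.append("too many delegated tasks")
    if policy.flag_oracle_as_rescue and oracle_used_after_steering:
        out.append("oracle used as rescue tool")
    return out
```

the check only reports violations; whether to block, warn, or log is left to whatever harness runs the agent.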

confidence notes