pattern moderate impact

assistant brevity


assistant brevity analysis

dataset: 18,676 assistant→user message pairs across 4,656 threads

key finding: medium-length responses get the best approval rate

| assistant message length | approval rate | steering rate | n |
|---|---|---|---|
| short (<1k chars) | 13.4% | 7.3% | 15,321 |
| medium (1-3k chars) | 16.3% | 6.7% | 3,122 |
| long (>3k chars) | 15.9% | 9.4% | 233 |

the sweet spot appears to be 1-3k characters. shorter isn't necessarily better: medium responses get ~22% more approvals than short ones (16.3% vs 13.4%).

long responses show elevated steering (9.4% vs 6.7% for medium), suggesting users correct overly verbose replies.
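for reference, a minimal sketch of how these bucketed rates could be computed, assuming a hypothetical `pairs` dataframe with `assistant_chars` (length of the assistant message) and `user_response` (the label of the following user message) columns — both names are assumptions, not from the original pipeline:

```python
import pandas as pd

def rates_by_length_bucket(pairs: pd.DataFrame) -> pd.DataFrame:
    # bucket assistant message length into short / medium / long
    bins = [0, 1_000, 3_000, float("inf")]
    labels = ["short (<1k)", "medium (1-3k)", "long (>3k)"]
    bucket = pd.cut(pairs["assistant_chars"], bins=bins, labels=labels, right=False)

    grouped = pairs.groupby(bucket, observed=True)
    # rate = share of following user messages carrying the given label
    return pd.DataFrame({
        "approval_rate": grouped["user_response"].apply(lambda s: (s == "APPROVAL").mean()),
        "steering_rate": grouped["user_response"].apply(lambda s: (s == "STEERING").mean()),
        "n": grouped.size(),
    })
```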

message length preceding different user response types

| user response | avg chars preceding | median | count |
|---|---|---|---|
| APPROVAL | 713 | 467 | 2,597 |
| QUESTION | 646 | 442 | 4,035 |
| STEERING | 632 | 321 | 1,350 |
| NEUTRAL | 573 | 323 | 10,648 |

approvals follow LONGER messages on average (713 chars, median 467). this contradicts naive “shorter is better” intuition. users approve when they get sufficient detail.

steering follows messages with lower median (321) but similar average (632), suggesting high variance—steering happens after both very short (insufficient) and very long (excessive) responses.
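a similar sketch for the per-response-type length stats, under the same hypothetical `pairs` schema:

```python
import pandas as pd

def preceding_length_by_response(pairs: pd.DataFrame) -> pd.DataFrame:
    # length stats of the assistant message preceding each user response type
    return (
        pairs.groupby("user_response")["assistant_chars"]
        .agg(avg_chars_preceding="mean", median="median", count="size")
    )
```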

thread-level outcomes by avg assistant length

| avg length bucket | threads | steering/thread | approval/thread | resolved % |
|---|---|---|---|---|
| <500 | 1,868 | 0.15 | 0.37 | 32% |
| 500-1k | 1,969 | 0.47 | 0.89 | 54% |
| 1k-2k | 682 | 0.37 | 0.64 | 51% |
| 2k-5k | 127 | 0.22 | 0.45 | 42% |
| 5k+ | 10 | 0.40 | 0.20 | 30% |

500-1k avg chars is the sweet spot for threads: highest approvals per thread (0.89) and the best resolution rate (54%).

very short responses (<500 chars avg) correlate with low engagement (0.37 approvals/thread, only 32% resolved). users might abandon threads that feel too terse.
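and a sketch of the thread-level rollup, assuming a hypothetical `messages` dataframe with `thread_id`, `assistant_chars`, `user_response`, and a per-thread `resolved` flag (all assumed column names):

```python
import pandas as pd

def thread_outcomes(messages: pd.DataFrame) -> pd.DataFrame:
    # roll up each thread: average assistant length plus outcome counts
    threads = messages.groupby("thread_id").agg(
        avg_len=("assistant_chars", "mean"),
        steering=("user_response", lambda s: (s == "STEERING").sum()),
        approvals=("user_response", lambda s: (s == "APPROVAL").sum()),
        resolved=("resolved", "max"),
    )
    # bucket threads by their average assistant message length
    bins = [0, 500, 1_000, 2_000, 5_000, float("inf")]
    labels = ["<500", "500-1k", "1k-2k", "2k-5k", "5k+"]
    bucket = pd.cut(threads["avg_len"], bins=bins, labels=labels, right=False)
    return threads.groupby(bucket, observed=True).agg(
        threads=("avg_len", "size"),
        steering_per_thread=("steering", "mean"),
        approval_per_thread=("approvals", "mean"),
        resolved_pct=("resolved", "mean"),
    )
```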

implications

  1. brevity is not king: mid-length responses outperform both extremes (1-3k chars per message; ~500-1k chars avg per thread, roughly 100-200 words)
  2. steering correlates with extremes: both too-short and too-long responses trigger corrections
  3. approval follows substance: users approve when they feel they got enough information
  4. the "sweet spot" is ~500-1,000 chars avg per thread: threads in this range have the best outcomes

caveats