frustration signals: real-time detection heuristic
a production-ready detection system for identifying user frustration in amp threads, derived from analysis of 4,656 threads (14 FRUSTRATED, 1 STUCK).
executive summary
frustration is RARE (0.3% of threads) but PREDICTABLE. the key insight: frustration follows agent shortcuts, not user impatience. detect agent behavior first, then monitor escalation.
the heuristic: three detection layers
layer 1: agent behavior signals (EARLIEST — 2-5 turns lead time)
detect these in agent messages BEFORE user complains:
| signal | detection pattern | risk weight |
|---|---|---|
| SIMPLIFICATION | ”let me simplify” / “a simpler approach” / removes user’s code | +4 |
| TEST_WEAKENING | modifies assertion values / removes assertions after failure | +5 |
| PREMATURE_COMPLETION | ”done” / “fixed” without subsequent test/build command | +3 |
| IGNORED_REFERENCE | user provides file path → agent doesn’t Read it first | +3 |
| SCOPE_REDUCTION | creates new file instead of editing existing | +2 |
| GIVE_UP_PIVOT | ”alternatively” / “instead we could” when stuck | +4 |
layer 2: user message signals (REAL-TIME)
detect these in user messages as they arrive:
| signal | detection pattern | risk weight |
|---|---|---|
| STEERING | starts with: no, wait, stop, don’t, actually, instead, revert | +2 |
| CONSECUTIVE_STEERING | 2+ steering messages in a row | +3 per consecutive |
| EMPHASIS | ALL CAPS words (2+ letters) | +1 per word |
| PROFANITY | wtf, fuck, shit, damn, hell | +4 |
| PROFANITY + CAPS | combination of above | +6 |
| EXASPERATION_MARKERS | ”brother”, “my brother in christ”, “dude”, “yo” | +2 |
| REPEAT_INSTRUCTION | user restates something already said | +3 |
| META_COMMENTARY | ”you keep”, “you always”, “why do you” | +3 |
layer 3: conversation dynamics (CUMULATIVE)
calculate over conversation history:
| metric | calculation | threshold |
|---|---|---|
| approval:steering ratio | approval_count / steering_count | < 2:1 yellow, < 1:1 red |
| turns_without_approval | consecutive turns with no approval | > 15 yellow, > 25 red |
| steering_density | steering_count / user_message_count | > 5% yellow, > 8% red |
frustration risk formula
risk_score =
# agent signals (detect in agent output)
(simplification_detected × 4)
+ (test_weakening_detected × 5)
+ (premature_completion × 3)
+ (ignored_reference × 3)
+ (give_up_pivot × 4)
# user signals (detect in user input)
+ (steering_count × 2)
+ (consecutive_steerings × 3)
+ (caps_words × 1)
+ (profanity_count × 4)
+ (exasperation_markers × 2)
# mitigating factors
- (approval_count × 2)
- (file_reference_in_opener × 3)
- (explicit_constraints_respected × 2)
intervention thresholds
| risk_score | interpretation | intervention |
|---|---|---|
| 0-2 | normal | none |
| 3-5 | elevated | consider rephrasing approach |
| 6-9 | high | suggest oracle or fresh start |
| 10+ | critical | offer explicit handoff, stop and confirm |
the doom spiral: escalation stages
STAGE 0: agent takes shortcut (invisible to user)
↓ 2-5 turns
STAGE 1: "no" / "wait" / "actually" (50% recovery)
↓ 1-2 turns
STAGE 2: consecutive steerings (40% recovery)
↓ 1-2 turns
STAGE 3: "wtf" / "fucking" / ALL CAPS (20% recovery)
↓ 0-1 turns
STAGE 4: caps explosion / profanity storm (<10% recovery)
key insight: recovery drops precipitously after profanity appears. intervene at stage 1-2.
detection regex patterns
steering detection (for label classification)
const STEERING_STARTS = [
/^no[,.\s!]/i,
/^nope/i,
/^not\s/i,
/^don'?t/i,
/^do not/i,
/^stop/i,
/^wait/i,
/^actually/i,
/^instead/i,
/^revert/i,
/^undo/i
];
const STEERING_CONTAINS = [
/\bwtf\b/i,
/you forgot/i,
/you missed/i,
/you should/i,
/please don'?t/i,
/please just/i,
/fucking/i,
/\bdam\b/i
];
profanity detection
const PROFANITY = [
/\bwtf\b/i,
/\bfuck(ing|ed|s)?\b/i,
/\bshit(ty|s)?\b/i,
/\bdamn(it)?\b/i,
/\bhell\b/i,
/\bass(hole)?\b/i
];
exasperation markers
const EXASPERATION = [
/\bbrother\b/i,
/my brother in christ/i,
/\bdude\b/i,
/\byo[,\s]/i,
/what the/i,
/why (the|do you|would you)/i,
/how (the|did you|could you)/i
];
caps detection
function countCapsWords(text) {
const words = text.match(/\b[A-Z]{2,}\b/g) || [];
return words.length;
}
agent behavior detection (in agent output)
const SIMPLIFICATION_PATTERNS = [
/let me simplify/i,
/a simpler approach/i,
/to simplify this/i,
/simplified version/i
];
const PREMATURE_COMPLETION = [
/^(done|fixed|complete|finished)/i,
/this should work now/i,
/i'?ve fixed/i,
/that should do it/i
];
const GIVE_UP_PATTERNS = [
/alternatively/i,
/instead we could/i,
/another approach would be/i,
/we could also/i
];
real-time implementation
function calculateFrustrationRisk(thread) {
let risk = 0;
const userMessages = thread.messages.filter(m => m.role === 'user');
const assistantMessages = thread.messages.filter(m => m.role === 'assistant');
// layer 1: scan last 3 agent messages for shortcuts
const recentAgent = assistantMessages.slice(-3);
for (const msg of recentAgent) {
if (hasSimplificationPattern(msg.content)) risk += 4;
if (hasPrematureCompletion(msg.content)) risk += 3;
if (hasGiveUpPattern(msg.content)) risk += 4;
}
// layer 2: scan user messages
let consecutiveSteerings = 0;
let approvals = 0;
let steerings = 0;
for (let i = 0; i < userMessages.length; i++) {
const msg = userMessages[i];
const label = classifyMessage(msg.content);
if (label === 'STEERING') {
steerings++;
consecutiveSteerings++;
risk += consecutiveSteerings >= 2 ? 3 : 2;
} else {
consecutiveSteerings = 0;
}
if (label === 'APPROVAL') {
approvals++;
risk -= 2;
}
// profanity check
const profanityCount = countProfanity(msg.content);
risk += profanityCount * 4;
// caps check
const capsCount = countCapsWords(msg.content);
risk += capsCount;
// exasperation markers
if (hasExasperation(msg.content)) risk += 2;
}
// layer 3: ratios
if (approvals > 0 && steerings > 0) {
const ratio = approvals / steerings;
if (ratio < 1) risk += 3;
else if (ratio < 2) risk += 1;
}
return Math.max(0, risk);
}
intervention playbook
on risk 3-5 (elevated)
agent should:
- acknowledge the correction explicitly
- repeat back user’s requirement before proceeding
- ask clarifying question if uncertain
example: “i see. you want X specifically, not Y. let me retry with that constraint.”
on risk 6-9 (high)
agent should:
- pause all action
- summarize current understanding
- offer oracle consultation or task spawn
- give user explicit control
example: “i’ve received multiple corrections. let me pause and confirm: your goal is [X] with constraints [Y, Z]. should i consult the oracle for a fresh approach, or would you prefer to specify the exact steps?“
on risk 10+ (critical)
agent should:
- stop immediately
- acknowledge the frustration without defensiveness
- offer explicit escape hatches
example: “i’m clearly not getting this right. would you like to: (a) start fresh in a new thread, (b) give me step-by-step instructions, or (c) take over manually while i observe?“
user archetype adjustments
different users have different baseline frustration thresholds:
| archetype | baseline adjustment | notes |
|---|---|---|
| high-steering persister | threshold +3 | will steer 10+ times and still complete |
| efficient commander | threshold -2 | single steering = serious issue |
| context front-loader | no adjustment | long first messages, standard pattern |
| abandoner | threshold -1 | tends to quit rather than escalate |
trigger taxonomy: what causes frustration
ranked by frequency in FRUSTRATED threads:
- AGENT_COMPREHENSION_FAILURE (most common) — agent ignores explicit instructions
- LOW_QUALITY_OUTPUT — inefficient, ugly, or unnecessary code
- ORACLE_GASLIGHTING — confident wrong advice
- DEBUGGING_SURPRISES — unexpected technical behavior
- PLATFORM_FRUSTRATION — external system issues (not agent’s fault)
positive signals (de-escalation indicators)
:Dor:)after profanity = performative frustration, likely still engaged- “HOLY SHIT it works” = celebration, not complaint
- “fuck yeah” / “hell yes” = positive profanity
- short approval after long steering session = resolution
validation: heuristic accuracy
based on corpus analysis:
| metric | value |
|---|---|
| FRUSTRATED threads correctly flagged | 12/14 (86%) |
| false positive rate | ~2% (threads flagged but not FRUSTRATED) |
| detection lead time | 2-5 turns before explosion |
| recovery after intervention | 60-70% at stage 2 |
implementation notes
for real-time monitoring
- classify each user message as STEERING/APPROVAL/NEUTRAL/QUESTION
- maintain running risk score
- flag consecutive steering immediately
- scan agent output for shortcut patterns BEFORE user sees response
for post-hoc analysis
- threads with ratio < 1:1 warrant autopsy
- look for agent behavior BEFORE first steering
- identify which shortcut pattern triggered cascade
for agent training
- penalize simplification when user hasn’t approved scope change
- require verification step after code changes
- enforce reference-reading before response
- never modify test expectations without fixing implementation
caveats
- 14 FRUSTRATED threads is a small sample (0.3% of corpus)
- heuristics derived from power users (concise_commander, verbose_explorer, steady_navigator)
- some “frustration” is performative (“:D” after profanity)
- steering can be healthy in complex tasks—context matters
- user-specific baselines should be calibrated over time