agent compliance analysis

analysis of how often the agent follows explicit user instructions, across 500 sampled threads (4,656 available).

key findings

overall compliance rates

| outcome | count | percentage |
| --- | --- | --- |
| COMPLIED | 1,090 | 16.0% |
| DEVIATED | 726 | 10.7% |
| CLARIFIED | 46 | 0.7% |
| AMBIGUOUS | 4,949 | 72.7% |

baseline: 82.8% of threads contain explicit instructions (414/500).

deviation ratio: among exchanges with a clear compliance signal, the agent deviates 40% of the time (726 / (726 + 1,090)).
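
the headline numbers above can be reproduced directly from the outcome counts; a minimal sketch (counts hard-coded from the table above):

```python
# reproduce the overall compliance rates and the deviation ratio
# counts taken from the outcome table above
counts = {"COMPLIED": 1090, "DEVIATED": 726, "CLARIFIED": 46, "AMBIGUOUS": 4949}

total = sum(counts.values())  # 6,811 classified exchanges
rates = {k: round(100 * v / total, 1) for k, v in counts.items()}

# deviation ratio: share of deviations among exchanges with a clear signal
clear = counts["COMPLIED"] + counts["DEVIATED"]
deviation_ratio = counts["DEVIATED"] / clear

print(rates)
print(round(deviation_ratio, 2))  # 0.4
```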

compliance by instruction type

| type | total | complied | deviated | compliance rate |
| --- | --- | --- | --- | --- |
| ACTION | 10,281 | 2,344 | 909 | 22.8% |
| PROHIBITION | 3,137 | 627 | 371 | 20.0% |
| DIRECTIVE | 2,773 | 549 | 363 | 19.8% |
| SUGGESTION | 2,092 | 738 | 217 | 35.3% |
| CONSTRAINT | 1,569 | 258 | 196 | 16.4% |
| SIMPLIFICATION | 390 | 67 | 65 | 17.2% |
| REQUEST | 245 | 31 | 21 | 12.7% |
| STYLE | 163 | 30 | 5 | 18.4% |
| OUTPUT_DIRECTIVE | 12 | 1 | 1 | 8.3% |
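
the per-type compliance rate is simply complied / total; a quick consistency check over a few rows (numbers hard-coded from the table above):

```python
# verify compliance and deviation rates against the per-type table
# (type, total, complied, deviated)
rows = [
    ("ACTION", 10281, 2344, 909),
    ("PROHIBITION", 3137, 627, 371),
    ("SUGGESTION", 2092, 738, 217),
    ("OUTPUT_DIRECTIVE", 12, 1, 1),
]

for name, total, complied, deviated in rows:
    compliance = 100 * complied / total
    deviation = 100 * deviated / total
    print(f"{name}: {compliance:.1f}% compliance, {deviation:.1f}% deviation")
```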

instruction strength distribution

patterns

high-deviation areas

  1. OUTPUT_DIRECTIVE (8.3% compliance): “write to X”, “save to Y”; the agent often forgets the requested output location or writes elsewhere (only 12 instances, so a small sample)
  2. REQUEST (12.7% compliance): polite requests (“please X”) get the lowest follow-through among the common instruction types
  3. CONSTRAINT (16.4% compliance): “only X” constraints are frequently violated

relatively-better areas

  1. SUGGESTION (35.3% compliance): “should” statements get highest compliance
  2. ACTION (22.8% compliance): direct verbs (“fix”, “update”) moderately followed
  3. STYLE (18.4% compliance, only 3.1% deviation): formatting instructions are rarely contradicted outright, though confirmation is often ambiguous

prohibition handling

prohibitions (“don’t”, “never”, “avoid”) show 20% compliance and only 11.8% deviation; the remaining ~68% of prohibition exchanges are ambiguous, a gap explained by the interpretation caveats below.

interpretation caveats

  1. high ambiguity rate (72.7%): many exchanges lack clear compliance signals — agent takes action via tools but doesn’t verbally confirm
  2. false negatives: tool uses may indicate compliance even without verbal confirmation
  3. context bleeding: instructions from earlier turns may carry forward but aren’t detected per-exchange
  4. code vs prose: instructions embedded in code blocks or technical context harder to parse
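
caveats 1 and 2 could be probed by treating tool use as a weak positive signal for ambiguous exchanges. a hypothetical sketch; the `label` and `tool_use_count` field names are assumptions for illustration, not the actual schema of agent-compliance-raw.json:

```python
# reclassify AMBIGUOUS exchanges as LIKELY_COMPLIED when the agent used tools,
# treating tool use as a weak positive signal (caveats 1 and 2 above)
def reclassify(exchange: dict) -> str:
    # 'label' and 'tool_use_count' are hypothetical field names
    if exchange["label"] != "AMBIGUOUS":
        return exchange["label"]
    if exchange.get("tool_use_count", 0) > 0:
        return "LIKELY_COMPLIED"
    return "AMBIGUOUS"

sample = [
    {"label": "AMBIGUOUS", "tool_use_count": 3},
    {"label": "AMBIGUOUS", "tool_use_count": 0},
    {"label": "DEVIATED", "tool_use_count": 2},
]
print([reclassify(e) for e in sample])
```

this would give a rough upper bound on true compliance, since tool use alone does not prove the right action was taken.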

recommendations for users

  1. use direct verbs: “fix X” outperforms “please fix X”
  2. repeat constraints: agent better at following reminders
  3. avoid negatives: “use A” works better than “don’t use B”
  4. verify output locations: explicitly check file destinations were followed
  5. steering works: threads with active steering show higher resolution rates (per prior analysis)

recommendations for agent improvement

  1. prohibition tracking: explicit acknowledgment of “don’t” statements before proceeding
  2. output verification: confirm file paths match user specification before/after write
  3. constraint echoing: repeat back constraints to confirm understanding
  4. polite request parity: treat “please X” same as “X” for action priority
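
recommendation 2 (output verification) could be a simple pre-write guard; a minimal sketch, with function and path names chosen for illustration:

```python
from pathlib import Path

def verify_output_path(requested: str, actual: str) -> bool:
    """Check that the path about to be written matches what the user asked for.

    A naive guard for recommendation 2: resolve both paths and compare.
    """
    return Path(requested).resolve() == Path(actual).resolve()

# refuse (or warn) before writing to a mismatched destination
ok = verify_output_path("out/report.md", "./out/report.md")
print(ok)  # True: same file after resolution
```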

analysis method: regex pattern matching for instruction types, compliance-signal detection (positive/negative/clarifying language in the agent's replies), and tool-use counting. raw data: agent-compliance-raw.json
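
the two regex passes can be illustrated with simplified stand-ins; the patterns below are illustrative, not the actual patterns used in the analysis:

```python
import re

# simplified stand-ins for the instruction-type patterns
TYPE_PATTERNS = {
    "PROHIBITION": re.compile(r"\b(don'?t|never|avoid)\b", re.I),
    "REQUEST": re.compile(r"\bplease\b", re.I),
    "CONSTRAINT": re.compile(r"\bonly\b", re.I),
    "ACTION": re.compile(r"^\s*(fix|update|write|save)\b", re.I),
}

# compliance signals: positive / negative / clarifying language in the reply
SIGNALS = {
    "COMPLIED": re.compile(r"\b(done|fixed|updated|completed)\b", re.I),
    "DEVIATED": re.compile(r"\b(instead|rather than|couldn'?t)\b", re.I),
    "CLARIFIED": re.compile(r"\b(did you mean|which|clarify)\b", re.I),
}

def classify_instruction(text: str) -> list[str]:
    """Return every instruction type whose pattern matches, else AMBIGUOUS."""
    return [t for t, pat in TYPE_PATTERNS.items() if pat.search(text)] or ["AMBIGUOUS"]

def detect_signal(reply: str) -> str:
    """Return the first compliance signal found in the reply, else AMBIGUOUS."""
    for label, pat in SIGNALS.items():
        if pat.search(reply):
            return label
    return "AMBIGUOUS"

print(classify_instruction("please fix the test, and don't touch main.py"))
print(detect_signal("done, the test is fixed"))
```

the high ambiguity rate follows naturally from this design: any reply that matches no signal pattern falls through to AMBIGUOUS.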

limitations: heuristic-based; ~73% of exchanges were classified ambiguous. manual review of a sample of deviations suggests classification accuracy is moderate.