quick wins: 5 highest-impact, lowest-effort changes
ranked by effect size × ease of implementation.
1. include file references in opening message
effect: +25pp success rate (66.7% vs 41.8%)
effort: zero — just type @path/to/file.ts
source: first-message-patterns.md
this is the single strongest predictor in the entire dataset. works because it anchors the agent to concrete code rather than abstract requirements.
do this: start threads with explicit file mentions, e.g. “@src/auth/login.ts needs error handling for expired tokens”
2. use interrogative style (ask questions)
effect: +13pp over directive, +18pp over contextual, +27pp over hybrid (69.3% interrogative vs 56.7% directive, 51.5% contextual, 42.2% hybrid)
effort: zero — reframe commands as questions
source: prompting-styles.md
counterintuitive: asking “how should we handle X?” outperforms commanding “fix X” for complex tasks. questions force the agent to reason before acting.
do this: for non-trivial tasks, lead with a question: “how would you approach caching for this endpoint?” rather than “add caching”.
3. stay past 10 turns
effect: +61pp success (75% at 26-50 turns vs 14% at <10 turns)
effort: low — just don’t abandon early
source: length-analysis.md, ULTIMATE-SYNTHESIS.md
threads < 10 turns fail 86% of the time. most are abandoned, not completed. the sweet spot is 26-50 turns (75% success).
do this: if a task matters, commit to at least 15-20 turns before deciding it’s not working.
4. approve explicitly (target 2:1 approval:steering)
effect: 4x success rate when ratio > 2:1 vs < 1:1
effort: trivial — type “good”, “ship it”, “yes”
source: conversation-dynamics.md, thread-flow.md
explicit approvals (“good”, “yes”, “ship it”) create checkpoints the agent can anchor to. without them, agents drift and require more steering.
do this: approve after each successful step. even “yup” counts. aim for 2 approvals per steering.
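the 2:1 target is easy to check mechanically. a minimal sketch, assuming a thread is just a list of user-message strings; the keyword lists are illustrative guesses, not taken from the dataset:

```python
# count explicit approvals vs steering interventions in a thread's user
# messages and check them against the 2:1 approval:steering target.
# keyword lists below are illustrative assumptions, not from the analysis.

APPROVALS = ("good", "yes", "yup", "ship it", "lgtm")
STEERINGS = ("no", "wait", "stop", "actually", "instead")

def approval_ratio(user_messages):
    """Return (approvals, steerings) counts for a list of user messages."""
    approvals = steerings = 0
    for msg in user_messages:
        text = msg.strip().lower()
        if text.startswith(APPROVALS):
            approvals += 1
        elif text.startswith(STEERINGS):
            steerings += 1
    return approvals, steerings

def meets_target(user_messages, target=2.0):
    """True when approvals outnumber steerings by at least `target`:1."""
    approvals, steerings = approval_ratio(user_messages)
    return steerings == 0 or approvals / steerings >= target
```

a check like this is crude (prefix matching misses plenty of phrasings), but it is enough to spot a thread drifting toward all steering and no approval.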
5. confirm before tests/benchmarks
effect: 47% of steerings begin with “no…” and 17% with “wait…” — most correct premature action
effort: trivial for agent behavior — add to AGENTS.md
source: steering-deep-dive.md, steering-taxonomy.md
the most common steering pattern is rejecting premature action: agents running full test suites, pushing code, or expanding scope without confirmation.
do this (for AGENTS.md):
confirm before:
- running tests/benchmarks
- pushing code or commits
- modifying files outside mentioned scope
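the same gate can be approximated outside the agent, e.g. as a wrapper that flags side-effecting commands before they run. a minimal sketch; the pattern list is a hypothetical stand-in, not taken from the steering taxonomy:

```python
# hypothetical confirmation gate: flag side-effecting commands so they
# require explicit approval before execution. patterns are illustrative.
import re

CONFIRM_PATTERNS = [
    r"\b(pytest|go test|cargo bench)\b",  # running tests/benchmarks
    r"\bgit (push|commit)\b",             # pushing code or commits
]

def needs_confirmation(command: str) -> bool:
    """True if the command matches a gated pattern and should be confirmed."""
    return any(re.search(pattern, command) for pattern in CONFIRM_PATTERNS)
```

a wrapper like this would pause on `git push origin main` but let `ls -la` through, mirroring the AGENTS.md rule above.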
summary table
| rank | change | effect size | effort |
|---|---|---|---|
| 1 | file references in opener | +25pp | none |
| 2 | ask questions (interrogative) | +13-27pp | none |
| 3 | stay past 10 turns | +61pp | low |
| 4 | explicit approvals (2:1 ratio) | 4x | trivial |
| 5 | confirm before action (AGENTS.md) | up to -64% steerings | trivial |
what these share
all 5 are behavioral, not technical. no tooling changes, no code, no infrastructure. just:
- more specific context upfront (1)
- different framing (2)
- persistence (3)
- explicit feedback (4)
- agent-side confirmation gates (5)
combined effect: could plausibly move resolution rate from the current ~44% to 60%+ based on observed correlations.
generated by john_pebbleski | 2026-01-09