quick wins: 5 highest-impact, lowest-effort changes
ranked by effect size × ease of implementation.
1. include file references in opening message
effect: +25pp success rate (66.7% vs 41.8%)
effort: zero — just type @path/to/file.ts
source: first-message-patterns.md
this is the single strongest predictor in the entire dataset. works because it anchors the agent to concrete code rather than abstract requirements.
do this: start threads with explicit file mentions, e.g. “@src/auth/login.ts needs error handling for expired tokens”
2. use interrogative style (ask questions)
effect: +13pp over directive, +18pp over contextual, +27pp over hybrid (69.3% interrogative vs 56.7% directive, 51.5% contextual, 42.2% hybrid)
effort: zero — reframe commands as questions
source: prompting-styles.md
counterintuitive: asking “how should we handle X?” outperforms commanding “fix X” for complex tasks. questions force the agent to reason before acting.
do this: for non-trivial tasks, lead with a question: “how would you approach caching for this endpoint?” rather than “add caching”.
3. stay past 10 turns
effect: +61pp success (75% at 26-50 turns vs 14% at <10 turns)
effort: low — just don’t abandon early
source: length-analysis.md, ULTIMATE-SYNTHESIS.md
threads < 10 turns fail 86% of the time. most are abandoned, not completed. the sweet spot is 26-50 turns (75% success).
do this: if a task matters, commit to at least 15-20 turns before deciding it’s not working.
4. approve explicitly (target 2:1 approval:steering)
effect: 4x success rate when ratio > 2:1 vs < 1:1
effort: trivial — type “good”, “ship it”, “yes”
source: conversation-dynamics.md, thread-flow.md
explicit approvals (“good”, “yes”, “ship it”) create checkpoints the agent can anchor to. without them, agents drift and require more steering.
do this: approve after each successful step. even “yup” counts. aim for 2 approvals per steering.
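the 2:1 target is easy to check mechanically. a minimal sketch, assuming a thread is just a list of user-message strings; the keyword lists are illustrative guesses, not taken from the dataset:

```python
# count explicit approvals vs steering interventions in a thread's user
# messages and check them against the 2:1 approval:steering target.
# keyword lists below are illustrative assumptions, not from the analysis.

APPROVALS = ("good", "yes", "yup", "ship it", "lgtm")
STEERINGS = ("no", "wait", "stop", "actually", "instead")

def approval_ratio(user_messages):
    """Return (approvals, steerings) counts for a list of user messages."""
    approvals = steerings = 0
    for msg in user_messages:
        text = msg.strip().lower()
        if text.startswith(APPROVALS):
            approvals += 1
        elif text.startswith(STEERINGS):
            steerings += 1
    return approvals, steerings

def meets_target(user_messages, target=2.0):
    """True when approvals outnumber steerings by at least `target`:1."""
    approvals, steerings = approval_ratio(user_messages)
    return steerings == 0 or approvals / steerings >= target
```

a check like this is crude (prefix matching misses plenty of phrasings), but it is enough to spot a thread drifting toward all steering and no approval.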
5. confirm before tests/benchmarks
effect: 47% of steerings begin with “no…” and 17% with “wait…” — most correct premature action
effort: trivial for agent behavior — add to AGENTS.md
source: steering-deep-dive.md, steering-taxonomy.md
the most common steering pattern is rejecting premature action: agents running full test suites, pushing code, or expanding scope without confirmation.
do this (for AGENTS.md):
confirm before:
- running tests/benchmarks
- pushing code or commits
- modifying files outside mentioned scope
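the same gate can be approximated outside the agent, e.g. as a wrapper that flags side-effecting commands before they run. a minimal sketch; the pattern list is a hypothetical stand-in, not taken from the steering taxonomy:

```python
# hypothetical confirmation gate: flag side-effecting commands so they
# require explicit approval before execution. patterns are illustrative.
import re

CONFIRM_PATTERNS = [
    r"\b(pytest|go test|cargo bench)\b",  # running tests/benchmarks
    r"\bgit (push|commit)\b",             # pushing code or commits
]

def needs_confirmation(command: str) -> bool:
    """True if the command matches a gated pattern and should be confirmed."""
    return any(re.search(pattern, command) for pattern in CONFIRM_PATTERNS)
```

a wrapper like this would pause on `git push origin main` but let `ls -la` through, mirroring the AGENTS.md rule above.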
summary table
| rank | change | effect size | effort |
|---|---|---|---|
| 1 | file references in opener | +25pp | none |
| 2 | ask questions (interrogative) | +13-27pp | none |
| 3 | stay past 10 turns | +61pp | low |
| 4 | explicit approvals (2:1 ratio) | 4x | trivial |
| 5 | confirm before action (AGENTS.md) | up to -64% steerings | trivial |
what these share
all 5 are behavioral, not technical. no tooling changes, no code, no infrastructure. just:
- more specific context upfront (1)
- different framing (2)
- persistence (3)
- explicit feedback (4)
- agent-side confirmation gates (5)
combined effect: could plausibly move resolution rate from the current ~44% to 60%+ based on observed correlations.
generated by john_pebbleski | 2026-01-09