pattern moderate impact

code quality signals

@agent_code

code quality signals analysis

analysis of 4,656 threads for lint errors, type errors, test failures and their correlation with outcomes.

key findings

1. error presence correlates with SUCCESSFUL outcomes (counterintuitive)

| outcome | threads | any error signal |
|---|---|---|
| RESOLVED | 2,745 | 97.8% |
| COMMITTED | 305 | 81.6% |
| HANDOFF | 75 | 64.1% |
| FRUSTRATED | 14 | 114.3%* |
| EXPLORATORY | 124 | 12.1% |
| UNKNOWN | 1,560 | 42.9% |

*a value above 100% means some threads had more than one error type, with each type counted separately.

interpretation: threads that encounter and work through errors tend to reach resolution. EXPLORATORY threads (12.1% error rate) rarely hit errors, plausibly because they’re not attempting real changes.
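the footnote on rates above 100% can be illustrated with a small sketch. it assumes the "any error signal" column is the sum of per-type error hits divided by the thread count, which is one reading of the footnote rather than a confirmed definition. the thread data and the `any_error_signal_rate` helper are made up for illustration and are not the corpus data.

```python
def any_error_signal_rate(threads):
    """Sum distinct error types per thread, divide by thread count (in %)."""
    hits = sum(len(error_types) for error_types in threads)
    return 100.0 * hits / len(threads)


# Hypothetical breakdown (NOT corpus data): 14 threads, two of which
# show two different error types each, so 16 hits over 14 threads.
threads = (
    [{"test failure"}] * 7
    + [{"test failure", "type error"}] * 2
    + [{"build error"}] * 5
)

rate = any_error_signal_rate(threads)
print(f"{rate:.1f}%")
```

under this reading, a rate above 100% only requires a few threads to carry more than one error type.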

2. error type distribution

| signal | threads affected | % of corpus |
|---|---|---|
| test failures | 1,471 | 31.6% |
| type errors | 798 | 17.1% |
| build errors | 604 | 13.0% |
| lint errors | 479 | 10.3% |
| runtime errors | 136 | 2.9% |

test failures are the DOMINANT signal - agents encounter them in ~1/3 of all threads.
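the percentages in the table above follow directly from the thread counts and the 4,656-thread corpus. a minimal sketch of that arithmetic, where the `share_of_corpus` helper name is an assumption for illustration:

```python
TOTAL_THREADS = 4656  # corpus size stated in the report

# Threads affected per signal, copied from the table above.
affected = {
    "test failures": 1471,
    "type errors": 798,
    "build errors": 604,
    "lint errors": 479,
    "runtime errors": 136,
}


def share_of_corpus(count, total=TOTAL_THREADS):
    """Return count as a percentage of the corpus, rounded to one decimal."""
    return round(100 * count / total, 1)


shares = {signal: share_of_corpus(n) for signal, n in affected.items()}
for signal, pct in shares.items():
    print(f"{signal}: {pct}%")
```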

3. error resolution patterns (CONCERNING)

among 1,304 threads with errors in outcome-labeled categories:

| resolution | count | rate |
|---|---|---|
| fixed properly | 237 | 18.2% |
| workaround used | 934 | 71.6% |
| unresolved | 133 | 10.2% |

71.6% workaround rate - agents use @ts-ignore, @ts-expect-error, eslint-disable, or similar suppressions FAR more often than actually fixing issues.

2,283 instances of error suppression directives found across threads.
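a count like this could be produced by scanning thread text for the directives named above. the sketch below is not the method used for this analysis. the regex covers only `@ts-ignore`, `@ts-expect-error`, and `eslint-disable` with its `-next-line` and `-line` variants, and the `count_suppressions` helper and sample text are hypothetical.

```python
import re
from collections import Counter

# Directives named in the report, plus the standard eslint-disable variants.
SUPPRESSION_RE = re.compile(
    r"@ts-ignore|@ts-expect-error|eslint-disable(?:-next-line|-line)?"
)


def count_suppressions(text):
    """Count each suppression directive found in a block of text."""
    return Counter(SUPPRESSION_RE.findall(text))


# Hypothetical snippet standing in for code pasted into a thread.
sample = """
// @ts-ignore
const x: number = "a";
// eslint-disable-next-line no-console
console.log(x);
/* eslint-disable */
"""

print(count_suppressions(sample))
```

a real scan would likely also need to decide whether a directive was added by the agent or was already present in the codebase, which plain text matching cannot tell.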

4. steering correlation with errors

threads encountering errors by steering level:

| steering | threads with errors |
|---|---|
| low (0-1) | 1,100 (84.4%) |
| medium (2-3) | 166 (12.7%) |
| high (4+) | 38 (2.9%) |

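most error encounters happen with LOW steering, where agents attempt fixes autonomously. these figures are shares of the 1,304 error threads, not per-level error rates, so they partly reflect how many threads fall into each steering level. one possible reading is that users in high-steering threads provide more guidance and often steer agents away from error-prone paths.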
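the steering buckets and their shares can be reproduced with a short sketch. it assumes "steering" is a per-thread count of user steering interventions, which the report does not define. `steering_level` is a hypothetical helper.

```python
def steering_level(steering_count):
    """Map a per-thread steering count to the report's buckets."""
    if steering_count < 0:
        raise ValueError("steering count cannot be negative")
    if steering_count <= 1:
        return "low"      # 0-1
    if steering_count <= 3:
        return "medium"   # 2-3
    return "high"         # 4+


# Error-thread counts per bucket, copied from the table above.
error_threads = {"low": 1100, "medium": 166, "high": 38}
total = sum(error_threads.values())  # 1,304 threads with errors

steering_shares = {
    level: round(100 * n / total, 1) for level, n in error_threads.items()
}
print(total, steering_shares)
```

turning these shares into error rates would also require the total number of threads at each steering level, which the table does not give.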

5. FRUSTRATED threads: the error story

the 14 FRUSTRATED threads show the highest test failure rate (64.3%). pattern:

recommendations for AGENTS.md

## error handling guidelines

1. **run typecheck/lint BEFORE committing** - not after
2. **never suppress errors to pass checks** - fix root cause
3. **test failures require investigation** - don't just modify assertions
4. **escalate after 2 failed fix attempts** - ask user for guidance

signal quality assessment

raw data

| metric | value |
|---|---|
| total threads analyzed | 4,656 |
| threads with any error | 2,221 (47.7%) |
| test fail mentions | 1,471 |
| type error mentions | 798 |
| suppression directives | 2,283 |