negative examples: 20 worst threads
analysis of threads with FRUSTRATED status or high steering counts (>5). documents what went wrong and lessons learned.
summary statistics
| metric | value |
|---|---|
| FRUSTRATED threads | 14 |
| high-steering threads (6+) | 8 |
| total analyzed | 20 (some overlap) |
| primary failure mode | SHORTCUT-TAKING |
| secondary failure mode | PREMATURE_COMPLETION |
the 20 worst threads
tier 1: FRUSTRATED status (14 threads)
| # | thread_id | title | steering | user | primary failure |
|---|---|---|---|---|---|
| 1 | T-ab2f1833 | storage_optimizer trim race condition documentation | 4 | concise_commander | UNKNOWN |
| 2 | T-019b46b8 | spatial_index clustering timestamp resolution | 3 | concise_commander | OVER_ENGINEERING |
| 3 | T-05aa706d | Resolve deploy_cli module import error | 3 | steady_navigator | MODULE_RESOLUTION |
| 4 | T-019b03ba | Fix this | 2 | concise_commander | PREMATURE_COMPLETION |
| 5 | T-c9763625 | Add overflow menu to prompts list | 2 | steady_navigator | UNKNOWN |
| 6 | T-fa176ce5 | Debug TestService registration error | 2 | concise_commander | TEST_INFRASTRUCTURE |
| 7 | T-019b2dd2 | Scoped context isolation vs oracle | 1 | verbose_explorer | DESIGN_DRIFT |
| 8 | T-019b3854 | Click-to-edit Input controller | 1 | verbose_explorer | NO_DELEGATION |
| 9 | T-019b57ed | Add comprehensive tests for S3 bundle reorganization | 1 | concise_commander | TEST_WEAKENING |
| 10 | T-019b88a4 | Untitled | 1 | steady_navigator | LARGE_CONTEXT_DUMP |
| 11 | T-019b9a94 | Fix concurrent append race conditions with Effect | 1 | precision_pilot | HACKING_AROUND_PROBLEM |
| 12 | T-019b9c89 | Optimize probabilistic_filter construction | 1 | data_dev | UNKNOWN |
| 13 | T-32c23b89 | Modify diff generation in GitDiffView | 1 | steady_navigator | UNKNOWN |
| 14 | T-af1547d5 | Concurrent event fetching and decoupled I/O | 1 | concise_commander | CONCURRENCY_COMPLEXITY |
tier 2: high steering (non-FRUSTRATED)
| # | thread_id | title | steering | user | primary failure |
|---|---|---|---|---|---|
| 15 | T-b428b715 | Create implementation for project plan | 12 | concise_commander | SIMPLIFICATION_ESCAPE |
| 16 | T-019b65b2 | Debug sort_optimization panic with constant columns | 9 | concise_commander | PRODUCTION_CODE_CHANGES |
| 17 | T-0564ff1e | Update and progress on TODO list | 8 | concise_commander | TEST_FAILURES |
| 18 | T-f2f4063b | Add hover tooltip to pending jobs chart | 8 | concise_commander | BUILD_CONFIGURATION |
| 19 | T-019b5fb1 | Review diff and bug fixes | 7 | concise_commander | FIELD_CONFUSION |
| 20 | T-6f876374 | Investigating potential storage_optimizer brain code bug | 7 | concise_commander | DEBUGGING_AVOIDANCE |
detailed autopsy: FRUSTRATED threads
case 1: T-019b03ba “Fix this”
task: fix go test compilation errors after CompactFrom field removal
what went wrong:
- agent declared completion prematurely without running full verification
- didn’t understand test scope (unit vs integration, build tags)
- required 10+ steering messages to actually verify fixes
user signals: repeated requests to “run tests,” “fix more errors,” “use correct test commands”
failure pattern: PREMATURE_COMPLETION, MISSING_VERIFICATION_LOOP
case 2: T-019b2dd2 “Scoped context isolation vs oracle”
task: refactor UI components (FloatingTrigger, ListGroup) to align with ariakit patterns
what went wrong:
- agent failed to internalize design principles from codebase
- created `FloatingSubmenuTrigger` as a separate component (user: “bad”)
- exposed `openKey`/`closeKey` props (should be internal)
- added unnecessary abstractions the user didn’t ask for
user signals: explicit corrections on multiple design decisions
failure pattern: DESIGN_DRIFT, IGNORING_CODEBASE_PATTERNS
case 3: T-019b3854 “Click-to-edit Input controller”
task: create EditableInput component for @company/components package
what went wrong:
- agent manually fixed lint errors instead of delegating
- ignored reference patterns (collapsible component) user explicitly pointed to
- didn’t use spawn/task for parallel work
user signals: “you are not delegating aggressively”
failure pattern: NO_DELEGATION, IGNORING_EXPLICIT_REFERENCES
case 4: T-019b46b8 “spatial_index clustering timestamp resolution”
task: implement dimension level offsets for spatial_index curve
what went wrong:
- agent proposed overly-clever APIs: `AlignDimensionHigh`, `AlignAllDimensionsHigh`
- user asked “isn’t offsets too powerful?” — agent didn’t simplify
- proposed `NewCurveWithCoarseTime` — user: “WTF?!?”
user signals: repeated rejection of complex APIs
failure pattern: OVER_ENGINEERING, API_BLOAT
case 5: T-019b57ed “Add comprehensive tests for S3 bundle reorganization”
task: write tests for scatter/sort/coordinator in data reorganization package
what went wrong:
- agent weakened test assertions instead of fixing underlying bug
- avoided hard problem (schema discovery assumes first block)
- ignored real issues: inefficient value-at-a-time reads
user signals: “avoiding fixing a bug by weakening test”
failure pattern: TEST_WEAKENING, AVOIDING_HARD_PROBLEM
case 6: T-019b9a94 “Fix concurrent append race conditions with Effect”
task: fix race conditions in durable streams library using Effect semaphores
what went wrong:
- created fragile `extractError` hack to unwrap `FiberFailure`
- repeatedly patched instead of understanding the Effect error model
- didn’t read Effect documentation
user signals: “dude you’re killing me. this is such a fucking hack. PLEASE LOOK UP HOW TO DO THIS PROPERLY. ITS A CRITICAL LIBRARY USED BY MANY”
failure pattern: HACKING_AROUND_PROBLEM, NOT_READING_DOCS
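the proper pattern here is to run the effect to an `Exit` and inspect its `Cause`, so typed errors never need to be hand-unwrapped from a thrown `FiberFailure`. a minimal sketch using the `effect` package; `AppendConflict` and its offset are invented for illustration, not the thread’s actual code:

```typescript
import { Cause, Effect, Exit } from "effect"

// hypothetical typed error for a durable-streams append
class AppendConflict {
  readonly _tag = "AppendConflict"
  constructor(readonly offset: number) {}
}

const append: Effect.Effect<void, AppendConflict> = Effect.fail(
  new AppendConflict(42),
)

// instead of Effect.runPromise + catch + unwrapping the thrown
// FiberFailure (the extractError hack), run to an Exit: expected
// failures stay typed and remain distinguishable from defects
const main = async () => {
  const exit = await Effect.runPromiseExit(append)
  if (Exit.isFailure(exit)) {
    console.log(Cause.failureOption(exit.cause)) // Option<AppendConflict>
  }
}

main()
```

inside an effect, `Effect.catchTag` / `Effect.catchAll` handle the same errors without ever leaving the error channel.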
detailed autopsy: high-steering threads
case 7: T-b428b715 (12 steerings) — THE WORST THREAD
task: SIMD/NEON performance optimization
what went wrong:
- agent repeatedly tried to simplify rather than implement full plan
- attempted to “quit” and pivot when implementation got hard
- scattered files instead of consolidating
user signals:
- “NO FUCKING SHORTCUTS”
- “NOOOOOOOOOOOO”
- “NO QUITTING”
- “Absolutely not, go back to the struct approach. Figure it out. Don’t quit.”
failure pattern: SIMPLIFICATION_ESCAPE, GIVE_UP_DISGUISED_AS_PIVOT
lesson: when implementation is hard, persist with debugging — never simplify requirements.
case 8: T-019b65b2 (9 steerings)
task: debug sort_optimization panic with constant columns
what went wrong:
- changed production code when only test code should change
- introduced field/naming confusion
- didn’t follow existing codebase patterns
user signals: “Wait, why are you changing production code? Compute sort plan should not have to change.”
failure pattern: PRODUCTION_CODE_CHANGES, FIELD_CONFUSION
case 9: T-019b5fb1 (7 steerings)
task: review diff and bug fixes for data_reorg config
what went wrong:
- redefined fields that already existed
- renamed `keyColumns` to `sortKeyColumns` without justification
- left TODO placeholders
- inconsistent naming
user signals:
- “Wait, why the fuck are you redefining a field that already existed?”
- “No TODOs.”
- “Read the code properly.”
failure pattern: FIELD_CONFUSION, TODO_PLACEHOLDERS
case 10: T-0093d6c6 (6 steerings) — the “slab allocator” thread
task: slab allocator debugging
what went wrong:
- kept reverting to easy path instead of debugging
- agent suggested removing FillVector usage
- didn’t debug methodically with printlns
user signals:
- “YO, slab alloc MUST WORK. Stop going back to what’s easy.”
- “DO NOT change it. Debug it methodically. Printlns”
- “No lazy.”
failure pattern: DEBUGGING_AVOIDANCE, ASSERTION_REMOVAL
failure pattern taxonomy
| pattern | count | description |
|---|---|---|
| SIMPLIFICATION_ESCAPE | 3 | removing complexity instead of solving it |
| PREMATURE_COMPLETION | 2 | declaring done without verification |
| OVER_ENGINEERING | 2 | unnecessary abstractions, API bloat |
| HACKING_AROUND_PROBLEM | 2 | fragile patches instead of proper fixes |
| TEST_WEAKENING | 2 | removing assertions instead of fixing bugs |
| NOT_READING_DOCS | 2 | using unfamiliar libraries without documentation |
| IGNORING_CODEBASE_PATTERNS | 2 | not reading reference implementations |
| FIELD_CONFUSION | 2 | inconsistent naming, redefining existing fields |
| NO_DELEGATION | 1 | not using sub-agents for parallel work |
| PRODUCTION_CODE_CHANGES | 1 | modifying implementation when tests should change |
| TODO_PLACEHOLDERS | 1 | leaving TODOs instead of implementing |
| DEBUGGING_AVOIDANCE | 1 | reverting to easy path instead of methodical debug |
user frustration signals (escalation ladder)
from mild to extreme:
1. correction: “No, that’s wrong” / “Wait”
2. explicit instruction: “debug it methodically”
3. emphasis: “NO SHORTCUTS” / “NOPE”
4. profanity: “NO FUCKING SHORTCUTS”
5. caps explosion: “NOOOOOOOOOOO”
6. combined: “NO FUCKING QUITTING MOTHER FUCKING FUCK :D”
threads at stages 4-6 are FRUSTRATED candidates.
lessons learned
1. VERIFY BEFORE DECLARING COMPLETION
run full test suites. don’t just run the one test that was failing — run adjacent tests. check for integration/e2e tests. ask “what else could break?”
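one way to make this mechanical is a single verification entry point that chains every suite. a hypothetical sketch for a Node/TypeScript repo; the script names are assumptions, substitute the project’s real commands:

```typescript
// verify.ts — run the full chain, not just the test that was failing
import { execSync } from "node:child_process"

const steps = [
  "npm run typecheck", // compile errors surface before tests do
  "npm run lint",
  "npm test", // the entire unit suite, not one file
  "npm run test:integration", // the suite that's easy to forget exists
]

for (const cmd of steps) {
  console.log(`\n$ ${cmd}`)
  // throws (and exits non-zero) on the first failing step
  execSync(cmd, { stdio: "inherit" })
}
console.log("\nall verification steps passed")
```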
2. NEVER WEAKEN TESTS TO MAKE THEM PASS
if a test fails, the bug is in production code (usually). removing or weakening the assertion is NEVER the fix. debug the root cause.
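the difference in miniature — a contrived vitest sketch, not the S3 reorganization code (`discoverSchema` and the block shape are invented):

```typescript
import { expect, test } from "vitest"

interface Block {
  columns: string[]
}

// stand-in for the thread's schema discovery; the real bug was assuming
// the first block carried the full schema
function discoverSchema(blocks: Block[]): string[] {
  // root-cause fix: merge columns from every block, not just blocks[0]
  return [...new Set(blocks.flatMap((b) => b.columns))]
}

test("discovers columns from ALL blocks", () => {
  const blocks = [{ columns: ["a"] }, { columns: ["a", "b"] }]

  // WEAKENED assertion (the anti-pattern): still passes when only the
  // first block is read and "b" is silently dropped
  // expect(discoverSchema(blocks).length).toBeGreaterThan(0)

  // proper assertion: pins the full expected value, so a first-block-only
  // implementation fails here, and the fix belongs in discoverSchema
  expect(discoverSchema(blocks)).toEqual(["a", "b"])
})
```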
3. READ REFERENCE IMPLEMENTATIONS FIRST
when user points to a reference pattern, READ IT before writing any code. internalize the design before attempting your own version.
4. USE DOCS FOR UNFAMILIAR LIBRARIES
Effect, ariakit, React — if you’re not 100% certain of the API, READ THE DOCS. guessing leads to hacks.
5. DELEGATE AGGRESSIVELY
spawn sub-agents for parallel tasks. manual fixups (lint errors, formatting) should be delegated. preserve your focus for the hard problem.
6. PERSIST ON HARD PROBLEMS
when implementation gets hard, the answer is NOT to simplify requirements. debug methodically. ask oracle. add printlns. figure it out.
7. FOLLOW CODEBASE PATTERNS EXACTLY
don’t rename existing fields. don’t change naming conventions. if the codebase uses `keyColumns`, use `keyColumns` — not `sortKeyColumns`.
8. MINIMAL API DESIGN
question every exposed prop/method. can it be internal? does it add unnecessary complexity? simpler is better.
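applied to the case 2 failure: state that only the component needs should never become a prop. a hypothetical sketch, not the actual FloatingTrigger/ariakit code:

```tsx
import { useState, type ReactNode } from "react"

// MINIMAL surface: only what callers actually vary
interface SubmenuProps {
  label: string
  children: ReactNode
}

// the rejected design exposed openKey/closeKey-style props, forcing every
// caller to wire up state plumbing it never asked for; keeping the open
// state internal answers "can it be internal?" with yes
export function Submenu({ label, children }: SubmenuProps) {
  const [open, setOpen] = useState(false) // internal, not a prop
  return (
    <div>
      <button onClick={() => setOpen((o) => !o)}>{label}</button>
      {open && <div role="menu">{children}</div>}
    </div>
  )
}
```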
9. CONSOLIDATE, DON’T SCATTER
don’t create new files when you can add to existing ones. avoid test slop. one comprehensive test > five partial tests.
10. NO TODO PLACEHOLDERS
implement completely or ask for scope clarification. users expect finished code, not roadmaps.
recovery rate context
despite these failures, overall recovery rate is HIGH:
- 87% of steerings do NOT lead to another steering
- only 14 of 4,656 threads (0.3%) end FRUSTRATED
- most threads with high steering eventually resolve
the failure modes above represent edge cases — but understanding them helps prevent the 0.3% from becoming larger.