the case of collaboration

why do some developers ship features with AI in an afternoon, while others get stuck and give up completely?

4,656 threads
208,799 messages
113 ai detective agents spawned
9 months, may 2025 – jan 2026

the investigation

to find out, we needed data. a lot of data. we parsed nearly 209,000 messages between developers and their AI partners across an engineering team. but here's the twist—we didn't crunch these numbers ourselves.

we spun up over 100 AI agents to act as our private detective agency. they sifted through their own kind's interaction logs to find the hidden patterns of what works and what doesn't.
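
the exact pipeline isn't shown here, but a minimal sketch of the idea, assuming one JSON log per thread and treating the file layout, field names, and agent call as illustrative placeholders, might look like this:

```python
# minimal sketch: load thread logs, split them into batches, and hand each
# batch to an analysis "detective". everything here is an assumption for
# illustration, not the actual pipeline.
import json
from pathlib import Path

N_AGENTS = 113  # one detective per batch of threads

def load_threads(log_dir: str) -> list[dict]:
    """read every thread log, assuming one JSON file per thread."""
    return [json.loads(p.read_text()) for p in Path(log_dir).glob("*.json")]

def partition(threads: list[dict], n: int) -> list[list[dict]]:
    """split threads into n roughly equal batches, one per agent."""
    return [threads[i::n] for i in range(n)]

def spawn_analysis_agent(batch: list[dict]) -> dict:
    """stand-in for the real agent call, which ran LLM analysis over a batch.
    here it just counts resolved threads so the sketch stays runnable."""
    return {"threads": len(batch),
            "resolved": sum(t.get("resolved", False) for t in batch)}

if __name__ == "__main__":
    threads = load_threads("logs/")
    reports = [spawn_analysis_agent(b) for b in partition(threads, N_AGENTS)]
```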

"NO FUCKING QUITTING MOTHER FUCKING FUCK :D" — real quote from the data
clue #1

the steering paradox

we all assume that if you have to correct the AI—what we call "steering"—then something's gone wrong. it feels like failure. but the data says otherwise.

86%

success with high steering

58%

success with no steering

correcting the AI isn't frustration—it's investment. users who steer are the ones who care enough to push back, to guide, to actually collaborate. they're the ones who get things done.
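
the comparison itself is easy to reproduce on any set of thread logs. a back-of-the-envelope sketch, assuming each record carries a resolved flag and a count of steering messages (field names are assumptions, not the real schema):

```python
# toy records standing in for the real dataset of 4,656 threads
threads = [
    {"resolved": True,  "steering_msgs": 4},
    {"resolved": True,  "steering_msgs": 2},
    {"resolved": False, "steering_msgs": 0},
    {"resolved": True,  "steering_msgs": 0},
]

def success_rate(subset: list[dict]) -> float:
    """fraction of threads in the subset that ended resolved."""
    return sum(t["resolved"] for t in subset) / len(subset) if subset else 0.0

steered = [t for t in threads if t["steering_msgs"] > 0]
unsteered = [t for t in threads if t["steering_msgs"] == 0]
print(f"steered: {success_rate(steered):.0%}  unsteered: {success_rate(unsteered):.0%}")
# on the full dataset the same split comes out around 86% vs 58%
```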

clue #2

the anchor effect

the second clue was hiding in the very first message. we compared abstract prompts like "build me a login page" to anchored prompts—ones grounded in specific context.

+25pp

success boost from including a file path in your first message

66.7% vs 41.8%

anchoring your request in the reality of your actual project is one of the biggest predictors of success we found.
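
one crude way to tag an "anchored" opener is to check whether the first message mentions anything that looks like a file path. the regex and example prompts below are rough illustrations, not the study's actual definition of anchoring:

```python
import re

# very rough heuristic: "mentions something ending in a known source-file extension"
PATH_RE = re.compile(r"[\w./-]+\.(?:py|ts|js|go|rs|java|md|json|yaml)\b")

def is_anchored(first_message: str) -> bool:
    """true if the opener references something resembling a project file."""
    return bool(PATH_RE.search(first_message))

print(is_anchored("build me a login page"))                 # False -> abstract bucket
print(is_anchored("add rate limiting to src/api/auth.py"))  # True  -> anchored bucket
```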

persons of interest

the power users

three distinct archetypes emerged from the data. each has a ridiculously high success rate, but they get there in totally different ways.

the architect

82% resolution rate

writes massive first messages—4,000+ characters. front-loads everything: constraints, goals, full context. treats AI like a junior architect who needs a complete brief.

the efficient operator

67% resolution, lowest steering rate

asks questions instead of issuing commands ("how should we approach this?"). frequent small approvals keep the AI perfectly on track with almost zero friction.

the marathon runner

86+ avg turns, hardest problems

runs the longest conversations by far. uses the socratic method, constantly probing the AI's logic. their superpower is sheer, relentless persistence.
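
if you want to tag your own threads with these profiles, a toy rule-of-thumb classifier using the numbers above as cut-offs could look like this. field names and thresholds are assumptions; the real archetypes came out of the detectives' analysis, not a hand-written rule:

```python
def archetype(thread: dict) -> str:
    """rough heuristic labels based on the traits described above."""
    if len(thread["first_message"]) >= 4000:
        return "architect"           # massive, front-loaded brief
    if thread["turns"] >= 86:
        return "marathon runner"     # long, persistent conversation
    if thread["question_ratio"] > 0.5 and thread["steering_msgs"] <= 1:
        return "efficient operator"  # asks, approves, rarely corrects
    return "unclassified"

example = {"first_message": "x" * 5000, "turns": 12,
           "question_ratio": 0.1, "steering_msgs": 3}
print(archetype(example))  # -> architect
```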

case solved

the playbook

we distilled all power user habits into an actionable four-week plan.

1. context quality: always include file paths in your opener

2. approval cadence: give quick, frequent feedback

3. prevent steering: ask better questions up front

4. build persistence: don't give up on a thread too early

what power users avoid

  • letting AI hide errors to move on
  • using planning tools as panic buttons
  • over-delegating with dozens of subtasks

the story behind the story

agents analyzing agents

this entire investigation wasn't just about AI collaboration—it was a perfect example of it. those 100+ AI detectives? they analyzed conversations of their own kind to find success patterns.

which creates an incredible feedback loop: we used agents to analyze agents, gaining insights that will change how we work with agents—creating new data that can be analyzed again.

"the unexamined thread is not worth starting"

the key to success isn't just what you ask. it's engagement, persistence, and reflection.