plan vs execute: thread opening patterns
summary
analyzed 3488 threads to see whether they open with planning/discussion or jump straight to execution.
| approach | threads | success rate | stuck/frustrated rate | avg steering corrections per thread |
|---|---|---|---|---|
| planning first | 578 | 57% | 0% | 0.3 |
| execution first | 2552 | 55% | 0% | 0.4 |
| mixed | 5 | 60% | 0% | 0.8 |
| ambiguous | 20 | 70% | 0% | 0.3 |
key findings
execution-first threads
- 2552 threads (73% of corpus)
- 55% success rate (resolved or committed)
- avg 0.4 steering corrections per thread
- detected by: imperative verbs, file references, continuation markers
planning-first threads
- 578 threads (17% of corpus)
- 57% success rate
- avg 0.3 steering corrections per thread
- detected by: “how should we”, “what’s the best approach”, multiple questions
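the corpus shares above can be reproduced from the summary table. a minimal sketch, assuming the counts are copied verbatim from the table and the total is the 3488 stated in the summary; the names `counts` and `corpus_share` are illustrative only:

```python
# Counts copied from the summary table; total taken from the summary line.
TOTAL_THREADS = 3488
counts = {
    "planning first": 578,
    "execution first": 2552,
    "mixed": 5,
    "ambiguous": 20,
}


def corpus_share(count: int, total: int = TOTAL_THREADS) -> int:
    """Return the share of the corpus as a whole-number percentage."""
    return round(100 * count / total)


shares = {name: corpus_share(n) for name, n in counts.items()}

# Threads in the stated total that do not appear in any table row.
uncategorized = TOTAL_THREADS - sum(counts.values())

for name, pct in shares.items():
    print(f"{name}: {pct}% of corpus")
print(f"not in any table row: {uncategorized}")
```

the four table rows sum to 3155, so 333 of the 3488 threads are not in any row. the report does not say how those threads were handled.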
interpretation
planning-first threads show a slightly higher success rate (57% vs 55%), which suggests that thinking before doing pays off.
execution-first threads need slightly more steering (0.4 vs 0.3 corrections per thread), which suggests that jumping to code without discussion causes some rework.
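one way to gauge how much weight the 57% vs 55% gap can carry is a two-proportion z-test. a rough sketch, assuming the rounded rates and thread counts from the summary table stand in for the exact success counts (which the report does not give); `two_proportion_z` is a hypothetical helper, not part of the original analysis:

```python
import math


def two_proportion_z(p1: float, n1: int, p2: float, n2: int) -> tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Assumption: rounded rates from the table, not exact success counts.
z, p_value = two_proportion_z(0.57, 578, 0.55, 2552)
print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
```

with these rounded inputs z comes out below 1, so on its own the 2-point gap is hard to tell apart from noise. exact per-category success counts would be needed to confirm that.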
hunch
the data does not support the hypothesis that clear, imperative instructions outperform exploratory planning requests. the gap is small, though, possibly because users who start with “implement X” rather than “how should we approach X?” have already done their planning internally.
caveat: planning threads may tackle harder problems by nature. success rates don’t account for task complexity.
examples
execution → success
T-00298580-4ecf-4207-8415-e38e06ae1a24
Continuing work from thread T-de7b065a-b5da-46fa-bf1f-b639c41b514d. When you lack specific information you can use read_thread to get it. @lib/…
T-00a4727e-6b80-47e4-b1c1-f494e30290ef
please look at the way we’re preventing type errors in @lib/ml/test/evals/scorer.types.test.ts by doing stuff like
input;(so that it doesn’t g…
T-019afee0-7141-747f-a5b9-95f000594c4b
Continuing work from thread T-68ca0c69-e390-4f75-ae85-d4dfb6f311dc. When you lack specific information you can use read_thread to get it. @app/dash…
planning → success
T-019b044a-118c-779a-a211-85dc77f84b94
How does this work? Do they reorganize the data in the background to make it efficiently to query? Particularly for time ordered data that is important…
T-019b04a0-a3c3-70dd-94e0-01732f888583
Continuing work from thread T-cc84bf6c-8681-4c77-ab19-702a2d0735ea. When you lack specific information you can use read_thread to get it. @company/j…
T-019b04a7-87af-70b3-b117-ad74c9707e2f
I was chatting with a developer from amp and they told me they have a similar workflow to something I want, they sent this gist: https://gist.github.c…
execution → stuck
T-019b03ba-82d0-741e-98a5-79d97d0147fe (2 steering corrections)
Fix this…
T-019b2dd2-3ee3-7380-8c53-6aab902e5931 (1 steering correction)
Continuing work from thread T-019b2d94-b208-754d-9477-6bc3b7793f07. When you lack specific information you can use read_thread to get it. @lib/c…
planning → stuck
T-019b46b8-544a-7185-a78c-2792f7d1cbef (3 steering corrections)
Continuing work from thread T-019b4689-d2c8-708c-bc26-793932517adc. When you lack specific information you can use read_thread to get it. @docs/desig…
T-019b88a4-5dc7-7079-a2c7-a68d5d8a33c1 (1 steering correction)
following: @T-019b8851-c22a-77ef-84a6-e1f9dba67336 please look at the below output of the e2e job 2026-01-04T10:43:56.7169779Z Current runner versi…
methodology
- planning signals: “how should we”, “what’s the best approach”, “plan”, “design”, multiple questions
- execution signals: imperative verbs at start, file references, “implement”, “fix”, “add”
- success: RESOLVED or COMMITTED status
- stuck: STUCK or FRUSTRATED status
threads with ambiguous or mixed signals are categorized separately.
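the signals above could be combined into a simple opening-message classifier. this is a sketch under stated assumptions, not the detector actually used for this analysis: the regexes, the "two or more question marks" threshold, the three-verb list, and the name `classify_opening` are all illustrative choices.

```python
import re

# Planning signals from the methodology list (whole-word match for plan/design).
PLANNING_PHRASES = ("how should we", "what's the best approach")
PLANNING_WORDS = re.compile(r"\b(plan|design)\b")

# Execution signals: leading imperative verb, @path file reference, continuation marker.
EXECUTION_VERBS = {"implement", "fix", "add"}  # assumed verb list
FILE_REF = re.compile(r"@[\w.-]+/[\w./-]*")  # assumed shape of a file reference
CONTINUATION = re.compile(r"^continuing work from thread")


def classify_opening(message: str) -> str:
    """Label a thread's first message: planning, execution, mixed, or ambiguous."""
    # Normalize case and curly apostrophes before matching.
    text = message.strip().lower().replace("\u2019", "'")

    planning = (
        any(phrase in text for phrase in PLANNING_PHRASES)
        or PLANNING_WORDS.search(text) is not None
        or text.count("?") >= 2  # "multiple questions"
    )

    first_word = re.match(r"[a-z]+", text)
    execution = (
        (first_word is not None and first_word.group() in EXECUTION_VERBS)
        or FILE_REF.search(text) is not None
        or CONTINUATION.match(text) is not None
    )

    if planning and execution:
        return "mixed"
    if planning:
        return "planning"
    if execution:
        return "execution"
    return "ambiguous"
```

for example, `classify_opening("Fix this")` returns `"execution"`, and a message with two question marks and no execution signal returns `"planning"`. keyword heuristics like these mislabel some messages, so the categories are best treated as approximate.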