01
Problem
Running multiple AI agents in parallel for complex, multi-week projects creates significant coordination challenges:
- Flat structures lead to conflicts, duplicated work, and agents stepping on each other
- Dynamic coordination through shared files with locking becomes a bottleneck: most agents spend their time waiting for locks rather than working
- Equal-status agents become risk-averse, avoiding difficult tasks and making only small, safe changes instead of tackling end-to-end implementation
- No agent takes ownership of hard problems or overall project direction
02
Solution
Separate agent roles into a hierarchical planner-worker structure:
- Planners: Continuously explore the codebase and create tasks. They can spawn sub-planners for specific areas, making planning itself parallel and recursive.
- Workers: Pick up tasks and focus entirely on completing them. They don't coordinate with other workers or worry about the big picture. They grind on their assigned task until done, then push changes.
- Judge: At the end of each cycle, decides whether the goal has been achieved or another cycle is needed.
This creates an iterative cycle where each iteration starts fresh, combating drift and tunnel vision.
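The cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Project`, `Task`, `planner`, `worker`, `judge`, and `run` are hypothetical names, the "codebase" is reduced to a set of strings, and the roles are plain functions rather than model-backed agents.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    description: str
    done: bool = False


@dataclass
class Project:
    """Hypothetical stand-in for a codebase: a goal plus the work merged so far."""
    goal: set
    completed: set = field(default_factory=set)


def planner(project: Project) -> list:
    # A real planner would explore the codebase with a model. This stand-in
    # simply lists whatever part of the goal is still missing.
    return [Task(item) for item in sorted(project.goal - project.completed)]


def worker(task: Task) -> Task:
    # A worker sees only its own task and never coordinates with other workers.
    task.done = True
    return task


def judge(project: Project) -> bool:
    # At the end of a cycle, decide whether the goal has been reached.
    return project.goal <= project.completed


def run(project: Project, max_cycles: int = 10, batch_size: int = 2) -> Optional[int]:
    """Run planner -> workers -> judge cycles.

    Returns the number of cycles used, or None if the goal was not reached.
    """
    for cycle in range(1, max_cycles + 1):
        # Each cycle starts fresh: tasks are re-planned from the current state
        # instead of being carried over from a long-running context.
        tasks = planner(project)[:batch_size]
        with ThreadPoolExecutor(max_workers=batch_size) as pool:
            results = list(pool.map(worker, tasks))
        # "Push changes": merge finished work back into the shared project.
        project.completed.update(t.description for t in results if t.done)
        if judge(project):
            return cycle
    return None
```

With `Project(goal={"parser", "renderer", "network"})` and the default batch size of two, the first cycle completes two items and the second completes the third, at which point the judge stops the loop.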
03
How to use it
Use cases for planner-worker separation:
- Massive codebases: Projects that would take human teams months (1M+ lines of code, 1000+ files)
- Ambitious goals: Building complex systems from scratch (web browser, Windows emulator, Excel clone)
- Large-scale migrations and implementations: In-place framework migrations (Solid to React) and large self-contained implementations (a Java LSP)
- Performance optimization: Complete rewrites in different languages for speed (C++ to Rust)
Implementation considerations:
- Model choice per role: Different models excel at different roles. Choose the model for each role separately; a model that plans well can be the better planner even when a coding-focused model is the right choice for workers.
- Fresh starts: Each cycle should start fresh to combat drift and tunnel vision from long-running contexts.
- Parallel planning: Planners can spawn sub-planners, making the planning process itself parallel and recursive.
- Worker isolation: Workers should be task-focused and not worry about coordination with other workers.
Prompting is critical: Getting agents to coordinate well, avoid pathological behaviors, and maintain focus over long periods requires extensive experimentation with prompts.
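The parallel planning consideration can be made concrete with a small sketch of recursive sub-planner spawning. The `Area` type and `plan` function are hypothetical, and the "planning" here is a placeholder that emits one task per leaf area; a real planner would call a model to decide how to split the work.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Area:
    """Hypothetical slice of a codebase that a planner is responsible for."""
    name: str
    subareas: tuple = ()


def plan(area: Area) -> list:
    """Return task descriptions for an area, delegating subareas to sub-planners."""
    if not area.subareas:
        # Leaf area: small enough to hand to a single worker as one task.
        return [f"implement {area.name}"]
    # Spawn one sub-planner per subarea and run them in parallel. Each level
    # creates its own pool, so nested planners never wait on a shared pool.
    with ThreadPoolExecutor(max_workers=len(area.subareas)) as pool:
        nested = list(pool.map(plan, area.subareas))
    # Flatten the sub-planners' task lists, preserving subarea order.
    return [task for tasks in nested for task in tasks]


browser = Area("browser", (
    Area("layout", (Area("flexbox"), Area("grid"))),
    Area("network"),
))
tasks = plan(browser)
```

Workers would then pick up the entries of `tasks` independently, without needing to know which planner produced them.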
04
Trade-offs
Pros:
- Scalability: Hundreds of agents can work concurrently on a single codebase for weeks
- Clear ownership: Planners own the big picture; workers own task completion
- Parallel planning: Planning itself scales through sub-planner spawning
- Reduced coordination overhead: Workers don't need to coordinate with each other
- Combats tunnel vision: Iterative cycles with fresh starts prevent drift
Cons:
- System complexity: Requires orchestration infrastructure for role separation and task distribution
- Prompt engineering difficulty: Coordination behavior requires extensive prompt experimentation
- Cost: Running hundreds of concurrent agents for weeks is expensive
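- Not perfectly efficient: A significant share of tokens is wasted, although the approach proved far more effective than expected
- Still evolving: Open issues remain. Planners should wake up when their tasks complete so they can plan the next steps, and agents sometimes run far too long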
06
References
- Scaling long-running autonomous coding - Cursor blog post on running hundreds of concurrent agents for weeks at a time
- Browser source code on GitHub - 1M+ lines of agent-generated code
- Feudal Networks (FuN) - ICML 2017 paper introducing manager-worker separation in hierarchical RL (Vezhnevets et al.)
- The Options Framework - Seminal work on temporal abstraction creating planning-execution hierarchy (Sutton et al., 1999)
- HIRO: Hierarchical RL with Off-Policy Correction - NeurIPS 2018 paper ("Data-Efficient Hierarchical Reinforcement Learning") on high-level planners and low-level workers (Nachum et al.)