Problem
Large multi-file tasks blow out the main agent's context window and reasoning budget. You need a way to delegate work to specialized agents with isolated contexts and tools.
Solution
Let the main agent spawn focused sub-agents, each with its own fresh context, to work in parallel on shardable subtasks. Aggregate their results when done.
Critical requirement: Each subagent invocation must have a clear, specific task subject for traceability. Empty or generic subjects make parallel work untraceable and synthesis difficult. See Subject Hygiene for details.
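The fan-out and join described above can be sketched as follows. This is a minimal illustration rather than any particular framework's API: `run_subagent`, `fan_out`, and `SubagentResult` are hypothetical names, and the subagent call is a placeholder that never contacts a model. Empty or generic subjects are rejected so that every result can be traced back to its task.

```python
import asyncio
from dataclasses import dataclass

# Subjects considered too generic to trace (an illustrative, assumed list).
GENERIC_SUBJECTS = {"task", "subtask", "todo"}


@dataclass
class SubagentResult:
    subject: str
    output: str


async def run_subagent(subject: str, prompt: str) -> SubagentResult:
    """Hypothetical stand-in for a real subagent call (no LLM is invoked)."""
    cleaned = subject.strip()
    if not cleaned or cleaned.lower() in GENERIC_SUBJECTS:
        raise ValueError("subagent needs a clear, specific subject")
    await asyncio.sleep(0)  # placeholder for the real, I/O-bound call
    return SubagentResult(subject=cleaned, output=f"done: {prompt}")


async def fan_out(tasks: dict[str, str]) -> dict[str, str]:
    """Run one subagent per (subject, prompt) pair and join on completion."""
    results = await asyncio.gather(
        *(run_subagent(subject, prompt) for subject, prompt in tasks.items())
    )
    # Key results by subject so synthesis can trace each output to its task.
    return {r.subject: r.output for r in results}


if __name__ == "__main__":
    shards = {
        "Migrate auth module imports": "Rewrite imports in auth/",
        "Migrate billing module imports": "Rewrite imports in billing/",
    }
    print(asyncio.run(fan_out(shards)))
```

Keying the aggregated results by subject is one simple way to keep parallel work traceable; a real system would likely also record timing and failures per subject.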
Implementation approaches:
How to use it
Use cases for subagents:
- Context window management: Process large files in subagents without polluting the main context
  - Upload files to the subagent
  - Extract specific data
  - Return a summary to the main agent
- Concurrent work: Run multiple subagents in parallel and join on completion
  - Reduces wall-clock time for I/O-bound workflows
  - Network API calls can happen simultaneously
- Code-driven LLM invocation: Hand off control to an LLM for a specific determination
  - Code workflow calls the subagent
  - Subagent makes an LLM-powered decision
  - Control returns to code with the result
- Security isolation: Separate tools and contexts into mutually isolated subagents
  - External resource retrieval is isolated from internal access
  - Reduced blast radius for sensitive operations
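The code-driven invocation case can be illustrated with a small sketch. It assumes the subagent is exposed to the workflow as a plain callable; `classify_ticket`, `fake_llm`, `route`, and the label set are hypothetical names invented for this example, and `fake_llm` is a stub rather than a real model call.

```python
from typing import Callable

# Labels the surrounding code is prepared to handle (assumed for illustration).
ALLOWED_LABELS = {"bug", "feature", "question"}


def classify_ticket(text: str, llm: Callable[[str], str]) -> str:
    """Ask a subagent for one label, then hand control back to code."""
    prompt = f"Classify this ticket as one of {sorted(ALLOWED_LABELS)}:\n{text}"
    answer = llm(prompt).strip().lower()
    # Code, not the model, decides what counts as a valid result.
    return answer if answer in ALLOWED_LABELS else "question"


def fake_llm(prompt: str) -> str:
    """Hypothetical stub standing in for a real subagent call."""
    return "Bug" if "crash" in prompt.lower() else "unsure"


def route(text: str) -> str:
    """Deterministic workflow step that uses the subagent's decision."""
    label = classify_ticket(text, fake_llm)
    return {"bug": "oncall", "feature": "product", "question": "support"}[label]
```

The workflow stays deterministic around the model: the subagent returns a single value, the code validates it against a closed set, and an unrecognized answer falls back to a safe default.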
Declarative subagent setup:
# agents.yaml
subagents:
  planning:
    file: subagents/planning.yaml
    allowed_in:
      - main_agent
      - research_agent
  think:
    file: subagents/think.yaml
    allowed_in:
      - main_agent
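A sketch of how such a config might be enforced at spawn time. It assumes `allowed_in` lists the agents permitted to invoke each subagent, which is an interpretation of the config above and not confirmed by the source. The registry is written as a plain dict that mirrors the YAML so the example needs no parser, and `can_spawn` is a hypothetical helper.

```python
# Mirrors the agents.yaml structure as a plain dict (no YAML parser needed).
SUBAGENTS = {
    "planning": {
        "file": "subagents/planning.yaml",
        "allowed_in": ["main_agent", "research_agent"],
    },
    "think": {
        "file": "subagents/think.yaml",
        "allowed_in": ["main_agent"],
    },
}


def can_spawn(caller: str, subagent: str, registry: dict = SUBAGENTS) -> bool:
    """Return True if `caller` is listed in the subagent's allowed_in.

    Assumption: allowed_in names the agents that may invoke the subagent.
    Unknown subagents are denied rather than raising.
    """
    entry = registry.get(subagent)
    return entry is not None and caller in entry.get("allowed_in", [])


if __name__ == "__main__":
    print(can_spawn("research_agent", "planning"))
    print(can_spawn("research_agent", "think"))
```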
Virtual file passing:
# Main agent
result = subagent(
    agent_name="planning",
    prompt="Analyze these files and create migration plan",
    files=["file1.ts", "file2.ts", "file3.ts"],
)
# Only these 3 files visible to planning subagent
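One way to picture the isolation behind virtual file passing is a read-only view that exposes only the files named in the call. The `VirtualFiles` class below is a hypothetical sketch over an in-memory workspace, not the mechanism any specific implementation uses.

```python
class VirtualFiles:
    """Read-only view exposing only the explicitly passed files."""

    def __init__(self, workspace: dict[str, str], visible: list[str]):
        missing = [name for name in visible if name not in workspace]
        if missing:
            raise FileNotFoundError(f"unknown files: {missing}")
        # Copy only the visible entries; nothing else is reachable.
        self._files = {name: workspace[name] for name in visible}

    def names(self) -> list[str]:
        """List the files this subagent is allowed to see."""
        return sorted(self._files)

    def read(self, name: str) -> str:
        """Return a file's contents, refusing anything outside the view."""
        if name not in self._files:
            raise PermissionError(f"{name} is not visible to this subagent")
        return self._files[name]


if __name__ == "__main__":
    workspace = {
        "file1.ts": "export const a = 1;",
        "file2.ts": "export const b = 2;",
        "secrets.env": "TOKEN=...",
    }
    view = VirtualFiles(workspace, ["file1.ts", "file2.ts"])
    print(view.names())
```

Copying the selected entries instead of holding a reference to the whole workspace means the subagent cannot reach other files even by guessing their names.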
Recursive architecture insight:
Some implementations treat every agent as a subagent, enabling flexible composition and consistent behavior across the system.
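A toy sketch of that idea: if every agent shares one interface, the main agent is just the root of a tree and delegation is the same call at every level. The `Agent` class and its `run` method are hypothetical and do no real work beyond recording a trace.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    children: list["Agent"] = field(default_factory=list)

    def run(self, task: str, depth: int = 0) -> list[str]:
        """Handle a task, delegating to children through the same interface."""
        trace = [f"{'  ' * depth}{self.name}: {task}"]
        for child in self.children:
            # A child is invoked exactly like the root, one level deeper.
            trace.extend(child.run(f"{task} / {child.name}", depth + 1))
        return trace


if __name__ == "__main__":
    main = Agent("main", [Agent("planning", [Agent("think")])])
    print("\n".join(main.run("migrate")))
```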
Trade-offs
Pros:
- Context isolation: Each subagent has a clean context window
- Parallelization: Reduces workflow latency through concurrent execution
- Specialization: Different subagent types for different tasks (planning, thinking, analysis)
- Virtual files: Precise control over what each subagent can see
- Tool scoping: Limit subagent capabilities for security/simplicity
- Declarative config: Reusable subagent definitions via YAML
Cons:
- Overhead: Spawning and coordinating subagents adds complexity
- Cost: Running multiple agents simultaneously increases token usage
- Coordination: Main agent must track and aggregate subagent results
- Not always necessary: Will Larson's write-up (see References) notes that the team "frequently thought we needed subagents, then found more natural alternative"
- Latency visibility: User-facing latency is an "invisible feature" until it becomes problematic
When subagents matter most:
- Context window management (large file processing)
- I/O-bound workflows (network API calls)
- Code-driven workflows needing LLM delegation
- Massive parallelization needs (10+ concurrent agents)
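When many subagents run at once, capping concurrency is one way to keep cost and rate limits in check. The sketch below uses `asyncio.Semaphore` for that; `run_bounded` is a hypothetical helper, the sleep is a placeholder for a real subagent call, and the default limit of 4 is arbitrary.

```python
import asyncio


async def run_bounded(prompts: list[str], limit: int = 4) -> tuple[list[str], int]:
    """Run one placeholder subagent per prompt, at most `limit` at a time.

    Returns the results in input order plus the peak concurrency observed.
    """
    sem = asyncio.Semaphore(limit)
    running = 0
    peak = 0

    async def worker(prompt: str) -> str:
        nonlocal running, peak
        async with sem:
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # placeholder for the real subagent call
            running -= 1
            return f"done: {prompt}"

    results = await asyncio.gather(*(worker(p) for p in prompts))
    return list(results), peak


if __name__ == "__main__":
    outputs, peak = asyncio.run(run_bounded([f"shard {i}" for i in range(12)]))
    print(len(outputs), "results, peak concurrency", peak)
```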
References
- SKILLS-AGENTIC-LESSONS.md - Analysis of 88 sessions emphasizing clear task subjects and parallel delegation patterns
- Vezhnevets, A., et al. (2017). Feudal Networks for Hierarchical Reinforcement Learning. ICML. - Manager-worker separation with goal-setting in latent space
- Raising An Agent - Episode 6: Claude 4 Sonnet edits 36 blog posts via four sub-agents.
- Boris Cherny (Anthropic) on swarm migrations for framework changes and lint rules
- AI & I Podcast: How to Use Claude Code Like the People Who Built It
- Cognition AI: Devin & Claude Sonnet 4.5 - discusses how improved model judgment about state externalization may make subagent delegation more practical
- Building Companies with Claude Code - Ambral's "robust research engine" uses dedicated sub-agents specialized for different data types, enabling parallel research across system areas
- Building an internal agent: Subagent support - Will Larson on YAML-configured subagents with virtual file isolation and code-driven LLM invocation
- Cursor: Scaling long-running autonomous coding - Hierarchical spawning with hundreds of concurrent agents validated in production