Sub-Agent Spawning

Problem

Large multi-file tasks blow out the main agent's context window and reasoning budget. You need a way to delegate work to specialized agents with isolated contexts and tools.

Solution

Let the main agent spawn focused sub-agents, each with its own fresh context, to work in parallel on shardable subtasks. Aggregate their results when done.

Critical requirement: Each subagent invocation must have a clear, specific task subject for traceability. Empty or generic subjects make parallel work untraceable and synthesis difficult. See Subject Hygiene for details.
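One way to enforce the subject requirement is to validate the subject before any subagent is spawned. The sketch below is a minimal illustration, not a prescribed implementation: the function name `validate_subject`, the `GENERIC_SUBJECTS` deny-list, and the three-word minimum are all assumptions chosen for the example.

```python
import re

# Hypothetical deny-list of subjects too generic to trace; tune for your own system.
GENERIC_SUBJECTS = {"task", "subtask", "help", "work", "todo", "misc", "analysis"}


def validate_subject(subject: str, min_words: int = 3) -> str:
    """Return a whitespace-normalized subject, or raise ValueError if it is empty or generic."""
    normalized = re.sub(r"\s+", " ", subject).strip()
    if not normalized:
        raise ValueError("subagent subject must not be empty")
    if normalized.lower() in GENERIC_SUBJECTS:
        raise ValueError(f"subagent subject is too generic: {normalized!r}")
    if len(normalized.split()) < min_words:
        raise ValueError(
            f"subagent subject needs at least {min_words} words: {normalized!r}"
        )
    return normalized
```

Rejecting a bad subject at spawn time is cheaper than untangling anonymous results during synthesis.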

Implementation approaches:

How to use it

Use cases for subagents:

Context window management: Process large files in subagents without polluting main context
- Upload files to subagent
- Extract specific data
- Return summary to main agent
Concurrent work: Run multiple subagents in parallel, join on completion
- Reduces wall-clock time for I/O-bound workflows
- Network API calls can happen simultaneously
Code-driven LLM invocation: Hand off control to an LLM for a specific decision
- Code workflow calls subagent
- Subagent makes LLM-powered decision
- Control returns to code with result
Security isolation: Separate tools/contexts in mutually isolated subagents
- External resource retrieval isolated from internal access
- Reduced blast radius for sensitive operations
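The concurrent-work use case above (run in parallel, join on completion) can be sketched with `asyncio`. This is an illustration under stated assumptions: `run_subagent` is a hypothetical stand-in that only sleeps to simulate I/O, and `fan_out` and `max_concurrency` are names invented for the example. A real system would replace the stand-in with its own subagent call.

```python
import asyncio


async def run_subagent(subject: str, prompt: str) -> dict:
    """Hypothetical stand-in for a real subagent call (LLM or network API)."""
    await asyncio.sleep(0.01)  # simulates I/O latency
    return {"subject": subject, "summary": f"done: {prompt}"}


async def fan_out(tasks: dict[str, str], max_concurrency: int = 4) -> dict[str, object]:
    """Run one subagent per (subject, prompt) pair and join on completion."""
    limit = asyncio.Semaphore(max_concurrency)  # caps simultaneous subagents

    async def guarded(subject: str, prompt: str) -> dict:
        async with limit:
            return await run_subagent(subject, prompt)

    results = await asyncio.gather(
        *(guarded(subject, prompt) for subject, prompt in tasks.items()),
        return_exceptions=True,  # a failed shard is returned as an exception, not raised
    )
    # gather preserves input order, so results line up with the subjects.
    return dict(zip(tasks.keys(), results))
```

Keying the results by subject keeps each shard traceable during aggregation, and the semaphore bounds cost when the number of shards is large.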

Declarative subagent setup:

# agents.yaml
subagents:
  planning:
    file: subagents/planning.yaml
    allowed_in:
      - main_agent
      - research_agent

  think:
    file: subagents/think.yaml
    allowed_in:
      - main_agent
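
A configuration like this can be enforced at spawn time. The sketch below assumes that `allowed_in` lists the agents permitted to invoke a given subagent, which is an interpretation of the config above rather than a documented rule. `CONFIG` is the dictionary a YAML loader would be expected to produce from that file, and `resolve_subagent` is a hypothetical helper name.

```python
# Parsed form of the agents.yaml above, written out by hand for the example.
CONFIG = {
    "subagents": {
        "planning": {
            "file": "subagents/planning.yaml",
            "allowed_in": ["main_agent", "research_agent"],
        },
        "think": {
            "file": "subagents/think.yaml",
            "allowed_in": ["main_agent"],
        },
    }
}


def resolve_subagent(config: dict, caller: str, name: str) -> str:
    """Return the definition file for `name` if `caller` is allowed to spawn it."""
    entry = config["subagents"].get(name)
    if entry is None:
        raise KeyError(f"unknown subagent: {name}")
    # Assumption: allowed_in names the agents that may invoke this subagent.
    if caller not in entry.get("allowed_in", []):
        raise PermissionError(f"{caller} may not spawn subagent {name!r}")
    return entry["file"]
```

Under this reading, `research_agent` can spawn `planning` but not `think`.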

Virtual file passing:

# Main agent
result = subagent(
    agent_name="planning",
    prompt="Analyze these files and create migration plan",
    files=["file1.ts", "file2.ts", "file3.ts"]
)
# Only these 3 files visible to planning subagent
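
The visibility restriction can be modeled as a read-only mapping that contains only the files passed in. This is a minimal sketch, assuming the workspace is an in-memory dictionary of file name to contents. `build_virtual_files` and `read_file` are hypothetical names, not part of any particular framework.

```python
from types import MappingProxyType


def build_virtual_files(workspace: dict[str, str], files: list[str]) -> MappingProxyType:
    """Copy only the requested files into a read-only view for the subagent."""
    missing = [name for name in files if name not in workspace]
    if missing:
        raise FileNotFoundError(f"not in workspace: {missing}")
    return MappingProxyType({name: workspace[name] for name in files})


def read_file(view: MappingProxyType, name: str) -> str:
    """The only file tool the subagent gets; anything outside the view is invisible."""
    if name not in view:
        raise FileNotFoundError(name)
    return view[name]
```

Because the view is a copy wrapped in `MappingProxyType`, the subagent can neither read files it was not given nor add entries to the mapping.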

Recursive architecture insight:

Some implementations treat every agent as a subagent, enabling flexible composition and consistent behavior across the system.
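
A rough sketch of that idea: if the top-level agent and its children are the same type, spawning is just constructing another instance with a fresh context. The `Agent` class, its fields, and the `max_depth` guard are assumptions made for illustration, and `run` returns a label instead of calling a model.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    depth: int = 0
    max_depth: int = 2  # hypothetical guard against runaway recursion
    context: list[str] = field(default_factory=list)  # fresh list per agent

    def spawn(self, name: str) -> "Agent":
        """Create a child of the same type with its own empty context."""
        if self.depth >= self.max_depth:
            raise RecursionError(f"{self.name} is at max depth {self.max_depth}")
        return Agent(name=name, depth=self.depth + 1, max_depth=self.max_depth)

    def run(self, task: str) -> str:
        """Record the task in this agent's context only; a real agent would call a model."""
        self.context.append(task)
        return f"{self.name}@{self.depth}: {task}"
```

Here the main agent is simply an `Agent` at depth 0, so any behavior defined once (context handling, depth limits) applies uniformly at every level.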

Trade-offs

Pros:

Context isolation: Each subagent has clean context window
Parallelization: Reduce workflow latency through concurrent execution
Specialization: Different subagent types for different tasks (planning, thinking, analysis)
Virtual files: Precise control over what each subagent can see
Tool scoping: Limit subagent capabilities for security/simplicity
Declarative config: Reusable subagent definitions via YAML

Cons:

Overhead: Spawning and coordinating subagents adds complexity
Cost: Running multiple agents simultaneously increases token usage
Coordination: Main agent must track and aggregate subagent results
Not always necessary: The author notes they "frequently thought we needed subagents, then found more natural alternative"
Latency visibility: User-facing latency is an "invisible feature" until it becomes a problem

When subagents matter most:

Context window management (large file processing)
I/O-bound workflows (network API calls)
Code-driven workflows needing LLM delegation
Massive parallelization needs (10+ concurrent agents)

References

SKILLS-AGENTIC-LESSONS.md - Analysis of 88 sessions emphasizing clear task subjects and parallel delegation patterns
Vezhnevets, A., et al. (2017). Feudal Networks for Hierarchical Reinforcement Learning. ICML. - Manager-worker separation with goal-setting in latent space
Raising An Agent - Episode 6: Claude 4 Sonnet edits 36 blog posts via four sub-agents.
Boris Cherny (Anthropic) on swarm migrations for framework changes and lint rules
AI & I Podcast: How to Use Claude Code Like the People Who Built It
Cognition AI: Devin & Claude Sonnet 4.5 - discusses how improved model judgment about state externalization may make subagent delegation more practical
Building Companies with Claude Code - Ambral's "robust research engine" uses dedicated sub-agents specialized for different data types, enabling parallel research across system areas
Building an internal agent: Subagent support - Will Larson on YAML-configured subagents with virtual file isolation and code-driven LLM invocation
Cursor: Scaling long-running autonomous coding - Hierarchical spawning with hundreds of concurrent agents validated in production

Source