Overview
LLMs have hard context limits (roughly 1M tokens at the high end, and output quality often degrades well before 200K) that haven't grown much even as other capabilities have advanced. Subagents offer a way around this for larger tasks: the parent agent dispatches fresh copies of itself, each with its own new context window, so complex work can proceed without consuming the main agent's scarce context space.
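A minimal sketch of the dispatch mechanic described above, assuming a hypothetical `call_model` function standing in for any LLM API (all names here are illustrative, not a real library):

```python
# Hypothetical sketch of subagent dispatch; `call_model` stands in
# for a real LLM API call, and the names are illustrative only.

def call_model(messages):
    # Placeholder for a real model call (e.g. an HTTP request).
    # Returns a canned answer so the sketch is runnable.
    return "Summary: auth is handled in src/auth/."

def run_subagent(task_prompt):
    # Key idea: the subagent starts with a FRESH message list --
    # none of the parent's accumulated context is carried over.
    messages = [{"role": "user", "content": task_prompt}]
    return call_model(messages)

def parent_agent():
    parent_context = ["...long conversation history..."]
    # Only the subagent's final answer re-enters the parent's
    # context, not the tokens the subagent spent exploring.
    result = run_subagent("Explore the repo and report where auth lives.")
    parent_context.append(result)
    return parent_context

history = parent_agent()
```

The point of the sketch is the asymmetry: the subagent may burn through its whole fresh window reading files, but the parent pays only for the short result string.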
The Breakdown
- Context limits are a major bottleneck - LLMs max out around 1M tokens but work better under 200K, creating challenges for complex tasks
- Subagents work like tool calls - the parent agent dispatches a fresh copy of itself with a new context window and specific prompt to achieve a sub-goal
- Claude Code uses an Explore subagent pattern to analyze codebases before making changes, avoiding context pollution in the main agent
- Models can effectively prompt themselves - the parent agent writes a focused, well-crafted prompt for each subagent, directing it toward a specific exploration or analysis sub-goal
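The "subagents work like tool calls" framing above can be made concrete: the subagent is exposed to the parent model as an ordinary tool, here modeled loosely on an Explore-style task. This is an assumed sketch, not Claude Code's actual API; `EXPLORE_TOOL`, `handle_tool_call`, and `run_explore_subagent` are hypothetical names.

```python
# Illustrative sketch: a subagent surfaced as a tool the parent
# model can invoke. All identifiers here are assumptions, not a
# real Claude Code interface.

EXPLORE_TOOL = {
    "name": "explore",
    "description": "Dispatch a subagent with a fresh context window "
                   "to read files and answer a codebase question.",
    "input_schema": {
        "type": "object",
        "properties": {"prompt": {"type": "string"}},
        "required": ["prompt"],
    },
}

def run_explore_subagent(prompt):
    # Fresh context: the subagent sees only this prompt, not the
    # parent's history. A real implementation would loop over
    # file-reading tool calls before producing its report.
    return f"Report for {prompt!r}: 3 relevant files found."

def handle_tool_call(name, args):
    # The prompt in `args` was written by the parent model itself --
    # the model prompting a fresh copy of itself.
    if name == "explore":
        return run_explore_subagent(args["prompt"])
    raise ValueError(f"unknown tool: {name}")

report = handle_tool_call("explore", {"prompt": "Where is retry logic?"})
```

Treating the subagent as just another tool keeps the parent's loop simple: it emits a tool call, receives a compact report, and never sees the exploration tokens.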