Overview

LLMs have strict context limits (roughly 1M tokens at most, with quality often degrading beyond 200K) that haven't improved much despite other advances. Subagents provide a way to handle larger tasks by dispatching fresh copies of the agent, each with its own new context window, allowing complex work without burning through the main agent's valuable context space.

The Breakdown

  • Context limits are a major bottleneck - LLMs max out around 1M tokens but tend to perform better under 200K, creating challenges for complex tasks
  • Subagents work like tool calls - the parent agent dispatches a fresh copy of itself with a new context window and specific prompt to achieve a sub-goal
  • Claude Code uses an Explore subagent pattern to analyze codebases before making changes, avoiding context pollution in the main agent
  • Models can effectively prompt themselves - subagents receive focused, well-crafted prompts from their parent agents to accomplish specific exploration or analysis tasks
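The pattern above can be sketched in a few lines. This is a minimal illustration, not any particular framework's API: the function names (`run_model`, `run_subagent`) and the message shape are hypothetical, and the model call is stubbed out where a real agent would call an LLM API.

```python
def run_model(messages):
    # Stub standing in for a real LLM API call.
    return f"summary of: {messages[-1]['content']}"

def run_subagent(task_prompt):
    """Dispatch a fresh agent with its own empty context window.

    The subagent sees only the focused prompt its parent wrote for it
    (the "self-prompting" idea), does its work in isolation, and returns
    a short result. Its intermediate context is discarded, so the
    parent's window stays clean.
    """
    messages = [{"role": "user", "content": task_prompt}]  # fresh context
    result = run_model(messages)
    return result  # only the final answer flows back to the parent

# The parent agent invokes the subagent like a tool call, crafting a
# focused prompt for the sub-goal (e.g. an Explore-style task):
report = run_subagent("List the modules in this repo that touch authentication.")
```

The key design point is that everything the subagent reads or generates along the way lives only in its own `messages` list; the parent pays the context cost of the returned `report` alone.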