Go back

Using Sub Agents to Save Context Window or Save Time

Oct 17, 2025

Discover context management in Claude Code with sub agents. Understand when to utilize parallel or sequential execution, avoid context bloat, and coordinate multi-agent workflows for complex development workflows.


Sub Agents

After spending countless hours with Claude Code, I've found that context management makes or breaks long development sessions. As your context gets filled up with debugging output, error traces, and implementation iterations, performance degrades. Sub agents help by creating isolated 200k token workspaces for a specific task, returning only what's relevant to your main conversation.

The difference is noticeable: parallel tasks complete much faster, and conversation remains productive far longer.

One of the significant points individuals miss is that you do not need to name or configure sub agents. Just say "do this using sub agents" or "delegate each file to a separate sub agent." Claude Code may launch sub agents on its own now and then, but don't rely on it. It is your job to be aware of your context and to expressly request sub agents when you feel it filling up. Every file you read, every error log, every loop stays in your window.

The Context Problem

A typical feature development accrues early requirements, then multiple implementation attempts, stack traces and debug logs, file reads from dozens of files, and exploratory code that wasn't shipped. By round 25, your context is mostly noise. Claude forgets about key requirements. You end up repeating conversations.

Here's what actually gets accumulated in a single conversation:

Message 1-5: Requirements discussion (2k tokens)
Message 6-10: First implementation attempt (8k tokens)
Message 11-15: Debugging, error logs (15k tokens)
Message 16-20: Reading 20 different files (40k tokens)
Message 21-25: Second implementation attempt (10k tokens)
Message 26-30: More debugging, stack traces (20k tokens)
Message 31-35: Third attempt, different approach (12k tokens)
by message 40: 150k+ tokens, mostly obsolete noise

Claude can't easily refer to those initial requirements anymore—they're buried under implementation rubble.

How Sub Agents Prevent Context Bloat

Sub agents work in their own 200k token contexts. They handle the messy parts like reading files, debugging, analyzing patterns—then return condensed summaries.

Consider debugging a production memory leak. Without sub agents, your context accumulates: Reading application logs from last 24 hours (30k tokens) Analyzing 3 heap dumps at different times (45k tokens) Checking 20 related source files (60k tokens) Git blame and recent commit history (15k tokens) Testing various hypotheses with debug output (25k tokens) Total pollution: 175k tokens forever in your context

With sub agents, you just say: "Use sub agents to investigate this memory leak"

Claude orchestrates the investigation: log-analyzer agent → Reads all logs, returns 2k token summary heap-investigator agent → Analyzes dumps, returns 3k findings code-archaeologist agent → Reviews changes, returns 1k suspect list fix-implementer agent → Tests and fixes, returns final code

Your main context: 6k tokens of pure insights instead of 175k of noise

Each agent penetrates deep in its own domain, passing on only what is required. Your main conversation stays at the decision level, not the data-processing level.

Sequential vs Parallel Execution

Whether to use sequential or parallel execution depends on whether tasks depend on each other.

Sequential Example: TDD Workflow

Your prompt: "Use sequential sub agents for the TDD workflow"

Claude structures:

1. spec-analyst → Writes detailed specifications
2. test-writer → Creates tests from specs (needs step 1 output)
3. implementer → Writes code to pass tests (requires step 2 output)
4. reviewer → Reviews implementation (requires step 3 output)

Time is the sum of all phases—you're trading speed for correct handoffs.

Parallel Example: Testing Multiple Files

Your prompt: "Write unit tests for these 10 modules using parallel sub agents"

Without sub agents (sequential):
module1.test.js (5 minutes) →
module2.test.js (5 minutes) →
module3.test.js (5 minutes) →
.continues one by one.
Total time: 50 minutes

With parallel sub agents:
module1.test.js ┐
module2.test.js ├─ All running in parallel
module3.test.js ├─
.all 10.   ┘
Total time: 5 minutes (just the slowest one)

Another Parallel Example: Codebase-wide Fixes

Your prompt: "Fix all ESLint errors using sub agents in parallel"

Claude spawns multiple agents, each of which runs a batch of files in parallel. What would have taken half an hour of sequential fixes is completed in minutes.

What I've Seen in Practice

Anthropic's research shows significant performance improvements with multi-agent systems. Context editing enormously reduces token usage in long conversations.

Let's consider a real scenario I encountered - porting a React component library to a new design system:

Without sub agents (all in main context):

Reading 30 component files to understand current patterns
Reading design system documentation
Attempting updates on Button component (with errors)
Debugging CSS-in-JS issues
Attempting updates on Card component (more errors)
Reading additional documentation
Fixing Button component again

and so forth for every component.

Result: Context gets blasted with half-ported code, error messages, documentation snippets. By component #10, Claude forgets the design system rules established early on."`

With sub agents:

Prompt: "Use parallel sub agents to port each component to the new design system"

Main context only sees:

  • Original design system rules (left in for reference)
  • Status reports from each agent
  • Final ported components

Each sub agent operates on one component in isolation, reading files, debugging, and iteration without polluting the main thread.

Where Sub Agents Help Most

For code quality at scale, sub agents excel.

Prompt: "Use parallel sub agents to fix linting errors across the codebase" Result: Each agent handles a sub-set of files in parallel, bypassing linear waiting.

Prompt: "Delegate code review to a sub agent with our team's standards"

Result: Independent reviewer consistently applies standards without polluting your main context with review details.

Test workflows benefit enormously.

Prompt: "Generate exhaustive tests for src/components using parallel sub agents"

Result: Each component gets its own sub agent generating tests in parallel.

Prompt: "Use sequential sub agents for test-driven development of the payment feature"

Result: Clean handoffs from spec to test to implementation to review.

Acceleration of development is another sweet spot.

Prompt: "Use parallel sub agents to scaffold the new user dashboard - one for API, one for React components, one for tests" Result: All three things develop in parallel from the same spec.

Prompt: "Investigate this production incident using parallel sub agents for each service's logs" Result: Agents reconstruct the timeline from different services in parallel, findings converge quickly.

The Token Trade-off

Sub agents use far more tokens than regular chat. That being said, you receive much more effective context capacity through parallelization, significant time savings for independent tasks, and better efficiency on tasks that generate numerous intermediate tokens. Tasks that would otherwise swamp your context or benefit from parallel execution are usually worth the cost.

What I've Learned

Be aware of your situation at all times. Every file read, error trace, and loop over code counts. When you hear these patterns, immediately request sub agents:

Warning signs you need sub agents:

  • "Let me read all the test files to understand the pattern" (about to read 20+ files)
  • "I'll check each component for this issue" (sequential checking = parallel opportunity)
  • Multiple debugging attempts with stack traces
  • "Examine logs for the past hour" (tons of data pouring in)
  • Any task that requires dealing with more than 5-10 files

Instructions need to be brief but specific:

  • "Try these 5 files in parallel in sub agents"
  • "Farm out the debugging to a sub agent and just report back the solution"
  • "Run each module through its own sub agent"

Claude understands without complex configuration.

Be proactive about delegation. While Claude Code defaults to using sub agents, don't take it for granted. If you will be reading many files or parsing huge logs, request delegation explicitly before starting.

Consider hard task dependencies carefully. Can tasks be executed simultaneously? Request parallel execution. Do outputs feed into each other? Go serial. The incorrect choice either incurs unnecessary wait time or coordination failures.

When Sub Agents Don't Help

Pure, straightforward tasks don't need the overhead. Tight debugging with constant context is more appropriate in your main thread. Exploratory tasks with no clear objectives need the latitude of direct interaction. When token budget is important, sub agents are expensive.

Getting Started

No special syntax is needed. The following are actual prompts I use:

For debugging:

  • "Analyze these error logs using a sub agent and return only the root cause"
  • "Sub agent, profile the memory, but just give me the bottlenecks"

For parallel work:

  • "Spawn parallel sub agents to insert TypeScript types into every file in src/utils"
  • "Employ sub agents to author integration tests for each API endpoint in parallel"

For sequential workflows:

  • " Employ sub agents to refactor the module of this chapter: analyze dependencies first, then refresh interfaces, then migrate implementations"
  • "Process the database migration with sub agents: current schema analyze, migration generate, changes validate"

Claude tends to know when to use sub agents, but don't assume it always will. Your role is monitoring what's pending in your context, requesting sub agents in advance for computationally heavy tasks, and requesting parallel execution explicitly when you see independent tasks.

Try it next time you need to read in a number of files or debug something complex. Just say "use a sub agent for this" and watch your main context stay clean while the heavy lifting is done elsewhere.

Sub agents cause Claude to move from a single assistant to an orchestra that you can direct. After you're accustomed to them, it's hard to go back to single-threaded sessions that slowly get bogged down under their own context.