Getting Started with Claude Code for Multi-Agent Development
How Do You Move from Single Agent to Multi-Agent Orchestration?
You move from single-agent to multi-agent orchestration by adopting SPOQ, which transforms Claude Code from a one-task-at-a-time tool into a parallel execution platform with structured planning and validation.
Claude Code is a powerful tool for single-task development. Point it at a bug, describe a feature, or ask it to write tests, and it delivers. But what happens when your project has 20 tasks, complex dependencies between them, and you want multiple agents working in parallel?
That’s where SPOQ comes in, turning Claude Code from a single-agent tool into a multi-agent orchestration platform. This guide walks you through the actual workflow: installing SPOQ, letting AI agents audit your codebase and plan the work, reviewing their output, executing in parallel waves, and doing your own QA at the end. The key principle throughout: trust but verify.
What Are the Prerequisites?
You need the Claude Code CLI, an API key or Claude Max subscription, a Git repository, and the SPOQ bootstrap to get started.
- Claude Code CLI - Install from Anthropic’s official documentation. Ensure it’s available in your terminal as claude.
- Anthropic API key or Claude Max subscription - Multi-agent execution consumes tokens across parallel workers, so ensure your account has sufficient capacity.
- A Git repository - SPOQ works best with version-controlled projects where each agent can operate on branches.
- SPOQ bootstrap - Run the initialization script to set up the SPOQ directory structure in your project.
How Do You Audit Your Codebase Before Planning?
Let sub-agents handle the codebase audit so they can explore the project structure, identify patterns, surface technical debt, and map the architecture faster than manual review.
Before planning any work, you need to understand the current state of the codebase. Rather than reading through files yourself, dispatch sub-agents to do the exploration: they map the project structure, spot recurring patterns, and surface technical debt far faster than manual review.
The audit produces a clear picture of what exists, what needs attention, and where the opportunities are. This context is essential for the next step: planning.
How Does AI-Driven Epic Planning Work?
You describe the high-level goal, and the planning agent decomposes it into atomic tasks with explicit dependency declarations, wave numbers, and acceptance criteria.
This is where SPOQ diverges from most tutorials you will find about AI coding agents. You do not sit down and manually write YAML task definitions. Instead, you describe the high-level goal and invoke the epic planning skill:
/epic-planning "Add JWT-based user authentication with login, registration, and protected routes"

The planning agent takes your goal and decomposes it into atomic tasks with explicit dependency declarations. It creates the EPIC.md file with architecture diagrams, success criteria, and a dispatch strategy. It generates individual task YAML files with steps, acceptance criteria, and file paths. It computes the dependency DAG and assigns wave numbers for parallel execution.
The AI handles the decomposition work because it is good at it. Breaking a feature into 10-15 atomic tasks with correct dependency ordering is tedious for humans but natural for a model that can hold the entire codebase context at once.
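SPOQ's actual YAML schema is not reproduced in this guide, but the shape of a generated task is easy to picture. Here is a minimal sketch in Python, with hypothetical field names, of the information each task definition carries: an identifier, a wave number, dependency declarations, target files, steps, and acceptance criteria.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the fields a SPOQ-style task definition might carry.
# Field names are illustrative, not SPOQ's actual schema.
@dataclass
class TaskDefinition:
    task_id: str                 # stable identifier used in dependency declarations
    wave: int                    # wave number assigned by the planner
    depends_on: list = field(default_factory=list)   # task_ids this task waits on
    files: list = field(default_factory=list)        # file paths the task may touch
    steps: list = field(default_factory=list)        # ordered implementation steps
    acceptance_criteria: list = field(default_factory=list)

login_task = TaskDefinition(
    task_id="auth-03-login-endpoint",
    wave=2,
    depends_on=["auth-01-user-model", "auth-02-jwt-utils"],
    files=["src/routes/login.ts"],
    steps=["Add POST /login route", "Issue JWT on valid credentials"],
    acceptance_criteria=[
        "Returns 401 on bad credentials",
        "Returns signed JWT on success",
    ],
)
```

The dependency list is what makes parallelism safe: a task declares exactly which other tasks must finish before it can start, and everything else is fair game to run concurrently.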
Why Should You Validate the Epic Before Execution?
Epic validation catches structural mistakes in the plan before any agent writes code, preventing the most expensive class of errors at the cheapest possible moment.
Once the planning agent finishes, run epic validation before anything else:
/epic-validation @spoq/epics/active/user-auth

This scores the plan across 10 metrics: vision clarity, architecture quality, task decomposition, dependency graph correctness, coverage completeness, phase ordering, scope coherence, success criteria quality, risk identification, and integration strategy. The plan must score an average of 95 or higher with no single metric below 90. If it fails, the issues are flagged with specific remediation steps.
Validation is cheap. It examines only the plan, not generated code. Catching a bad task decomposition here costs a fraction of what it would cost to discover the problem after multiple agents have already executed.
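The gate rule itself is simple enough to state in a few lines. This is a sketch of the pass/fail logic described above (average of 95 or higher, no metric below 90), not SPOQ's own code:

```python
def passes_gate(scores, avg_threshold=95, min_threshold=90):
    """Return True if the metric scores clear both thresholds:
    the average must reach avg_threshold, and no single metric
    may fall below min_threshold."""
    values = list(scores.values())
    average = sum(values) / len(values)
    return average >= avg_threshold and min(values) >= min_threshold

plan_scores = {
    "vision_clarity": 97, "architecture_quality": 96, "task_decomposition": 95,
    "dependency_graph": 98, "coverage": 96, "phase_ordering": 95,
    "scope_coherence": 97, "success_criteria": 96, "risk_identification": 94,
    "integration_strategy": 95,
}
print(passes_gate(plan_scores))  # average 95.9, minimum 94 → True
```

Note that a single weak metric fails the whole plan even when the average is high, which is the point: one badly scoped dependency graph can sink an otherwise excellent epic.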
What Is the Human’s Role During Plan Review?
The human reviews the AI-generated plan to apply domain expertise, validate architecture choices, adjust task scope, and catch issues that automated scoring cannot detect.
This step is yours. Read through the EPIC.md and the task definitions. Does the architecture make sense? Are the tasks scoped correctly? Are the dependencies right? Does anything feel off?
The AI is remarkably good at planning, but you bring domain expertise and judgment that the model lacks. Maybe you know that a particular API has rate limits that need consideration. Maybe you want to prioritize one feature path over another. Maybe you see a task that should be split or two tasks that should be merged. This is where your experience adds the most value.
Make any adjustments you want. Rename tasks, reorder dependencies, add acceptance criteria, remove scope creep. Then re-run validation to confirm the changes hold up.
How Does Parallel Epic Execution Proceed?
The orchestrator reads the task definitions, builds the dependency DAG, computes wave groups, and dispatches parallel Opus worker agents that execute simultaneously within each wave.
When you are satisfied with the plan, clear your context to give the execution agent a fresh start, then launch:
/agent-execution @spoq/epics/active/user-auth

The execution flow works as follows:
- Wave 1 dispatch - All tasks with no dependencies launch in parallel. Each gets its own Claude Code agent instance working simultaneously.
- Build verification - After the wave completes, the build is verified. If it breaks, a Haiku investigator agent triages the failure to identify which specific task caused the problem.
- Code validation - Passing tasks go through code review by a Sonnet reviewer agent, scored across 10 quality metrics. Tasks scoring below threshold are sent back for rework.
- Wave 2 dispatch - Once all Wave 1 tasks pass, the next wave launches. This continues until all waves complete.
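The wave grouping in steps 1 and 4 follows directly from the dependency DAG: a task's wave is one more than the highest wave among its prerequisites, so tasks with no dependencies land in wave 1. A small Python sketch of that idea (not SPOQ's implementation) under the assumption that dependencies are given as a task-to-prerequisites map:

```python
from collections import defaultdict

def compute_waves(deps):
    """Group tasks into waves for parallel dispatch.
    `deps` maps task -> list of prerequisite tasks.
    A task's wave is 1 + the max wave of its prerequisites."""
    waves = {}

    def wave_of(task, seen=()):
        if task in waves:
            return waves[task]
        if task in seen:
            raise ValueError(f"dependency cycle involving {task}")
        prereqs = deps.get(task, [])
        w = 1 if not prereqs else 1 + max(wave_of(p, seen + (task,)) for p in prereqs)
        waves[task] = w
        return w

    for task in deps:
        wave_of(task)
    grouped = defaultdict(list)
    for task, w in sorted(waves.items()):
        grouped[w].append(task)
    return dict(grouped)

# Hypothetical tasks from the user-auth example:
deps = {
    "user-model": [],
    "jwt-utils": [],
    "login-endpoint": ["user-model", "jwt-utils"],
    "protected-routes": ["jwt-utils"],
    "e2e-tests": ["login-endpoint", "protected-routes"],
}
print(compute_waves(deps))
```

Here "user-model" and "jwt-utils" share wave 1 and run in parallel, the two route tasks form wave 2, and the end-to-end tests wait until wave 3.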
During execution, the agents may ask you questions. Stay available. Sometimes they need a judgment call on an ambiguous requirement or a decision about which approach to take. Your input keeps them on track and prevents wasted rework cycles.
How Do You Validate the Completed Work?
Run a final validation pass that scores each task’s output across 10 code quality metrics, confirming the complete epic meets the required thresholds before human QA begins.
After all waves finish, run a final validation pass across the completed epic:
/agent-validation @spoq/epics/active/user-auth

This scores each task’s output across 10 code quality metrics: syntactic correctness, test existence, test pass rate, requirements fidelity, SOLID adherence, security, error handling, scalability, code clarity, and completeness. The threshold requires an average of 95 or higher with no single metric below 80.
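Because this pass runs per task, the useful output is a rework list rather than a single verdict. A sketch of that aggregation under the thresholds just described (average of 95, floor of 80), using hypothetical task names and a trimmed metric set; this is illustrative, not SPOQ code:

```python
def needs_rework(task_scores, avg_threshold=95, min_threshold=80):
    """Return the task_ids whose code-quality scores miss the gate.
    `task_scores` maps task_id -> {metric_name: score}."""
    flagged = []
    for task_id, metrics in task_scores.items():
        values = list(metrics.values())
        average = sum(values) / len(values)
        if average < avg_threshold or min(values) < min_threshold:
            flagged.append(task_id)
    return flagged

results = {
    "auth-01-user-model": {"test_pass_rate": 100, "security": 96, "clarity": 95},
    "auth-02-jwt-utils":  {"test_pass_rate": 92,  "security": 79, "clarity": 98},
}
print(needs_rework(results))  # → ['auth-02-jwt-utils']
```

In this example the second task fails on both counts: its average (89.7) is under 95 and its security score (79) is under the floor of 80, so it alone goes back for rework.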
Why Is Human QA the Most Important Step?
Human QA catches the subtle visual, experiential, and contextual issues that automated validation misses, bridging the gap between technically correct and production-ready.
This is the most important step, and the one that separates reliable multi-agent development from a dice roll. The AI gets close. Often remarkably close. But “close” is not “done.”
For UI changes, pull up the app and click through every flow yourself. Check responsive breakpoints. Try dark mode. Test edge cases the agents might not have considered. For API changes, hit the endpoints manually. For data model changes, verify the migrations run cleanly against a real database.
The validation gates catch most structural and functional issues automatically. What they miss are the subtle things: a button that is technically correct but feels wrong at that screen size, a loading state that flickers in an odd way, a toast notification that appears behind a modal. These are the details that separate good software from great software, and they still require human eyes.
Trust the AI to get the implementation right. Verify it yourself before shipping.
What Does the Complete Workflow Look Like?
The full workflow alternates between AI execution and human review across eight distinct steps, from initial setup through final verification.
1. Install SPOQ → Set up directory structure
2. Audit codebase → Sub-agents explore and map the project
3. /epic-planning → AI decomposes your goal into tasks
4. /epic-validation → Automated 10-metric quality gate
5. Human review → You review and adjust the plan
6. /agent-execution → Parallel waves of Opus workers
7. /agent-validation → Automated code quality scoring
8. Human QA → You verify the results yourself

Notice the pattern: AI does the heavy lifting at every stage, but humans have explicit review checkpoints at steps 5 and 8. You are not writing the task definitions. You are not writing the code. You are reviewing the plan, making judgment calls, and verifying the output. The AI handles implementation. You handle quality assurance and strategic direction.
What Tips Lead to Better Results?
Five practices consistently produce better outcomes: clearing context between phases, staying available during execution, reviewing plans carefully, testing UI visually, and starting with small epics.
- Clear context between phases - Give each phase (planning, execution, validation) a fresh context window. Stale context from planning can confuse the execution agent.
- Stay available during execution - Agents work fast but occasionally need human input. A five-minute response from you can save a twenty-minute rework cycle.
- Review the plan seriously - The temptation is to glance at the EPIC.md and immediately run execution. Resist it. Ten minutes of careful plan review prevents hours of rework.
- QA UI changes thoroughly - This is where AI struggles most. Agents can build correct components that look wrong in context. Always check visual output yourself.
- Start small - Your first epic should have 5-10 tasks, not 50. Learn the workflow, build intuition for what makes a good task definition, and scale up from there.
What Changes About Your Role as a Developer?
Your role shifts from writing code to reviewing AI-generated plans, making strategic decisions, and verifying final output, leveraging what humans and AI are each best at.
The shift from single-agent to multi-agent development changes your role. You stop being the person who describes what to build and watches one agent work through it. Instead, you become the reviewer and quality gatekeeper: evaluating AI-generated plans, making strategic decisions the model cannot make, and verifying the final output meets your standards.
The AI handles implementation and task decomposition. You handle judgment and verification. This division of labor leverages what humans and AI are each best at, and it produces better results than either could achieve alone.
For detailed setup instructions, advanced configuration options, and the full research behind this approach, visit spoqpaper.com.
Related Posts
- Why Quality Gates Matter in Multi-Agent AI Development
- Wave-Based vs Sequential AI Agent Execution: When Parallelism Pays Off
Interested in multi-agent AI architecture? Schedule a conversation to discuss how these patterns can accelerate your team.