
SPOQ Maps: Scaling Multi-Agent Orchestration Beyond Single Epics

by Royce Carbowitz
AI Engineering
SPOQ
Multi-Agent AI
Open Source

SPOQ epics work well for focused initiatives. You define a goal, decompose it into atomic tasks, declare dependencies, and let agents execute in parallel waves. For most features, that workflow handles everything. But some programs are bigger than a single epic. When a project spans multiple subsystems and the task count climbs past twenty, a single dependency graph becomes unwieldy. The tasks start serving different architectural concerns, the validation criteria fragment across unrelated domains, and the epic loses the coherence that makes it useful.

That realization led to maps, a new first-class construct in SPOQ that applies the same wave-based topological dispatch pattern at the program level. Instead of coordinating tasks within one epic, maps coordinate entire epics within a program. Each epic retains its own task graph, its own validation gates, and its own execution history. The map layer handles the question of which epics can run in parallel and which must wait for others to complete.

This feature shipped alongside something I have been working toward for months: SPOQ is now available on PyPI. You can install it globally and use it both as an MCP server for Claude Code and Cursor integration and as a standalone CLI for templating out SPOQ projects, computing waves, validating structures, and managing maps. One package, two interfaces, no additional infrastructure required.

Why Do Epics Alone Break Down at Program Scale?

Epics break down when a project spans multiple subsystems because cramming unrelated architectural concerns into a single dependency graph produces a monolithic plan that is difficult to validate, difficult to execute, and difficult to reason about. The problem is not that epics are poorly designed. The problem is that they are designed for a specific scope, and exceeding that scope undermines the properties that make them effective.

I ran into this firsthand while building Pinpoint. The platform has a Spring Boot API, a Next.js dashboard, a Rust CLI, a Rust Lambda render engine, and an MCP server for AI integration. Early on, I tried to plan a cross-cutting initiative as a single epic with tasks spanning all five subsystems. The dependency graph became a tangled web where API schema changes blocked CLI updates, which blocked MCP tool definitions, which blocked dashboard components. The graph was technically correct, but nobody could look at it and understand the execution order at a glance. Worse, a single failed validation in one subsystem blocked progress on subsystems that had no actual dependency on the failing work.

The natural solution was to split the work into separate epics, one per subsystem. But that created a coordination gap. The epics had inter-dependencies that needed explicit management. The API epic had to complete certain endpoints before the CLI epic could implement the corresponding commands. The render engine epic needed updated data models from the API before it could generate the correct report formats. Without a formal layer to express these relationships, I was managing them manually, which defeated the purpose of automated orchestration.

Maps fill that gap. They provide a structured way to declare inter-epic dependencies and compute execution waves at the epic level, just as SPOQ already computes waves at the task level within each epic. The hierarchy is clean: maps contain epics, epics contain tasks. Wave-based dispatch applies at both levels. Quality gates apply at both levels. The same principles that make individual epics work now apply to programs composed of multiple epics.
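
In code terms, the hierarchy is just two levels of the same shape. Here is a minimal illustrative model of that containment, with class and field names that are mine, not SPOQ's actual internals:

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    """An atomic unit of work inside an epic (illustrative model)."""
    task_id: str
    depends_on: list[str] = field(default_factory=list)  # other task IDs

@dataclass
class Epic:
    """A self-contained task graph with its own validation gates."""
    slug: str
    tasks: list[Task] = field(default_factory=list)
    depends_on: list[str] = field(default_factory=list)  # other epic slugs

@dataclass
class Map:
    """Program-level coordinator: a DAG over epics rather than tasks."""
    name: str
    epics: list[Epic] = field(default_factory=list)
```

The symmetry is the point: an `Epic` relates to its `Task`s exactly as a `Map` relates to its `Epic`s, which is why the same dispatch machinery applies at both levels.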

What Is the Structure of a SPOQ Map?

A SPOQ map is defined by a single MAP.md file that lives in a dedicated directory and contains eight sections: vision, program structure, epics, epic dependencies, dispatch strategy, success criteria, estimated effort, and risk assessment. The format mirrors the EPIC.md structure that SPOQ already uses for individual epics, scaled up to the program level.
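
To make the shape concrete, here is a hedged sketch of a minimal MAP.md. The headings follow the eight sections listed above, but the slugs, field syntax, and entry layout are illustrative rather than canonical:

```markdown
# MAP: bug-lifecycle-automation

## Vision
Enable end-to-end automated bug lifecycle management from dispatch
through report generation.

## Program Structure
Two epics spanning the API and CLI subsystems.

## Epics
- slug: api-endpoints | status: pending | est: 12h | deps: none
  Summary: expose the dispatch and report endpoints.
- slug: cli-commands | status: pending | est: 8h | deps: api-endpoints
  Summary: implement the matching CLI commands.

## Epic Dependencies
api-endpoints -> cli-commands

## Dispatch Strategy
Wave 0: api-endpoints. Wave 1: cli-commands.

## Success Criteria
- The CLI dispatches a request through the API and receives a report.

## Estimated Effort
~20 hours across all epics.

## Risk Assessment
Schema drift between the API and CLI is the main coupling risk.
```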

The vision section captures the program-level objective in two to five sentences. This is not a restatement of individual epic goals. It describes the overarching outcome that the combined work across all epics will produce. For the Pinpoint example, the vision might be “Enable end-to-end automated bug lifecycle management from CI/CD dispatch through report generation, with AI agents capable of autonomous triage across all platform interfaces.” No individual epic achieves that alone. The program does.

The epics section lists each participating epic with its slug, current status, estimated hours, dependencies, and a brief summary. Each epic entry references a real SPOQ epic that has its own EPIC.md and task YAML files. The map does not duplicate their definitions. It points to them and declares the relationships between them. This referential approach means each epic remains a self-contained, independently executable unit. The map adds coordination without absorbing ownership.

The epic dependencies section declares inter-epic relationships as a directed acyclic graph, using the same notation SPOQ uses for task dependencies within epics. An epic with no dependencies lands in Wave 0 and can execute immediately. Epics that depend on Wave 0 epics form Wave 1. The topological sort is identical to the one SPOQ applies at the task level, just operating on epic slugs instead of task IDs. This consistency is deliberate. If you understand how SPOQ dispatches tasks in waves, you already understand how maps dispatch epics. No new concepts required.
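
The layering logic is small enough to sketch in full. This is a generic longest-path wave assignment over an acyclic dependency graph, not SPOQ's actual source; the epic slugs are invented, and the sketch assumes every epic appears as a key in the dependency dict:

```python
def compute_waves(deps: dict[str, list[str]]) -> list[list[str]]:
    """Assign each node to the earliest wave in which all of its
    dependencies are already complete (longest-path layering).
    Assumes an acyclic graph with every node present as a key."""
    wave_of: dict[str, int] = {}

    def wave(node: str) -> int:
        if node not in wave_of:
            preds = deps.get(node, [])
            wave_of[node] = 0 if not preds else 1 + max(wave(p) for p in preds)
        return wave_of[node]

    for node in deps:
        wave(node)
    n_waves = max(wave_of.values()) + 1 if wave_of else 0
    waves: list[list[str]] = [[] for _ in range(n_waves)]
    for node, w in sorted(wave_of.items()):
        waves[w].append(node)
    return waves

# Epics with no dependencies land in Wave 0; dependents follow.
epic_deps = {
    "api-core": [],
    "dashboard": ["api-core"],
    "cli": ["api-core"],
    "integration": ["dashboard", "cli"],
}
print(compute_waves(epic_deps))
# -> [['api-core'], ['cli', 'dashboard'], ['integration']]
```

Swap epic slugs for task IDs and the same function computes task waves, which is exactly the consistency the text describes.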

The success criteria section is where maps provide something that independent epics cannot: program-level acceptance criteria that span epic boundaries. Individual epics have their own success criteria focused on their subsystem. The map adds criteria that verify integration across subsystems. Can the CLI successfully dispatch a test request through the API and receive a formatted report from the render engine? That end-to-end behavior cannot be validated within any single epic. Maps make it a first-class concern.

How Does Wave-Based Dispatch Scale from Tasks to Epics?

The same topological sort algorithm that computes task waves within an epic computes epic waves within a map, which means the parallelism gains that SPOQ achieves at the task level compound at the program level. A program with four epics where two are independent can execute those two in parallel, and each of those epics internally parallelizes its own tasks across waves. The speedup multiplies.

The implementation uses the same dependency graph analysis: cycle detection via depth-first search, wave assignment via topological sort, and critical path computation via dynamic programming on the longest chain. The functions are named differently to avoid confusion (compute_epic_waves versus compute_waves), but the logic is identical. I made this choice deliberately because the mathematical properties that make wave-based dispatch correct at the task level hold equally at the epic level. Dependencies are dependencies regardless of whether they connect tasks or epics. Parallelism is parallelism regardless of the unit of work.

The metrics are also analogous. At the task level, SPOQ reports parallelism factor (average wave width), critical path length, and estimated speedup over sequential execution. At the epic level, maps report the same metrics for epic waves. A program with ten epics spread across four waves has a parallelism factor of 2.5 (ten epics divided by four waves), meaning that even though there are sequential dependencies, the program completes in significantly less wall-clock time than running all ten epics one after another.
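
As a worked example of that arithmetic, here is a sketch that treats every epic as unit-cost, which is the simplification the next paragraph refines:

```python
def parallelism_metrics(waves: list[list[str]]) -> dict[str, float]:
    """Parallelism factor = average wave width; speedup vs. sequential
    execution under the simplifying assumption that every epic takes
    equal time. Assumes at least one wave."""
    total = sum(len(w) for w in waves)
    return {
        "parallelism_factor": total / len(waves),  # average wave width
        "critical_path_len": float(len(waves)),    # waves run sequentially
        "est_speedup": total / len(waves),         # sequential / parallel time
    }

# Ten unit-cost epics spread over four waves -> a factor of 2.5.
waves = [["a", "b", "c", "d"], ["e", "f", "g"], ["h", "i"], ["j"]]
print(parallelism_metrics(waves)["parallelism_factor"])  # -> 2.5
```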

One subtlety at the epic level is that the execution time per wave is not uniform. Task waves within a single epic tend to have similar durations because tasks are sized to the 1-4 hour range. But epics can vary dramatically in scope. A 5-hour epic and a 40-hour epic landing in the same wave means the wave takes 40 hours regardless. The critical path analysis accounts for this by using estimated hours rather than epic count, producing a more realistic total program duration estimate. This is why the MAP.md format requires estimated hours per epic. The data feeds directly into dispatch optimization.
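
That hours-weighted computation is a straightforward dynamic program over the DAG. A sketch with invented epic names and hour estimates, not SPOQ's source:

```python
from functools import lru_cache

def critical_path_hours(deps: dict[str, list[str]],
                        hours: dict[str, float]) -> float:
    """Longest chain of estimated hours through the dependency DAG.
    Assumes an acyclic graph with every epic present in both dicts."""
    @lru_cache(maxsize=None)
    def finish(epic: str) -> float:
        preds = deps.get(epic, [])
        return hours[epic] + (max(finish(p) for p in preds) if preds else 0.0)
    return max(finish(e) for e in deps)

deps = {"api": [], "render": ["api"], "cli": ["api"]}
hours = {"api": 8.0, "render": 40.0, "cli": 5.0}
# The 40-hour render epic dominates its wave: 8 + 40 = 48 hours
# on the critical path, even though the cli epic finishes at hour 13.
print(critical_path_hours(deps, hours))  # -> 48.0
```

Counting epics instead of hours would report a critical path of length two here; weighting by hours is what makes the duration estimate realistic.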

What Does the SPOQ PyPI Package Provide?

The SPOQ package on PyPI installs two entry points: a spoq CLI for command-line project management and a spoq-mcp server for AI coding assistant integration. Both interfaces share the same underlying library, which means every capability available to AI agents through MCP is equally available to human operators through the terminal. One install, two ways to interact, and no additional infrastructure beyond Python 3.12 and pip.

The CLI is designed to be installed globally so it works across all your projects. Running spoq template epic "feature name" 8 generates a complete EPIC.md skeleton with eight tasks, dependency placeholders, and the standard section structure. Running spoq template map "program name" 4 generates a MAP.md skeleton with four epic entries, auto-wired dependencies based on wave position, and placeholders for every required section. The templating distributes epics across waves with roughly 30 percent in Wave 0, 40 percent in Wave 1, and the remainder in subsequent waves. Later-wave epics automatically depend on earlier-wave epics to produce a realistic starting DAG that you refine based on your actual architecture.
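
That distribution can be sketched as follows. The 30/40/remainder split comes from the description above; the rounding behavior and function name are my own assumptions, not the CLI's implementation:

```python
def distribute_epics(n: int) -> list[int]:
    """Assign n skeleton epics to waves: roughly 30% in Wave 0,
    40% in Wave 1, and the remainder in Wave 2. The exact rounding
    here is an illustrative assumption."""
    wave0 = max(1, round(n * 0.3))
    wave1 = max(1, round(n * 0.4))
    waves = []
    for i in range(n):
        if i < wave0:
            waves.append(0)
        elif i < wave0 + wave1:
            waves.append(1)
        else:
            waves.append(2)
    return waves

# A four-epic skeleton: one in Wave 0, two in Wave 1, one in Wave 2.
print(distribute_epics(4))  # -> [0, 1, 1, 2]
```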

The MCP server exposes the full SPOQ toolset to AI agents through the Model Context Protocol. This includes eight map-specific tools: parsing, validation, wave computation, DAG analysis, effort estimation, status tracking, listing, and skeleton generation. Combined with the existing epic and task tools, the MCP server gives AI agents like Claude Code full visibility into your SPOQ project structure. An agent can parse a map, compute which epics are ready for execution, check the status of in-progress epics, and even generate new map skeletons when the scope of a conversation warrants program-level coordination.

Making the package available on PyPI was a priority because SPOQ requires zero custom infrastructure. The methodology layers on top of existing AI coding assistants like Claude Code and Gemini CLI. The MCP server connects to those tools through their native extension mechanisms. The CLI handles everything else. There is no hosted service, no database, no cloud account required. You install the package, configure your AI coding assistant to use the MCP server, and the entire SPOQ workflow is available. This keeps the barrier to adoption as low as possible, which matters for a methodology that is still establishing itself in the broader engineering community.

When Should You Use a Map Instead of a Single Epic?

Use a single epic when the total scope is under twenty tasks, all tasks share one architectural area, the dependency graph forms a single connected component, and one validation pass can cover the entire deliverable. Use a map when the scope exceeds twenty tasks spanning distinct subsystems, the work streams can execute independently within their own waves, each subsystem needs its own architecture documentation and success criteria, and program-level coordination is needed across epic boundaries.

The twenty-task threshold is not arbitrary. It comes from practical experience across the twelve completed SPOQ epics in production use. Epics under twenty tasks maintain a dependency graph that fits comfortably in a single visualization. The planning validation can assess the full graph coherently. Agents executing tasks within the epic can hold the relevant context without exceeding their effective working memory. Once you cross twenty tasks, these properties start degrading. The graph becomes harder to reason about visually. Planning validation scores drop because the epic tries to cover too many unrelated concerns. Agent context windows fill with information about subsystems unrelated to their current task.

The subsystem test is the stronger signal. If you find yourself writing tasks that touch entirely different codebases, different deployment targets, or different technology stacks, those tasks belong in separate epics coordinated by a map. At Pinpoint, the Rust CLI, the Java API, and the TypeScript dashboard are separate codebases with separate build systems and separate deployment pipelines. Grouping their tasks in one epic creates false coupling. Splitting them into separate epics coordinated by a map preserves the natural boundaries while still managing the inter-dependencies between them.

The epic-planning skill in SPOQ now includes a scope escalation check. When task decomposition produces more than twenty tasks spanning multiple subsystems, the skill recommends switching to a map-based approach and can generate the skeleton automatically. This catches scope bloat at planning time rather than discovering it during execution when the cost of restructuring is much higher.
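
The escalation check itself reduces to a simple predicate. A sketch with an invented function name, capturing the two conditions described above:

```python
def should_escalate_to_map(task_count: int, subsystems: set[str]) -> bool:
    """Scope escalation heuristic: recommend a map when decomposition
    crosses twenty tasks AND the tasks span multiple subsystems.
    (Name and signature are illustrative, not SPOQ's API.)"""
    return task_count > 20 and len(subsystems) > 1

print(should_escalate_to_map(25, {"api", "cli"}))  # -> True
print(should_escalate_to_map(25, {"api"}))         # -> False
```

Note that both conditions are required: a twenty-five-task epic confined to one subsystem may just need tighter decomposition, not a map.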

What Does the Validation and Testing Infrastructure Look Like?

Maps use the same validation-first philosophy that governs the rest of SPOQ. Structural validation checks that the MAP.md contains all required sections, that every epic entry has the mandatory fields, and that success criteria span epic boundaries rather than duplicating individual epic criteria. Dependency validation confirms that all referenced epic slugs actually exist and that the dependency graph is acyclic. These checks run before any execution begins, catching structural problems when they are cheapest to fix.
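
The required-sections check can be sketched as a simple scan. The real parser is presumably richer; this substring matching is deliberately naive and the section list is taken from the format described above:

```python
REQUIRED_SECTIONS = [
    "vision", "program structure", "epics", "epic dependencies",
    "dispatch strategy", "success criteria", "estimated effort",
    "risk assessment",
]

def missing_sections(map_md: str) -> list[str]:
    """Report required MAP.md sections absent from the document.
    (Naive lowercase substring match, for illustration only.)"""
    text = map_md.lower()
    return [s for s in REQUIRED_SECTIONS if s not in text]

doc = "## Vision\nShip it.\n## Epics\n## Epic Dependencies\n"
print(missing_sections(doc))
```

Running checks like this before execution is the cheap half of validation; the dependency checks in the next paragraph are the other half.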

The cycle detection algorithm is the same depth-first search used for task dependency validation. An inter-epic cycle would mean that Epic A depends on Epic B which depends on Epic A, creating a deadlock where neither can start. This should never happen in a well-designed program, but catching it statically is essential because the failure mode at runtime would be a silent hang where the orchestrator waits indefinitely for a dependency that can never be satisfied.
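
The check is the textbook three-color depth-first search. A generic sketch, not SPOQ's source:

```python
def has_cycle(deps: dict[str, list[str]]) -> bool:
    """Detect a dependency cycle with three-color DFS:
    WHITE = unvisited, GRAY = on the current path, BLACK = done."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {node: WHITE for node in deps}

    def visit(node: str) -> bool:
        color[node] = GRAY
        for dep in deps.get(node, []):
            if color.get(dep, WHITE) == GRAY:
                return True  # back edge onto the current path: cycle
            if color.get(dep, WHITE) == WHITE and visit(dep):
                return True
        color[node] = BLACK
        return False

    return any(visit(n) for n in deps if color[n] == WHITE)

print(has_cycle({"a": ["b"], "b": ["a"]}))  # -> True
print(has_cycle({"a": [], "b": ["a"]}))     # -> False
```

A GRAY node reached again means the current path loops back on itself, which is exactly the Epic A / Epic B deadlock described above.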

The testing infrastructure for maps includes twenty-five unit tests organized across eight test classes. The test fixtures model a three-epic linear pipeline where an alpha service feeds a beta API which feeds gamma integration tests. Each fixture epic has its own EPIC.md and task YAML files for realistic resolution testing. The tests cover parsing, validation, wave computation, DAG analysis, effort estimation, status rollup, directory scanning, and skeleton generation. Every map feature shipped with test coverage on day one, consistent with the SPOQ principle that work is not delivered unless tests verify the functionality actually works.

Running the full test suite after adding maps confirmed zero regressions across the existing 149 tests. The 25 new map tests bring the total to 174. This matters because maps touch the core library functions for parsing and validation, so any unintended side effects would surface as failures in the existing epic and task tests. Clean test runs across the full suite give me confidence that maps integrate cleanly without destabilizing the foundation they build on.

What Comes Next for SPOQ Maps?

Maps are production-ready as of this week, but they represent the beginning of program-level orchestration rather than the end. The immediate next step is dogfooding maps on a real multi-epic initiative to collect the same kind of structured benchmark data that validated task-level wave dispatch. I want parallelism metrics, rework rates, and speedup factors at the epic level with the same rigor that the SPOQ paper documents at the task level. That data will either confirm that the theoretical benefits hold in practice or reveal adjustments needed for real-world program coordination.

The PyPI distribution also opens possibilities for community adoption. SPOQ has been a methodology I developed and used primarily on my own projects. Making it installable with a single pip command removes the friction of manual setup and lets other engineers try the workflow on their own terms. The MCP server means that anyone using Claude Code or a compatible AI coding assistant can immediately start using SPOQ tools without modifying their existing setup. The CLI means they can also interact with SPOQ structures directly from their terminal, independent of any AI tool. Both paths are fully functional from the same package.

If you are working on a project that has outgrown a single planning document, or if you are coordinating multiple work streams that have real dependencies between them, maps might be exactly what you need. The methodology is documented at spoqpaper.com, and the package installs in seconds from PyPI. I built SPOQ because I needed it. Maps exist because epics alone were not enough. If your projects are hitting the same ceiling, the tooling is ready.


Interested in scaling multi-agent orchestration across your engineering programs? Schedule a conversation to discuss how SPOQ maps can coordinate your team’s work across subsystems.
