Multi-Agent Coordination System — Case Study

A coordination layer that enables parallel AI workers with shared memory, task delegation, and conflict detection — reducing complex engineering sessions from 6+ hours to 2 hours.

[Figure: multi-agent coordination system visualization]

The Problem

AI coding assistants are single-threaded by design. While Claude Desktop researches one approach, it can't implement another in parallel. When it hits a resource limit, the accumulated context — file reads, architectural decisions, debugging state — is lost. You start over.

For complex tasks that require both research and implementation, this single-threaded constraint means constant context-switching: research a pattern, stop, switch to implementation, lose the research context, switch back. A task that should take 2 hours stretches to 6+ because the cognitive overhead of rebuilding context dominates the actual work. The failure patterns are consistent: context degradation as windows fill, spec drift when agents lose alignment, and silent failures when handoffs are dropped.

The Approach

The solution is architectural: separate orchestration from execution, give agents shared memory, and build coordination primitives that prevent collisions.

Orchestration Layer

A lead agent (Claude Desktop or Claude Code) maintains the high-level plan. It decomposes tasks, assigns them to execution agents, and monitors completion. It doesn't do the work — it coordinates it.

Execution Agents

Claude Code sessions running in tmux panes handle implementation. Each agent works in isolation on its assigned task. When it completes, it writes results to shared memory and signals the orchestrator.
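A sketch of the pane plumbing. The tmux subcommands (`new-session`, `split-window`, `send-keys`) are real; the `agent_cmd` wrapper that actually launches a Claude Code session is a hypothetical placeholder.

```python
import subprocess


def send_keys_cmd(target: str, text: str) -> list[str]:
    """Build the tmux argv that types `text` into pane `target`
    (e.g. "agents.1") and presses Enter."""
    return ["tmux", "send-keys", "-t", target, text, "Enter"]


def spawn_agent(session: str, agent_cmd: str) -> None:
    """Open a new pane in `session` running `agent_cmd`
    (hypothetical wrapper around a Claude Code session)."""
    # No-op if the session already exists (tmux exits nonzero then).
    subprocess.run(["tmux", "new-session", "-d", "-s", session],
                   check=False)
    subprocess.run(["tmux", "split-window", "-t", session, agent_cmd],
                   check=True)
```

The orchestrator can use the same `send-keys` mechanism to inject follow-up instructions into a running agent's pane without a human in the loop.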

Shared Memory & Coordination

SQLite coordination databases implement the context architecture — a shared, queryable record of task state, file locks, and session decisions that every agent can read and write. MCP servers provide structured tool access. Cross-session prompt injection enables the orchestrator to send instructions to running agents without manual intervention.

Model Routing

Not every task needs the same model. Mechanical implementation tasks route to Haiku (fast, cheap). Architectural decisions route to Opus (expensive, correct). This is cost and token economics in practice — matching model capability to task complexity reduces cost 40-60% while maintaining quality where it matters.
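The routing logic itself can be very small. A sketch, assuming tasks carry a 0-1 complexity score — the thresholds and score are illustrative, and the model identifiers stand in for whatever Haiku/Sonnet/Opus versions are current.

```python
# (threshold, model) pairs, cheapest first; a task routes to the
# first model whose threshold covers its complexity score.
ROUTES = [
    (0.3, "claude-haiku"),   # mechanical edits, renames, boilerplate
    (0.7, "claude-sonnet"),  # standard implementation work
    (1.0, "claude-opus"),    # architectural and design decisions
]


def route(complexity: float) -> str:
    """Map a 0-1 task-complexity score to the cheapest adequate model."""
    for threshold, model in ROUTES:
        if complexity <= threshold:
            return model
    return ROUTES[-1][1]  # clamp out-of-range scores to the top model
```

The savings come from the shape of the distribution: most tasks in a long session are mechanical, so most calls land on the cheapest tier.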

Key Insight: Cowboy Coordination

The most surprising finding was that minimal coordination often outperforms heavy coordination. When agents work on truly independent tasks with clear boundaries, the overhead of sophisticated synchronization protocols exceeds the cost of occasional conflicts.

"Cowboy coordination" — where agents work independently and resolve conflicts after the fact — works well when tasks are naturally serialized by resource constraints. The system evolved from heavy-weight task management toward lightweight conflict detection with post-hoc resolution. Failure pattern recognition — knowing when coordination overhead exceeds its value, when context is degrading, when cowboy approaches are creating more conflicts than they prevent — turned out to be more important than protocol selection.
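Post-hoc conflict detection can be as simple as diffing write sets. A minimal sketch under the assumption that each agent reports the files it modified — the function and its inputs are illustrative, not the system's actual tooling.

```python
def detect_conflicts(touched: dict[str, set[str]]) -> set[str]:
    """Return files written by more than one agent this session.

    `touched` maps agent id -> set of file paths it modified."""
    seen: dict[str, str] = {}      # path -> first agent that wrote it
    conflicts: set[str] = set()
    for agent, paths in touched.items():
        for path in paths:
            if path in seen and seen[path] != agent:
                conflicts.add(path)
            seen[path] = agent
    return conflicts
```

An empty result confirms the cowboy run was actually conflict-free; a non-empty one names exactly the files that need manual reconciliation.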

Results

3x: Productivity gain on complex debugging sessions
6h → 2h: Documented session time reduction
40-60%: Cost reduction via model routing (Haiku for mechanical tasks)
82+: Custom hooks, scripts, and coordination tools built

The coordination system runs daily across all development work. Parallel agents handle implementation while the orchestrator maintains architectural coherence. The model routing system prevents quota exhaustion by matching task complexity to model capability. The entire infrastructure is self-documenting — every session captures decisions, patterns, and failures for future context recovery.

Technologies

Claude API
MCP Servers
Python
TypeScript
SQLite
tmux
Bash
iTerm2
Haiku/Sonnet/Opus

Deep Dives