Systems Design
The Stack
The LLM is a cloud of possibilities. I build systems that harness it and focus it on specific, reliable outcomes.
A model left to itself is non-deterministic — the same prompt returns different results every time. That non-determinism is also the foundation of a legal argument: if the model cannot reliably produce a specific output, it cannot claim ownership of one. Value requires choices. The choices are mine.
Every layer below represents a deliberate design decision — not configuration, not tutorial-following. Context architecture, model routing, failure pattern recognition: the engineering disciplines that make agentic AI systems reliable. The model is one commodity input among many.
The Architecture
┌───────────────────────────────────────────┐
│ HUMAN INTERFACE LAYER │
│ iTerm theme → cascades into CC output │
│ WCAG contrast · session titling · │
│ capture-pane inter-session visibility │
├───────────────────────────────────────────┤
│ SESSION ORCHESTRATION │
│ tmux · fork/spawn · inter-session state │
├───────────────────────────────────────────┤
│ EVENT / HOOK LAYER │
│ 82 hooks · lifecycle management · │
│ automated context capture · triggers │
├───────────────────────────────────────────┤
│ COORDINATION LAYER │
│ coordination.db · work_items · │
│ multi-agent state · async job queues │
├───────────────────────────────────────────┤
│ MODEL ROUTING │
│ Haiku / Sonnet / Opus · │
│ token economics · cost-per-outcome │
├───────────────────────────────────────────┤
│ MODELS (commodity input) │
│ Claude · Gemini · GPT │
└───────────────────────────────────────────┘
Default Claude Code has none of this. The delta between raw model output and output produced inside this stack is entirely attributable to the architecture — not the model.
The Human Interface Layer
Most infrastructure builders skip directly to the model layer. The human operator is also a system component — with latency, bandwidth, and error rate — and that component needs optimization too.
Visual Design System
Custom iTerm themes built to WCAG contrast standards cascade through to Claude Code's output rendering. Readability is a performance variable — higher signal, lower cognitive load, faster comprehension.
Session Awareness
Window titling and tmux capture-pane give the human operator spatial orientation across parallel sessions. Context-switching cost drops when you can see what each session is doing without interrupting it.
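A minimal sketch of the non-interrupting peek described above. The session names and line count here are hypothetical; the `tmux capture-pane` flags are standard:

```python
import subprocess

def peek(session: str, lines: int = 20) -> list[str]:
    """Build the tmux command that snapshots a session's pane
    without attaching to it -- read-only, non-interrupting."""
    return [
        "tmux", "capture-pane",
        "-t", session,      # target session/pane
        "-p",               # print to stdout instead of a paste buffer
        "-S", f"-{lines}",  # start N lines back in scrollback
    ]

# Glance at each parallel session without switching into any of them.
for session in ("research", "refactor", "deploy"):
    cmd = peek(session)
    # subprocess.run(cmd, capture_output=True, text=True)  # needs a live tmux server
```

Because `capture-pane -p` only reads the pane, the observed session never knows it was observed, which is the whole point.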
Coherent Environment
These aren't independent tweaks. The theme cascades. The session names inform the hook system. The visual layer reinforces the coordination layer. The pieces were designed to work together.
The Design Philosophy
"I aligned the tools to work together to suit me." That alignment is what distinguishes a system from an accumulation of configurations.
Why This Protects Your Work
As model vendors assert IP claims over AI-generated output, companies using raw model APIs are exposed. The vendor's argument is simple: the model produced the value, so the vendor deserves a share.
Work produced through this stack is defensible by design. The model is one commodity input among many — the equivalent of electricity in a factory. The routing logic, context management, multi-agent coordination, and verification loops are deliberate engineering decisions, traceable in git history, attributable to human authorship at every layer.
The delta between "GPT out of the box" and "GPT inside this infrastructure" is measurable. That delta is where the value lives. That delta is mine — and when I build for you, it protects yours.
Built for Auditability
Government and enterprise procurement teams ask the same question in different vocabulary: can you show your work? This stack was designed to be understood, not just to function. That design choice turns out to satisfy a lot of compliance requirements without being built to satisfy them.
Human Interface Layer → Operator Accountability
Every decision point is visible to a human operator. Session titling, capture-pane visibility, and WCAG-compliant rendering aren't cosmetic — they ensure the operator can see what the system is doing before outcomes ship. NIST SP 800-218 calls this "operator transparency." This layer is where that obligation is fulfilled.
Session Orchestration (tmux) → Reproducible Audit Trail
tmux sessions can be captured, replayed, and inspected. Every agent session is a named, documented unit of work — not an anonymous process. When EO 14028 asks for attestation of software development practices, the session history is the attestation. The log exists. It's structured. It's reviewable.
Event / Hook Layer (82 hooks) → SBOM-Ready Component Enumeration
Every system event has a named hook. Every hook is a documented component with known inputs and outputs. This is SBOM (Software Bill of Materials) compliance by construction — the component list isn't generated after the fact, it's baked into how the system is built. EO 14028 mandates software supply chain transparency; this layer makes that transparent by design.
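A sketch of what "component enumeration by construction" can look like. The directory layout and the `<event>__<action>.sh` naming convention here are assumptions for illustration, not the stack's actual scheme:

```python
from pathlib import Path

def enumerate_hooks(hooks_dir: Path) -> list[dict]:
    """Walk a hooks directory and emit one SBOM-style component
    record per hook: name, triggering event, and source path."""
    components = []
    for hook in sorted(hooks_dir.glob("*.sh")):
        # Assumed convention: files are named <event>__<action>.sh
        event, _, action = hook.stem.partition("__")
        components.append({
            "name": hook.stem,
            "event": event,
            "source": str(hook),
        })
    return components
```

Because the component list is derived from the hooks themselves, it can never drift out of sync with the system it describes.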
Coordination Layer (SQLite) → Tamper-Evident State, No Lock-In
The coordination database is the context architecture layer — a queryable, open-standard record of what every agent knows, what work is in progress, and what has been resolved. SQLite means no proprietary state format, no vendor-controlled API required to read what the system has done. FedRAMP shared responsibility hinges on understanding what each system component is doing — SQLite makes that legible to anyone, including an auditor who doesn't have access to your tooling.
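To make "legible to anyone" concrete, here is a toy version of the kind of query an auditor could run against a work_items table. The schema below is an illustrative stand-in, not the real coordination.db schema:

```python
import sqlite3

# In-memory stand-in for coordination.db; the real schema may differ.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE work_items (
        id      INTEGER PRIMARY KEY,
        agent   TEXT NOT NULL,
        status  TEXT NOT NULL,   -- 'queued' | 'in_progress' | 'resolved'
        summary TEXT
    )
""")
db.executemany(
    "INSERT INTO work_items (agent, status, summary) VALUES (?, ?, ?)",
    [
        ("researcher", "resolved",    "survey existing hooks"),
        ("builder",    "in_progress", "wire routing table"),
    ],
)

# Any SQLite client -- including an auditor's -- can ask the same question.
open_items = db.execute(
    "SELECT agent, summary FROM work_items WHERE status != 'resolved'"
).fetchall()
```

No vendor API stands between the question and the answer; the file format is an open standard with decades of tooling behind it.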
Model Routing (Haiku / Sonnet / Opus) → Cost Governance, Right-Sized Capability
Routing decisions are explicit and documented: this task class routes to this model tier, for these reasons. That's cost and token economics in practice — not just spend control, but defensible justification for why capability was matched to task. Government procurement frameworks increasingly require that AI systems demonstrate proportionality. The routing layer is where proportionality is implemented, not claimed.
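An explicit, documented routing table might be as small as this. The task classes, tier assignments, and rationales below are hypothetical examples of the pattern, not the stack's actual routes:

```python
# Hypothetical routing table: one documented tier per task class,
# with the rationale recorded next to the decision.
ROUTES = {
    "classification": ("haiku",  "high volume, low ambiguity"),
    "code_review":    ("sonnet", "needs reasoning, bounded scope"),
    "architecture":   ("opus",   "open-ended, highest stakes"),
}

def route(task_class: str) -> tuple[str, str]:
    """Return (model_tier, rationale). Unknown classes fail loudly
    rather than silently defaulting to the most expensive tier."""
    if task_class not in ROUTES:
        raise ValueError(f"no documented route for {task_class!r}")
    return ROUTES[task_class]
```

The rationale string riding alongside every tier is what turns spend control into a defensible proportionality record.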
Models as Commodity Input → Vendor Independence
When the models are at the bottom and the architecture is above them, swapping a model doesn't require rewiring the system. Claude, Gemini, and GPT are all plausible inputs. This is vendor independence in the meaningful sense — not just a talking point, but a structural property that survives procurement cycles and model deprecations.
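The structural property described above can be sketched as a narrow interface that the rest of the stack codes against. The adapter classes and the `complete` method here are illustrative assumptions, not real SDK calls:

```python
from typing import Protocol

class Model(Protocol):
    """The only model surface the rest of the stack is allowed to see."""
    def complete(self, prompt: str) -> str: ...

class ClaudeAdapter:
    def complete(self, prompt: str) -> str:
        return f"[claude] {prompt}"   # stand-in for the real API call

class GeminiAdapter:
    def complete(self, prompt: str) -> str:
        return f"[gemini] {prompt}"

def run_pipeline(model: Model, prompt: str) -> str:
    # Routing, hooks, and coordination live above this line and never
    # import a vendor SDK directly -- swapping models is one argument.
    return model.complete(prompt)
```

When a model is deprecated, the change is one new adapter, not a rewrite of the layers above it.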
"This isn't a compliance retrofit. It's what you get when the architecture is designed to be understood."
Build Something Defensible
If you're building with AI and want infrastructure that creates clear, attributable value — let's talk.