NXA AI Operating System, Multi-Agent Production-Grade Architecture
A markdown-driven AI operating system orchestrating 8 parallel sub-agents across LinkedIn outreach, content generation, inbox triage, and deep research. It self root-caused Anthropic Claude Code Issue #9458, shipped a complete multi-tenant LinkedIn Automation product as a downstream artifact, and runs multi-million-token research workflows on demand.
TL;DR
A markdown-driven AI operating system orchestrating 8 to 10 specialised sub-agent types, with dozens of sub-agent instances running concurrently across LinkedIn outreach, content generation, inbox triage, and deep research. Built for AI-native agency founders and operators who need a system that ships products, not just chat replies. No build step, no runtime, no proprietary dependencies. All state, identity, and agent definitions live in plain markdown tracked in git.
The Problem
Single-agent AI tools collapse on real business operations. The context window fills in 30 minutes. State evaporates between sessions. Multi-step workflows lose their audit trails. The agency founder running a multi-channel revenue engine (LinkedIn outreach, content production, inbox triage, client research, codebase maintenance) cannot keep doing it in a chat window. They need an operating system, not an assistant.
Specifically, the system had to deliver:
- Persistent identity that survives session resets (company voice, founder profile, current sprint, append-only decision log).
- Parallel sub-agents that can be spawned, monitored, and return structured results without blowing the parent context window.
- Controlled autonomy where different files carry different write permissions, with founder-approval gates enforced at the tool layer.
- Native git tracking of all state changes, so any decision weeks later is auditable by line.
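The controlled-autonomy requirement can be pictured as a small lookup from file path to permission tier. The sketch below is illustrative only: the tier names, the path patterns, and the `may_write` helper are assumptions made for this example, not the OS's actual rules, and the source states the real hooks are written in bash and PowerShell rather than Python.

```python
from enum import Enum
from fnmatch import fnmatch


class Tier(Enum):
    OPEN = "open"          # agents may write freely
    APPROVAL = "approval"  # write allowed only after founder approval
    LOCKED = "locked"      # never writable by an agent


# Hypothetical path patterns. The first matching rule wins, so order matters.
RULES = [
    ("secrets/*", Tier.LOCKED),
    ("identity/*.md", Tier.APPROVAL),
    ("*", Tier.OPEN),
]


def tier_for(path: str) -> Tier:
    """Return the tier of the first rule whose pattern matches the path."""
    for pattern, tier in RULES:
        if fnmatch(path, pattern):
            return tier
    return Tier.LOCKED  # fail closed if no rule matches


def may_write(path: str, founder_approved: bool = False) -> bool:
    """Decide whether a write to `path` should be let through."""
    tier = tier_for(path)
    if tier is Tier.OPEN:
        return True
    if tier is Tier.APPROVAL:
        return founder_approved
    return False  # LOCKED: approval does not help


if __name__ == "__main__":
    for candidate in ("wiki/notes.md", "identity/voice.md", "secrets/api.env"):
        print(candidate, may_write(candidate))
```

Failing closed when no rule matches keeps a misconfigured rule list from silently widening access, which fits the goal of enforcing gates at the tool layer rather than trusting the prompt.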
Outcome
The OS has become a force multiplier across every front it touches. The strongest proofs to date:
- Built a complete multi-tenant LinkedIn Automation product end-to-end as a downstream artifact: database schema, FastAPI backend, Celery worker, Next.js dashboard, rolling-window rate limiter, distributed locks, webhook orchestration.
- Self root-caused Anthropic Claude Code Issue #9458 (cross-platform sub-agent silent-write bug). The OS instrumented the failure, isolated the cause, and filed the public issue without manual intervention.
- Runs multi-million-token deep-research workflows dispatching parallel sub-agents against complex discovery and planning goals across any domain, from market intelligence to architectural decisions.
- Manages multi-month, multi-platform social media campaigns (LinkedIn primary, others extending) with consistent delivery and cycle-over-cycle improvements ranging from 50% to 5x across outreach, content, and inbox throughput.
- Architecture hardened by issue-driven engineering. Every reliability bug discovered in production becomes a system-wide rule encoded into the operating manual, so the OS gets safer the longer it runs.
- Zero proprietary lock-in. State is git-trackable markdown, agents are markdown, hooks are bash. Any operator can fork, audit, or migrate the OS with no vendor dependency.
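The rolling-window rate limiter named in the product bullet above is a general technique, and a minimal in-memory sketch of it follows. This is not the product's code: the class name, parameters, and the choice to pass the current time in explicitly are assumptions for illustration, and a multi-tenant deployment would presumably keep the timestamps in a shared store rather than in process memory.

```python
from collections import deque


class RollingWindowLimiter:
    """Allow at most `max_actions` within any trailing `window_seconds`."""

    def __init__(self, max_actions: int, window_seconds: float):
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self._stamps = deque()  # timestamps of allowed actions, oldest first

    def allow(self, now: float) -> bool:
        # Drop timestamps that have slid out of the trailing window.
        cutoff = now - self.window_seconds
        while self._stamps and self._stamps[0] <= cutoff:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_actions:
            return False  # window is full, reject without recording
        self._stamps.append(now)
        return True


if __name__ == "__main__":
    limiter = RollingWindowLimiter(max_actions=2, window_seconds=10.0)
    for t in (0.0, 1.0, 2.0, 10.0, 10.5, 11.0):
        print(t, limiter.allow(t))
```

Unlike a fixed-window counter that resets on a clock boundary, the rolling window measures from each request backwards, so a burst straddling a boundary cannot exceed the cap.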
Tech Stack
| Component | Technology | Purpose |
|---|---|---|
| OS substrate | Markdown + git | All state, identity, and agent definitions live in plain files |
| Orchestration runtime | Claude Code (Opus 4.7, 1M context) | Reads CLAUDE.md, dispatches sub-agents, executes tool calls |
| Sub-agent runtime | Anthropic Agent SDK + MCP | Parallel sub-agent dispatch with structured JSON return contracts |
| Persistence | Supabase Postgres with RLS | Pipeline state for outreach, drafts, audit logs |
| Edge compute | Cloudflare Workers + Hono | Inbound webhooks, leak detection, draft persistence |
| Security layer | Bash + PowerShell hooks | PreToolUse / PostToolUse hooks for top-secret writes, audit log, leak detection |
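To make the "structured JSON return contracts" row concrete, here is a minimal sketch of a parent validating sub-agent replies after a parallel dispatch. Everything in it is assumed for illustration: the field names (`agent`, `status`, `summary`, `artifacts`), the `fake_agent` stand-in, and the use of a thread pool. It does not call the Agent SDK or MCP and should not be read as their API.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class AgentResult:
    agent: str
    status: str      # "ok" or "error"
    summary: str     # short text the parent keeps in its own context
    artifacts: list  # paths to files holding the full output


# Hypothetical contract: required field name -> expected JSON type.
REQUIRED = {"agent": str, "status": str, "summary": str, "artifacts": list}


def parse_result(raw: str) -> AgentResult:
    """Validate a sub-agent's raw JSON reply against the contract."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    for key, expected in REQUIRED.items():
        if not isinstance(data.get(key), expected):
            raise ValueError(f"field {key!r} missing or not {expected.__name__}")
    if data["status"] not in ("ok", "error"):
        raise ValueError("status must be 'ok' or 'error'")
    return AgentResult(**{key: data[key] for key in REQUIRED})


def fake_agent(name: str) -> str:
    """Stand-in for a real sub-agent call; returns a JSON string."""
    return json.dumps({
        "agent": name,
        "status": "ok",
        "summary": f"{name} finished",
        "artifacts": [f"out/{name}.md"],
    })


def dispatch(names: list) -> list:
    """Run the stand-in agents in parallel and validate each reply."""
    with ThreadPoolExecutor(max_workers=max(1, len(names))) as pool:
        # pool.map yields results in the same order as `names`.
        return [parse_result(raw) for raw in pool.map(fake_agent, names)]


if __name__ == "__main__":
    for result in dispatch(["evaluator", "deep-researcher"]):
        print(result.agent, result.status, result.artifacts)
```

Returning only a short summary plus artifact paths is one way to keep bulky output out of the parent's context window, which is the constraint the parallel sub-agent requirement describes.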
Key Functionality
- 8 to 10 specialised sub-agent types (linkedin-outreach, linkedin-followup, linkedin-posts, dev-agent, evaluator, deep-researcher, chief-of-staff, setup-wizard, content-week, research-lead) with dozens of sub-agent instances launching concurrently per workflow. Each has a capped 5-file workspace.
- Slash commands for every recurring operation (/morning, /evening, /weekly-review, /smart-compact, /checkpoint, /security-audit, /content-week, /research-lead and more).
- PreToolUse and PostToolUse hooks that block top-secret writes, audit every tool call to JSONL, and scan outbound drafts for leak patterns before they ship.
- Skills system that loads task-specific procedures (visuals, content-week, propose-secret-edit) only when relevant, so the boot context stays lean.
- MCP-driven tool integration (Supabase queries, web research, doc fetch) with structured return contracts.
- Karpathy-style wiki pattern: raw/ holds immutable source documents, wiki/ holds AI-maintained compiled knowledge that grows and cross-references itself across sessions.
- Append-only decision log so every choice (do, defer, reject) is auditable weeks later by file and line.
- Carry-forward snapshots that survive compaction events, preserving anchors, credentials map, active commitments, and “what to work on next” across sessions.
- All state git-tracked. Every change is a commit. Roll back any decision, audit any change, fork the entire OS.
- Further capabilities layered in over months of production use, from cron-driven autonomous loops to deep-research sub-agent dispatch with structured verification queries.
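The hook behaviour listed above (audit every tool call to JSONL, scan outbound drafts for leak patterns) can be sketched in a few lines. This is a hedged illustration, not the OS's hooks, which the source says are bash and PowerShell. The regex patterns, the `tool_name` and `tool_input` event fields, and the convention of returning 2 to mean "block" are all assumptions made for this example.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical leak patterns; a real deployment would tune its own list.
LEAK_PATTERNS = {
    "api_key": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "bearer_token": re.compile(r"\bBearer\s+[A-Za-z0-9._-]{20,}"),
}


def find_leaks(text: str) -> list:
    """Return the names of all patterns found in an outbound draft."""
    return [name for name, rx in LEAK_PATTERNS.items() if rx.search(text)]


def handle_tool_call(event: dict, log_path: str) -> int:
    """Audit one tool call and return 0 (allow) or 2 (block).

    `event` is assumed to carry `tool_name` and `tool_input` keys.
    """
    draft = json.dumps(event.get("tool_input", {}))
    leaks = find_leaks(draft)
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "tool": event.get("tool_name", "unknown"),
        "allowed": not leaks,
        "leaks": leaks,  # pattern names only, never the matched secret
    }
    # Append mode keeps earlier lines untouched: one JSON object per line.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return 2 if leaks else 0


if __name__ == "__main__":
    sample = {"tool_name": "Write", "tool_input": {"content": "hello"}}
    print(handle_tool_call(sample, "audit.jsonl"))
```

Logging only the pattern name, not the matched text, keeps the audit log itself from becoming a second place where secrets leak.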
[Figure: terminal session, decision log, sub-agent dispatch trace, wiki cross-reference graph]
[Video: 60-second tour, OS booting and a sub-agent dispatch]
Building something similar?
If a multi-agent pipeline, voice AI deployment, or production automation system is on your roadmap, let's talk through how this applies to your context.