@vial-agent/runtime — live on npm · Helix on MPPScan

AI agents that
never need babysitting

They fail. They fix themselves. They learn.
VialOS gives any AI agent self-healing DNA + persistent memory.

70%
agent failure rate on real tasks
CMU 2025
95%
of AI pilots yield zero P&L impact
MIT 2025
42%
of companies abandoned AI in 2025
S&P Global

The problem isn't the model. It's that agents learn nothing from failure.

▶ See it work
npm install @vial-agent/runtime
The Math Problem

Compound failure kills agents

// Agent accuracy: 85% per step, on a 10-step task:
85% × 85% × 85% × 85% × 85% × 85% × 85% × 85% × 85% × 85% = 19.7% success rate
80% of the time: failure. And the agent learned absolutely nothing.
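The arithmetic above can be checked in a few lines. This is a minimal sketch that assumes every step succeeds independently with the same accuracy; `compoundSuccessRate` is an illustrative helper, not part of @vial-agent/runtime.

```typescript
// Probability that an agent completes every step of a task, assuming
// each step succeeds independently with the same per-step accuracy.
function compoundSuccessRate(stepAccuracy: number, steps: number): number {
  if (stepAccuracy < 0 || stepAccuracy > 1) {
    throw new RangeError("stepAccuracy must be between 0 and 1");
  }
  if (!Number.isInteger(steps) || steps < 0) {
    throw new RangeError("steps must be a non-negative integer");
  }
  return Math.pow(stepAccuracy, steps);
}

const rate = compoundSuccessRate(0.85, 10);
console.log(`${(rate * 100).toFixed(1)}% success rate`); // 19.7% success rate
```

Under the same assumption, the rate keeps falling as tasks get longer, because each extra step multiplies in another 0.85.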
Without VialOS
Agent fails on same error — Day 1
Agent fails on same error — Day 100
Every failure is manual intervention
Zero institutional learning
You are the error handler
With VialOS
Failure detected → PCEC diagnoses
Gene Map: best strategy (96% confidence)
Repaired on attempt 1
Knowledge persists forever
You do nothing
+ vialos warmup
Run once before first deployment
50 simulated failures in sandbox
Gene Map pre-seeded with 16+ patterns
Day 1: agent already knows what works
Cold start problem: eliminated
Why we built this
DoorDash — 4 years, ML Infrastructure
// A normal Tuesday in ML infra
Sentry
agent_timeout · 47 new occurrences
engineer investigates manually
Scalyr
duplicate error log storm · ~200/min
triage, find root cause
CI/CD
pipeline failed · same flaky test
manual re-run #3
E2E
checkout flow broken · unknown cause
page on-call engineer
Datadog
memory spike · service restart needed
ssh in, restart manually
Tomorrow: same errors. Same manual fixes. Nothing learned.
The irony: DoorDash had exceptional data quality — some of the best ML data in the industry. But every day, engineers were manually handling the same errors. The data was there to learn from. The system couldn't.

Great observability.
Zero self-healing.
The gap nobody fixed.

Datadog shows you the error. Sentry captures the stack trace. Scalyr logs every occurrence.

Then an engineer gets paged. They fix it manually. Two weeks later: same error, same page, same fix.

Nobody built the layer that connects observation to action — that learns what fixed it and applies that knowledge next time.
Observe
Datadog · Sentry · Scalyr
?
gap
Fix + Learn
VialOS
"Something broke" → "Fixed itself, won't break the same way again"
At xAI, running Grok-3 on 100,000 GPUs, I saw how this gets solved at hyperscale — proprietary systems, completely closed, never shared.

VialOS is the self-healing layer that DoorDash needed. MIT-licensed open source. One line of code. For every team that can't build their own Colossus.
"I spent 4 years watching great engineers waste their time on the same alerts, the same manual restarts, the same errors. The data to fix it automatically was all there. The system just couldn't learn. VialOS is that system."
— Adrian Bai, founder · 4 years ML Infra @ DoorDash · Grok-3 core training team @ xAI
4 yrs
lived this pain
100K+
GPUs at xAI
500M+
users at X
Interactive Demo

See VialOS in action

From one agent to an autonomous company — the full vision

Core Engine

PCEC — self-repair loop

Runs automatically on every failure. Gets better every iteration.

P
Perceive
Classify the failure. Query Gene Map for matching patterns. Assess severity.
C
Construct
Generate repair strategies. Gene Map patterns rank by past success rate.
E
Evaluate
Score each strategy with Q-learning. Select highest confidence path.
C
Commit
Execute repair. Record outcome. Q-value updates. System learns immediately.
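The four phases above can be sketched as one small loop. Everything here is an assumption made for illustration: the `Strategy` shape, the toy `perceive` classifier, the in-memory `geneMap`, and the learning rate `ALPHA` are hypothetical and are not the actual @vial-agent/runtime API.

```typescript
// Hypothetical shapes; the real runtime's types are not shown in this document.
interface Strategy {
  name: string;
  q: number;            // learned confidence in [0, 1]
  apply: () => boolean; // returns true if the repair worked
}

const ALPHA = 0.3; // assumed learning rate

// In-memory stand-in for the Gene Map: failure kind -> known repair strategies.
const geneMap = new Map<string, Strategy[]>();

// Perceive: classify the failure from its message (toy classifier).
function perceive(error: Error): string {
  if (/timeout/i.test(error.message)) return "timeout";
  if (/rate.?limit/i.test(error.message)) return "rate_limit";
  return "unknown";
}

function pcec(error: Error): { repaired: boolean; attempts: number } {
  const kind = perceive(error);
  // Construct: collect candidate strategies for this failure kind.
  const candidates = (geneMap.get(kind) ?? []).slice();
  // Evaluate: rank by learned Q-value, highest confidence first.
  candidates.sort((a, b) => b.q - a.q);
  let attempts = 0;
  for (const strategy of candidates) {
    attempts += 1;
    // Commit: execute the repair, record the outcome, update the Q-value.
    const ok = strategy.apply();
    strategy.q += ALPHA * ((ok ? 1 : 0) - strategy.q);
    if (ok) return { repaired: true, attempts };
  }
  return { repaired: false, attempts };
}

// Usage: two known strategies for timeouts; the higher-q one runs first.
const retryNow: Strategy = { name: "retry_immediately", q: 0.4, apply: () => false };
const backoff: Strategy = { name: "retry_with_backoff", q: 0.7, apply: () => true };
geneMap.set("timeout", [retryNow, backoff]);

const outcome = pcec(new Error("agent_timeout after 30s"));
console.log(outcome, backoff.q);
```

In this sketch the backoff strategy is tried first because it has the higher Q-value, it succeeds, and its Q-value rises, so it is ranked even further ahead next time.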
Gene Map — The Memory Layer
Not just memory. Expertise with success rates.
Vector databases store "I've seen this." Gene Map stores "this strategy works 96% of the time." Every failure updates Q-values in real time. The more it runs, the more accurate it gets.
Run 1: 3 strategies tried (q=0.70)
Run 10: best strategy selected (q=0.85)
Run 50: instant fix, first try (q=0.97)
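How a Q-value climbs with repeated successes can be illustrated with the standard incremental update q ← q + α(r − q). This is a stateless simplification of Q-learning (full Q-learning also adds a discounted next-state term), and the page does not state which update rule or learning rate VialOS uses. The starting value 0.5 and α = 0.1 below are assumptions, so the numbers are not meant to reproduce the q-values listed above.

```typescript
// One incremental update: move q toward the observed reward
// (1 = the repair worked, 0 = it failed) by a fraction alpha.
function updateQ(q: number, reward: number, alpha: number): number {
  return q + alpha * (reward - q);
}

// Fold a sequence of repair outcomes into a single Q-value.
function simulate(initialQ: number, alpha: number, outcomes: boolean[]): number {
  return outcomes.reduce((q, ok) => updateQ(q, ok ? 1 : 0, alpha), initialQ);
}

const tenWins: boolean[] = new Array<boolean>(10).fill(true);
const afterTenWins = simulate(0.5, 0.1, tenWins);
const afterOneLoss = updateQ(afterTenWins, 0, 0.1);
console.log(afterTenWins.toFixed(3), afterOneLoss.toFixed(3));
```

Successes pull the value toward 1 with diminishing steps, and a single failure pulls it back down, which is what lets a ranking by Q-value track a strategy's recent success rate.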
Market Opportunity

Where self-healing is needed now

Three markets need this today. Helix (payments) is our first vertical and future bet.

Who buys VialOS

Companies that run on open source tooling — not companies that build everything themselves.

✓ Our customer (Buy/Use philosophy)
DoorDash: OSS-first infra, heavy Datadog/Sentry/Scalyr users
Lyft: Open-sourced Amundsen, Flyte, Cartography
Airbnb: Airflow, Superset, Nerve (built for the community)
Stripe: Brilliant infra team, uses best-in-class OSS tools
Linear / Vercel / Clerk: Small eng team, engineering excellence culture
Series A-C AI companies: 20-150 engineers, already running agents
Stack: Datadog + Sentry + GitHub Actions + Scalyr/Coralogix + PagerDuty
Pain: same errors, manual fixes, no learning, engineers paged at 3am
✗ Not our customer (Build everything)
Uber: Builds its own service mesh, internal ML platform, even JIRA replacements
Amazon: Doesn't trust external tools; if it exists, AWS will build it
xAI / Anthropic / OpenAI: Their infra IS their competitive advantage; they can't open source it
Google / Meta: Colossus, Borg, Tupperware (all internal, all locked)
Why they won't compete: their self-healing infra is internal, proprietary, and a moat they'll never open source.
They validate the problem. We serve everyone else.
⚙️
Code & Dev Agents
URGENCY: NOW
$25.7B by 2030
market size
The Pain
Claude Code, Cursor, Copilot: used by millions of developers daily. Build errors, test failures, and lint errors happen constantly. Every failure interrupts a human.
VialOS Solution
Gene Map learns: TypeScript errors, build failures, test assertions. Run 1: 3 strategies tried. Run 50: instant fix. Developer never interrupted.
Data: AI code gen market growing 46% CAGR
Product

One runtime. Four user types.

VialOS scales from a single agent to an autonomous company.

👨‍💻 I want my agent to stop failing
Developer journey
1
npm install @vial-agent/runtime
One line. Your agent gets self-healing DNA.
2
Agent hits an error
PCEC kicks in automatically. No config needed.
3
Gene Map learns the fix
Next time same error → instant repair. You do nothing.
4
Agent gets smarter every run
Q-values update. Confidence grows. Failures drop to near-zero.
Final Form

Autonomous Company System

24 agents. 6 squads. One objective.
Zero human interventions required for execution.

Agent Roster (24)
Leadership
ChairAgent 🪑
IntelAgent 🔭
SecretaryAgent 📝
Engineering
CodeAgent ⚙️
ArchitectAgent 🏗️
ReviewAgent 🔍
TestAgent 🧪
DebugAgent 🐛
Security
SecurityEngineer 🔒
AuditAgent 🛡️
ComplianceAgent 📋
Operations
DevOpsAutomator 🚀
MonitorAgent 📡
IncidentCommander 🚨
PerformanceAgent ⚡
Growth
GrowthAgent 📈
ContentAgent ✍️
AnalyticsAgent 📊
SEOAgent 🔍
Support
CustomerAgent 💬
DocsAgent 📚
OnboardAgent 🎯
FeedbackAgent 🗣️
TriageAgent 🏥
AMP MESSAGE STREAM
0 messages
Waiting for simulation...
PCEC LOG
No failures yet
GENE MAP
14
patterns accumulated
nonce_err · rate_limit · timeout · test_fail · ctx_overflow
Organization

24 Agents. 6 Squads. Your execution team.

Every agent communicates via AMP Protocol. Every failure heals via PCEC.
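What "structured messages" between agents might look like can be sketched as a typed envelope plus a tiny in-process bus. The AMP wire format is not described on this page, so the `AmpMessage` fields, the `MessageBus` class, and the message types below are all hypothetical.

```typescript
// Hypothetical message envelope; field names are assumptions.
type MessageType = "task" | "result" | "failure";

interface AmpMessage {
  id: number;
  from: string;
  to: string;
  type: MessageType;
  payload: Record<string, unknown>;
}

type Handler = (msg: AmpMessage) => void;

// Minimal in-process bus: one handler per agent name, with a message log.
class MessageBus {
  private handlers = new Map<string, Handler>();
  private nextId = 1;
  readonly log: AmpMessage[] = [];

  subscribe(agent: string, handler: Handler): void {
    this.handlers.set(agent, handler);
  }

  send(msg: Omit<AmpMessage, "id">): AmpMessage {
    const handler = this.handlers.get(msg.to);
    if (!handler) throw new Error(`no agent subscribed as ${msg.to}`);
    const full: AmpMessage = { id: this.nextId++, ...msg };
    this.log.push(full);
    handler(full);
    return full;
  }
}

// Usage: ChairAgent delegates a task, CodeAgent replies with a result.
const bus = new MessageBus();
const chairInbox: AmpMessage[] = [];
bus.subscribe("ChairAgent", (msg) => {
  chairInbox.push(msg);
});
bus.subscribe("CodeAgent", (msg) => {
  bus.send({
    from: "CodeAgent",
    to: msg.from,
    type: "result",
    payload: { done: msg.payload.objective },
  });
});
bus.send({
  from: "ChairAgent",
  to: "CodeAgent",
  type: "task",
  payload: { objective: "add retry logic" },
});
console.log(bus.log.map((m) => `${m.id}:${m.from}->${m.to}:${m.type}`));
```

A typed envelope like this means a "failure" message can be routed to a repair loop without parsing free text.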

🪑
Leadership Squad
3 agents
ChairAgent
Orchestrates all squads, decomposes objectives
IntelAgent
Competitive intelligence, tech landscape scanning
SecretaryAgent
Meeting notes, summaries, status reports
⚙️
Engineering Squad
5 agents
CodeAgent
Implementation, feature development
ArchitectAgent
System design, module planning
ReviewAgent
Code review, quality assurance
TestAgent
Test generation, coverage analysis
DebugAgent
Bug diagnosis, PCEC-assisted fixes
🔒
Security Squad
3 agents
SecurityEngineer
OWASP scanning, vulnerability detection
AuditAgent
Compliance checks, access reviews
ComplianceAgent
Regulatory monitoring, policy enforcement
🚀
Operations Squad
4 agents
DevOpsAutomator
CI/CD, deployment, infrastructure
MonitorAgent
Production monitoring, alerting
IncidentCommander
Incident response, escalation
PerformanceAgent
Load testing, optimization
📈
Growth Squad
4 agents
GrowthAgent
User acquisition, funnel optimization
ContentAgent
Blog posts, documentation, marketing
AnalyticsAgent
Metrics tracking, reporting
SEOAgent
Search optimization, keyword analysis
💬
Support Squad
5 agents
CustomerAgent
Customer interactions, issue resolution
DocsAgent
API documentation, guides
OnboardAgent
New user onboarding flows
FeedbackAgent
User feedback collection, analysis
TriageAgent
Issue classification, routing
All 24 agents share one Gene Map. Knowledge from any agent benefits every agent.
System Architecture

How it all connects

Click any component to learn what it does and how data flows through it.

USER / AGENT
Your Agent
any LLM agent
Claude Code
Phase 1 adapter
ClaudeCodeAdapter
1-line integration
VIALOS RUNTIME
PCEC Self-Repair Engine
P
Perceive
Diagnose
C
Construct
Strategies
E
Evaluate
Q-scoring
C
Commit
Execute+learn
Gene Map
SQLite · Q-learning · persistent memory
Gene Dream
orient→gather→consolidate→prune
AMP Protocol Bus
agent-to-agent · structured messages
4-Layer Compaction
context window mgmt
Model Adapter
Claude · OpenAI · Ollama
TOOLS
Bash
shell execution
File I/O
read / write
Web Fetch
HTTP requests
MCP Client
external tools
Gene Capsule
capture i/o
MODELS
Claude
Anthropic API
OpenAI
Phase 2+
Ollama
local models
VERTICAL
Helix
payment healing · MPPScan · $0.01/heal
Your Vertical
DevOps · Legal · Data · Customer...
Proof

VialOS is its own best customer

This isn't a demo of what VialOS could do someday. VialOS agents built VialOS. Every line of code, every test, every deploy.

⚙️
CodeAgent
Wrote @vial-agent/runtime source code
8d8d6a4
🔍
ReviewAgent
Reviewed every PR in this repo
30c9c26
🚀
DevOpsAutomator
Set up CI/CD + Railway deploy
db51f80
🔭
IntelAgent
Found KAIROS in Claude Code source
internal
🧪
TestAgent
Generated 522 tests across 54 files
c8a533f
📝
SecretaryAgent
Produces the showcase you're reading
ongoing
One person. Backed by 24 agents.
Adrian (founder) sets direction and approves key decisions. VialOS handles everything else — coding, reviewing, testing, deploying, monitoring.

This is what VialOS Cloud sells you: the same system, for your company.
Roadmap

From runtime to operating system

01
Phase 01: Self-Healing Runtime
✅ LIVE
PCEC engine — 4-phase self-repair loop
Gene Map — persistent pattern memory with Q-learning
AMP Protocol — agent-to-agent messaging
@vial-agent/runtime on npm
Already powering Helix (payment healing, live on MPPScan)
02
Phase 02: Autonomous Company
🔨 BUILDING
24-agent system with 6 specialized squads
ChairAgent orchestration + delegation
Gene Dream — offline learning from failure capsules
KAIROS autonomous mode
Cross-agent Gene Map sharing
03
Phase 03: VialOS Cloud
📐 DESIGNED
Multi-tenant Gene Map (PostgreSQL → distributed)
Cloud dashboard — agent monitoring, Gene Map analytics
Enterprise SSO, audit logs, compliance
Usage-based pricing: per heal, per observe, per agent
SDK for any framework: LangChain, CrewAI, AutoGen
04
Phase 04: One-Person Company OS
🌟 WHERE WE'RE GOING
You define the goal. VialOS works out the how, then executes and delivers it.
Full company operations from a single prompt
Agents that hire, fire, and manage other agents
Distributed Gene Map — global failure intelligence
Self-improving agent population via Gene Dream
Memory Architecture

Persistent memory that scales

Yes, SQLite has scale limits. Here's the roadmap.
Gene Map stores failure→strategy→confidence triplets. Not embeddings. Not chat logs. Actionable expertise.
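A failure→strategy→confidence triplet can be sketched as a plain record, with lookup as an exact match on the failure kind rather than a similarity search. The `GeneRecord` shape, the strategy names, and the confidence values below are illustrative assumptions, not the actual Gene Map schema.

```typescript
// Hypothetical record shape for one failure -> strategy -> confidence triplet.
interface GeneRecord {
  failure: string;
  strategy: string;
  confidence: number; // learned success rate in [0, 1]
}

// Illustrative data only; strategy names and numbers are made up.
const records: GeneRecord[] = [
  { failure: "rate_limit", strategy: "exponential_backoff", confidence: 0.96 },
  { failure: "rate_limit", strategy: "retry_immediately", confidence: 0.31 },
  { failure: "timeout", strategy: "increase_timeout", confidence: 0.82 },
];

// Return the highest-confidence strategy recorded for a failure kind,
// or undefined if this failure has never been seen.
function bestStrategy(all: GeneRecord[], failure: string): GeneRecord | undefined {
  let best: GeneRecord | undefined;
  for (const r of all) {
    if (r.failure !== failure) continue;
    if (best === undefined || r.confidence > best.confidence) best = r;
  }
  return best;
}

const pick = bestStrategy(records, "rate_limit");
console.log(pick ? `${pick.strategy} (${pick.confidence})` : "no known strategy");
```

Because each record carries a success rate, the lookup answers "what should I try first?" directly, which is the difference from storing embeddings or chat logs.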

SQLite
CURRENT
Single-file embedded database. Zero config. Ships with the runtime.
Strengths
Zero setup — works immediately
File-based — easy to backup/inspect
Fast for single-agent workloads
Tradeoffs
Single writer — no concurrent multi-agent writes
No network access — local only
Scale ceiling ~100K patterns
PostgreSQL
PHASE 3
Full relational database. Multi-tenant. Cloud-ready.
Strengths
Concurrent multi-agent writes
Full SQL — complex Gene Map queries
Enterprise-grade reliability
Tradeoffs
Requires server infrastructure
Higher operational complexity
Distributed
PHASE 4
Global Gene Map. Every agent on every team benefits from every failure everywhere.
Strengths
Cross-organization learning
Global failure intelligence network
Near-infinite horizontal scale
Tradeoffs
Requires careful privacy/isolation design
Eventual consistency tradeoffs
4-Layer Memory Compaction
Hot
Last 5 runs
100%
Warm
Last 50 runs
Top patterns
Cold
All time
High-confidence only
Archive
Gene Dream
Distilled recipes
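The four layers above can be read as a retention rule. The sketch below is one possible interpretation: the cutoffs `WARM_TOP_N` and `HIGH_CONFIDENCE`, the `PatternStats` fields, and the choice to send everything else to the archive layer for distillation are assumptions, not documented VialOS behavior.

```typescript
type Tier = "hot" | "warm" | "cold" | "archive";

// Hypothetical per-pattern statistics.
interface PatternStats {
  runsAgo: number;    // how many runs ago the pattern was last seen (0 = latest run)
  rank: number;       // 1 = most successful pattern overall
  confidence: number; // learned success rate in [0, 1]
}

const WARM_TOP_N = 10;        // assumed cutoff for "top patterns"
const HIGH_CONFIDENCE = 0.9;  // assumed cutoff for "high-confidence only"

function tierFor(p: PatternStats): Tier {
  // Hot: everything from the last 5 runs is kept in full.
  if (p.runsAgo < 5) return "hot";
  // Warm: from the last 50 runs, keep only the top-ranked patterns.
  if (p.runsAgo < 50 && p.rank <= WARM_TOP_N) return "warm";
  // Cold: from all time, keep only high-confidence patterns.
  if (p.confidence >= HIGH_CONFIDENCE) return "cold";
  // Archive: the rest is left for offline distillation.
  return "archive";
}

console.log(tierFor({ runsAgo: 2, rank: 99, confidence: 0.1 })); // hot
console.log(tierFor({ runsAgo: 500, rank: 40, confidence: 0.3 })); // archive
```

Each layer keeps less than the one before it, so the context spent on memory stays bounded as the run history grows.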
Traction

Already in motion

Live
on MPPScan
Agent-native payment healing, discoverable on MPPScan
522
tests passing
54 test files, CI/CD on GitHub Actions
0.3.0
@vial-agent/runtime
npm — any agent, one line
Coinbase
in conversation
COO intro, blog post pipeline
Privy
wants demo
Stripe subsidiary, Archetype portfolio
Tempo
PR submitted
gakonst (CTO) review pending
Helix — Proof That VialOS Works in Production
Self-Healing Payment SDK — Live on MPPScan
Helix is the first vertical built on VialOS Runtime. It proves the architecture works in a real production environment — real transactions, real failures, real self-healing. The agent payment market is nascent today, but Helix gives us a live proof point as it scales.
POST /heal
$0.01 per repair
POST /observe
$0.001 per check
GET /gene-map
FREE public
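A rough monthly bill under the prices listed above can be estimated as follows. The per-call prices come from this page; the usage figures and the `estimateCostUsd` helper are illustrative assumptions. Amounts are summed in thousandths of a dollar so the total is exact integer arithmetic.

```typescript
// Prices from the page, in thousandths of a dollar (1 unit = $0.001).
const PRICE = {
  heal: 10,    // POST /heal: $0.01 per repair
  observe: 1,  // POST /observe: $0.001 per check
  geneMap: 0,  // GET /gene-map: free
} as const;

interface Usage {
  heals: number;
  observes: number;
  geneMapReads: number;
}

// Returns the total cost formatted as dollars with three decimals.
function estimateCostUsd(usage: Usage): string {
  const units =
    usage.heals * PRICE.heal +
    usage.observes * PRICE.observe +
    usage.geneMapReads * PRICE.geneMap;
  return (units / 1000).toFixed(3);
}

// Hypothetical month: 120 repairs, 5,000 checks, 300 Gene Map reads.
const bill = estimateCostUsd({ heals: 120, observes: 5000, geneMapReads: 300 });
console.log(`$${bill}`); // $6.200
```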

Stop babysitting your agents.
Let them learn.

One line of code. Self-healing DNA. Persistent memory.
Your agent gets smarter every single run.

Star on GitHub
npm install @vial-agent/runtime

MIT License · Open source core · Helix on MPPScan