Architecture
Overview
+-----------------------------------------------------+
|                AI Agent (MCP Client)                |
+--------------------------+--------------------------+
                           | SSE (MCP Protocol)
                           v
+-----------------------------------------------------+
|                Butt-Dial MCP Server                 |
|                                                     |
|  +----------+   +----------+  +-----------------+   |
|  | MCP Tools|   | Webhooks |  | Admin UI        |   |
|  |(15 tools)|   |(6 routes)|  | Setup/Dashboard |   |
|  +-----+----+   +-----+----+  +-----------------+   |
|        |              |                             |
|  +-----+--------------+---------------------------+ |
|  | Security Layer                                 | |
|  | Auth   Sanitizer   Rate Limiter   Compliance   | |
|  +-----+--------------------------------------+---+ |
|        |                                      |     |
|  +-----+--------------------------------------+---+ |
|  | Provider Interfaces                            | |
|  | Telephony  Email  WhatsApp  TTS  STT  DB       | |
|  +-----+--------------------------------------+---+ |
|        |                                      |     |
|  +-----+--------------------------------------+---+ |
|  | Provider Adapters                              | |
|  | Twilio  Resend  ElevenLabs  SQLite  S3         | |
|  +------------------------------------------------+ |
+-----------------------------------------------------+
Tech Stack
- Runtime: Node.js 22+ / TypeScript
- MCP: @modelcontextprotocol/sdk (SSE transport)
- HTTP: Express 5
- Database: SQLite (dev) / Postgres (production)
- Telephony: Twilio (default), Vonage
- Email: Resend
- TTS: Edge TTS (free), ElevenLabs, OpenAI
- Voice: Twilio ConversationRelay
Data Flows
Outbound Message
Agent -> comms_send_message -> Auth -> Sanitize -> Compliance Check
-> Rate Limit -> Provider.send() -> Log Usage -> Return Result
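The outbound flow reads naturally as a chain of guards in front of the provider call. The following is a minimal TypeScript sketch of that idea, not the project's implementation: `sendMessage`, `stages`, `providerSend`, the hard-coded token, the blocked number, and the limit of two messages per agent are all hypothetical stand-ins.

```typescript
// Hypothetical sketch of the outbound pipeline. Every name and value here is
// illustrative; none of it is taken from the project's source.
type OutboundRequest = { agentId: string; token: string; to: string; body: string };
type SendResult =
  | { ok: true; messageId: string }
  | { ok: false; stage: string; reason: string };

// A stage either passes the (possibly transformed) request on, or rejects it.
type Stage = {
  name: string;
  run: (req: OutboundRequest) => OutboundRequest | { reject: string };
};

const sentPerAgent = new Map<string, number>(); // naive per-agent counter
const usageLog: string[] = [];

const stages: Stage[] = [
  { name: "auth", run: (r) => (r.token === "valid-token" ? r : { reject: "invalid token" }) },
  { name: "sanitize", run: (r) => ({ ...r, body: r.body.replace(/[<>]/g, "").trim() }) },
  { name: "compliance", run: (r) => (r.to === "+15550000000" ? { reject: "recipient on DNC list" } : r) },
  {
    name: "rate-limit",
    run: (r) => {
      const count = sentPerAgent.get(r.agentId) ?? 0;
      if (count >= 2) return { reject: "limit reached" };
      sentPerAgent.set(r.agentId, count + 1);
      return r;
    },
  },
];

// Stand-in for Provider.send(); a real adapter would call the configured provider.
function providerSend(req: OutboundRequest): string {
  return `msg-${req.agentId}-${req.body.length}`;
}

function sendMessage(req: OutboundRequest): SendResult {
  let current = req;
  for (const stage of stages) {
    const out = stage.run(current);
    if ("reject" in out) return { ok: false, stage: stage.name, reason: out.reject };
    current = out;
  }
  const messageId = providerSend(current);
  usageLog.push(`${current.agentId}:${messageId}`); // "Log Usage" step
  return { ok: true, messageId };
}
```

Because the rate-limit stage runs after the compliance check, a request rejected for compliance reasons does not consume the agent's budget in this sketch.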
Inbound Message
Provider Webhook -> Signature Verify -> Replay Check -> Channel Block Check
-> Parse -> Forward to Agent (or store in dead_letters if offline)
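The ordering of the inbound checks matters: nothing is parsed until the payload is known to be authentic, fresh, and addressed to an unblocked channel. Here is a rough TypeScript sketch of that ordering using Node's built-in `crypto` module. The HMAC-SHA256 scheme, the in-memory sets, and every function name are assumptions made for illustration; real providers define their own signature formats.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical in-memory state; a real deployment would keep this in the database.
const seenSignatures = new Set<string>();                 // replay cache
const blockedChannels = new Set<string>(["agent-2:sms"]); // "<agentId>:<channel>"
const connectedAgents = new Set<string>(["agent-1"]);

type InboundEvent = { agentId: string; channel: string; from: string; text: string };
const forwarded: InboundEvent[] = [];
const deadLetters: InboundEvent[] = [];

type Outcome = "forwarded" | "dead-lettered" | "bad-signature" | "replay" | "blocked";

// Assumed scheme: hex-encoded HMAC-SHA256 over the raw request body.
function sign(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody).digest("hex");
}

function verifySignature(rawBody: string, signature: string, secret: string): boolean {
  const expected = Buffer.from(sign(rawBody, secret), "hex");
  const given = Buffer.from(signature, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === given.length && timingSafeEqual(expected, given);
}

// agentId and channel are assumed to come from the webhook route, not the body.
function handleWebhook(
  agentId: string, channel: string, rawBody: string, signature: string, secret: string,
): Outcome {
  if (!verifySignature(rawBody, signature, secret)) return "bad-signature";
  // Simplification: an identical signed payload counts as a replay.
  if (seenSignatures.has(signature)) return "replay";
  seenSignatures.add(signature);
  if (blockedChannels.has(`${agentId}:${channel}`)) return "blocked";

  const parsed = JSON.parse(rawBody) as { from: string; text: string };
  const event: InboundEvent = { agentId, channel, from: parsed.from, text: parsed.text };
  if (connectedAgents.has(agentId)) {
    forwarded.push(event);
    return "forwarded";
  }
  deadLetters.push(event); // agent offline: keep for later dispatch
  return "dead-lettered";
}
```

Keying the replay cache on the signature is a shortcut for this sketch; a provider-supplied message ID or timestamp would be a more robust key where one is available.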
Live Voice Call
Inbound Call -> Webhook -> ConversationRelay TwiML -> WebSocket
-> Human speaks -> STT (by Twilio) -> Text to Agent (MCP sampling)
-> Agent responds -> Text to Twilio -> TTS (by Twilio) -> Human hears
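On the server side, the live call reduces to handling a small set of JSON messages arriving over the WebSocket. The sketch below is a simplified, synchronous TypeScript dispatcher for the four message types named in this document (setup, prompt, interrupt, dtmf). The field names (`callSid`, `voicePrompt`, `digit`, and the `text`/`token`/`last` reply) reflect Twilio's ConversationRelay messages as commonly documented, but treat them as assumptions and check them against Twilio's reference. `handleRelayMessage`, `CallSession`, and `askAgent` are hypothetical, and a real `askAgent` would be asynchronous (MCP sampling).

```typescript
// Simplified incoming message shapes (assumed field names, see lead-in).
type RelayMessage =
  | { type: "setup"; callSid: string }
  | { type: "prompt"; voicePrompt: string }
  | { type: "interrupt" }
  | { type: "dtmf"; digit: string };

// Reply sent back over the WebSocket for the provider to speak.
type TextReply = { type: "text"; token: string; last: boolean };

type CallSession = {
  callSid: string | null;
  transcript: string[];
  digits: string;
  interrupted: boolean;
};

function newSession(): CallSession {
  return { callSid: null, transcript: [], digits: "", interrupted: false };
}

function handleRelayMessage(
  raw: string,
  session: CallSession,
  askAgent: (utterance: string) => string, // stand-in for the MCP sampling round trip
): TextReply | null {
  const msg = JSON.parse(raw) as RelayMessage;
  switch (msg.type) {
    case "setup":
      session.callSid = msg.callSid;
      return null;
    case "prompt": {
      // The caller's speech has already been transcribed by the provider.
      session.interrupted = false;
      session.transcript.push(`human: ${msg.voicePrompt}`);
      const answer = askAgent(msg.voicePrompt);
      session.transcript.push(`agent: ${answer}`);
      return { type: "text", token: answer, last: true };
    }
    case "interrupt":
      session.interrupted = true; // caller spoke over the agent
      return null;
    case "dtmf":
      session.digits += msg.digit;
      return null;
    default:
      return null; // ignore message types this sketch does not model
  }
}
```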
Answering Machine Fallback
Inbound Call -> WebSocket -> Agent not connected
-> Built-in Anthropic LLM as answering machine
-> Collects message -> Stores in dead_letters
-> Agent reconnects -> Dead letters dispatched
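The last two steps of the fallback (store, then dispatch on reconnect) amount to a per-agent queue that is drained in arrival order. A minimal in-memory TypeScript sketch follows; `DeadLetterStore`, `onAgentReconnect`, and the `DeadLetter` fields are hypothetical names, and the document indicates the real storage is a `dead_letters` table rather than an array.

```typescript
// Hypothetical shape of a stored message; the real dead_letters schema is not shown here.
type DeadLetter = { agentId: string; channel: string; payload: string; storedAt: number };

class DeadLetterStore {
  private letters: DeadLetter[] = [];

  store(letter: DeadLetter): void {
    this.letters.push(letter);
  }

  pending(agentId: string): number {
    return this.letters.filter((l) => l.agentId === agentId).length;
  }

  // Remove and return one agent's letters, oldest first.
  drain(agentId: string): DeadLetter[] {
    const mine = this.letters
      .filter((l) => l.agentId === agentId)
      .sort((a, b) => a.storedAt - b.storedAt);
    this.letters = this.letters.filter((l) => l.agentId !== agentId);
    return mine;
  }
}

// Called when an agent's session is re-established; returns how many letters were dispatched.
function onAgentReconnect(
  store: DeadLetterStore,
  agentId: string,
  deliver: (letter: DeadLetter) => void,
): number {
  const letters = store.drain(agentId);
  for (const letter of letters) deliver(letter);
  return letters.length;
}
```

Draining before delivering keeps the sketch short, but it means a delivery failure would lose the message; a durable version would delete each row only after the agent acknowledges it.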
Key Components
| Component | Description |
|---|---|
| MCP Server | Registers tools, handles SSE transport |
| Provider Factory | Resolves config to adapter instances |
| Auth Guard | requireAgent() / requireAdmin() |
| Token Manager | Token generation, hashing, verification |
| Sanitizer | Input validation (SQL injection, XSS, etc.) |
| Rate Limiter | Per-agent action and spending limits |
| Compliance | Content filter, DNC, TCPA, CAN-SPAM, GDPR |
| Voice Sessions | In-memory store for active call configs |
| Voice WebSocket | WebSocket handler (setup/prompt/interrupt/dtmf) |
| Agent Registry | Maps agentId to MCP Server session |
| Message Dispatcher | Dispatches dead letters on agent reconnect |
| Channel Blocker | Per-channel blocking without deprovisioning |
| Metrics | Prometheus counters and gauges |
| Audit Log | SHA-256 hash-chained event log |
| Alert Manager | Severity-routed alerts |
| Anomaly Detector | Volume spikes, brute force, rapid rotation |
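To illustrate what "SHA-256 hash-chained" means for the Audit Log row: each entry stores the hash of the previous entry, so editing any past entry invalidates every hash after it. The TypeScript sketch below shows the general technique with Node's `crypto` module. The entry layout, the `|` separator, the all-zero genesis value, and the function names are assumptions, not the project's actual format.

```typescript
import { createHash } from "node:crypto";

// Hypothetical entry layout for illustration.
type AuditEntry = { seq: number; event: string; prevHash: string; hash: string };

const GENESIS_HASH = "0".repeat(64); // assumed placeholder for the first entry's predecessor

function hashEntry(seq: number, event: string, prevHash: string): string {
  return createHash("sha256").update(`${seq}|${prevHash}|${event}`).digest("hex");
}

// Append an event, linking it to the hash of the current last entry.
function appendEntry(log: AuditEntry[], event: string): AuditEntry {
  const prevHash = log.length === 0 ? GENESIS_HASH : log[log.length - 1].hash;
  const seq = log.length;
  const entry: AuditEntry = { seq, event, prevHash, hash: hashEntry(seq, event, prevHash) };
  log.push(entry);
  return entry;
}

// Returns the index of the first entry that fails verification, or -1 if the chain is intact.
function verifyChain(log: AuditEntry[]): number {
  let prevHash = GENESIS_HASH;
  for (let i = 0; i < log.length; i++) {
    const e = log[i];
    const linked = e.seq === i && e.prevHash === prevHash;
    if (!linked || e.hash !== hashEntry(e.seq, e.event, e.prevHash)) return i;
    prevHash = e.hash;
  }
  return -1;
}
```

A chain like this detects tampering but does not prevent it: an attacker who can rewrite the whole log can recompute every hash, so the latest hash usually needs to be anchored somewhere the attacker cannot reach.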
Concurrency Model
- Stateless tool execution — no shared in-memory state between requests
- Provider-level parallelism — multiple agents operate simultaneously
- Channel independence — SMS, voice, email, WhatsApp on separate providers
- Database as coordination layer — SQLite transactions for dev, Postgres for production
Scaling path: single-process SQLite -> multi-process Postgres -> horizontal stateless workers.
Configuration Modes
Identity Model
| Mode | Description |
|---|---|
| Dedicated (default) | Each agent gets its own phone number, WhatsApp identity, and email address |
| Shared Pool | Agents share a pool of numbers |
| Hybrid | Shared by default, dedicated as upgrade |
Tenant Isolation
| Mode | Description |
|---|---|
| Single Account + DB Routing | One provider account, isolation in DB |
| Subaccount per Agent | Each agent gets its own provider subaccount |
| Subaccount per Customer | Each tenant gets a subaccount |
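The two tables describe independent choices: which number an agent sends from, and which provider account the traffic is billed and isolated under. One way to picture them is as two small resolver functions. The TypeScript sketch below is illustrative only; the mode strings, the `Agent` shape, `resolveFromNumber`, `resolveAccountKey`, and the deterministic pool pick are all hypothetical and not taken from the project's configuration schema.

```typescript
// Hypothetical mode names mirroring the rows of the two tables.
type IdentityMode = "dedicated" | "shared-pool" | "hybrid";
type IsolationMode = "single-account" | "subaccount-per-agent" | "subaccount-per-customer";

type Agent = { agentId: string; customerId: string; dedicatedNumber?: string };

// Which number does this agent send from?
function resolveFromNumber(mode: IdentityMode, agent: Agent, pool: string[]): string {
  const pooled = (): string => {
    if (pool.length === 0) throw new Error("shared pool is empty");
    // Deterministic pick so an agent keeps the same pooled number between calls.
    let h = 0;
    for (const ch of agent.agentId) h = (h * 31 + ch.charCodeAt(0)) % pool.length;
    return pool[h];
  };
  switch (mode) {
    case "dedicated":
      if (!agent.dedicatedNumber) throw new Error(`agent ${agent.agentId} has no dedicated number`);
      return agent.dedicatedNumber;
    case "shared-pool":
      return pooled();
    case "hybrid":
      // Shared by default, dedicated when the agent has been upgraded.
      return agent.dedicatedNumber ?? pooled();
  }
}

// Which provider (sub)account carries this agent's traffic?
function resolveAccountKey(mode: IsolationMode, agent: Agent): string {
  switch (mode) {
    case "single-account":
      return "main"; // isolation is enforced by DB routing instead
    case "subaccount-per-agent":
      return `agent:${agent.agentId}`;
    case "subaccount-per-customer":
      return `customer:${agent.customerId}`;
  }
}
```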