Voice Calls

How It Works

The MCP server is infrastructure only — it relays text between the caller and your AI agent, but never generates AI responses itself. Your agent is the brain; the server is the telephone.

Live Voice Call Flow

Inbound Call
  -> Twilio webhook hits /webhooks/:agentId/voice
  -> Server returns ConversationRelay TwiML
  -> Twilio opens WebSocket to /webhooks/:agentId/voice-ws
  -> Human speaks -> Twilio STT -> Text
  -> Text sent to your AI agent (via MCP sampling)
  -> Agent responds with text
  -> Text sent to Twilio -> Twilio TTS -> Human hears

Twilio handles STT and TTS. The server only passes text back and forth.
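
The relay step in the flow above can be sketched roughly as follows. This is a minimal illustration, not the server's actual code: the function names and the `ask_agent` callback (a stand-in for MCP sampling) are hypothetical, and the message fields (`type`, `voicePrompt`, `token`, `last`) reflect an understanding of Twilio's ConversationRelay protocol that should be checked against Twilio's documentation.

```python
import json
from typing import Callable, Optional
from xml.sax.saxutils import quoteattr


def conversation_relay_twiml(ws_url: str, greeting: str) -> str:
    """Build TwiML asking Twilio to open a ConversationRelay WebSocket."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<Response><Connect>"
        # quoteattr escapes the value and wraps it in quotes
        f"<ConversationRelay url={quoteattr(ws_url)} "
        f"welcomeGreeting={quoteattr(greeting)} />"
        "</Connect></Response>"
    )


def handle_relay_message(raw: str, ask_agent: Callable[[str], str]) -> Optional[str]:
    """Turn one incoming WebSocket message into an outgoing one.

    Only transcribed speech ("prompt" messages) is forwarded to the agent;
    other message types produce no spoken reply in this sketch.
    """
    msg = json.loads(raw)
    if msg.get("type") != "prompt":
        return None
    # ask_agent is a hypothetical stand-in for the MCP sampling round trip.
    reply = ask_agent(msg.get("voicePrompt", ""))
    # Twilio converts the returned text to speech for the caller.
    return json.dumps({"type": "text", "token": reply, "last": True})
```

Note that the sketch never produces text of its own: it only forwards what Twilio transcribed and returns whatever the agent replied.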

Three Response Paths

| Path | When | What Happens |
|------|------|--------------|
| Agent Sampling | Agent connected via SSE | Caller's speech goes to agent via MCP |
| Answering Machine | Agent not connected, Anthropic key set | Built-in Claude fallback collects message |
| Hard-coded Fallback | Agent not connected, no key | Plays "unavailable" message |
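
The selection among the three paths amounts to a two-step check. The sketch below assumes the decision depends only on whether the agent is connected and whether an Anthropic key is configured; the enum and function names are hypothetical and used for illustration only.

```python
from enum import Enum
from typing import Optional


class ResponsePath(Enum):
    AGENT_SAMPLING = "agent_sampling"
    ANSWERING_MACHINE = "answering_machine"
    HARD_CODED_FALLBACK = "hard_coded_fallback"


def choose_response_path(agent_connected: bool,
                         anthropic_api_key: Optional[str]) -> ResponsePath:
    """Pick a response path for a call (illustrative logic only)."""
    if agent_connected:
        # Caller's speech goes to the agent via MCP.
        return ResponsePath.AGENT_SAMPLING
    if anthropic_api_key:
        # Built-in fallback collects a message for the agent.
        return ResponsePath.ANSWERING_MACHINE
    # No agent and no key: play the "unavailable" message.
    return ResponsePath.HARD_CODED_FALLBACK
```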

Making Outbound Calls

{
  "agentId": "my-agent",
  "to": "+15559876543",
  "greeting": "Hi, this is your AI assistant calling about your appointment.",
  "systemPrompt": "You are a friendly appointment reminder assistant."
}

Once answered, a live two-way conversation begins using the same ConversationRelay flow.
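
A client might assemble and sanity-check these arguments before invoking the tool. The helper below is a hypothetical sketch: it assumes `to` must be an E.164 number (as in the example) and that `greeting` and `systemPrompt` are optional, neither of which is confirmed here.

```python
import re
from typing import Optional

# E.164: a plus sign followed by up to 15 digits, the first one non-zero.
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$")


def build_outbound_call(agent_id: str, to: str,
                        greeting: Optional[str] = None,
                        system_prompt: Optional[str] = None) -> dict:
    """Build the argument object for an outbound call (hypothetical helper)."""
    if not E164_PATTERN.match(to):
        raise ValueError(f"not an E.164 phone number: {to!r}")
    payload = {"agentId": agent_id, "to": to}
    # Assumption: omitted fields fall back to the server's defaults.
    if greeting is not None:
        payload["greeting"] = greeting
    if system_prompt is not None:
        payload["systemPrompt"] = system_prompt
    return payload
```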


Answering Machine

When the AI agent is not connected (8-second timeout), the built-in answering machine:

  1. Apologizes to the caller on behalf of the agent
  2. Asks for a message and any preferences (e.g., "call me back after 8 AM")
  3. Stores everything in the dead letter queue

When the agent reconnects, dead letters are automatically dispatched via comms_get_waiting_messages.

ANTHROPIC_API_KEY=sk-ant-...    # Required for answering machine

Without the key, callers hear a hard-coded "unavailable" message.
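
The timeout-then-store behaviour can be sketched as below. This assumes the server waits up to 8 seconds for the agent before falling back, and it uses an in-memory list as the dead letter queue; the class and function names are hypothetical, and the real storage and dispatch mechanism may differ.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, List, Optional

AGENT_TIMEOUT_SECONDS = 8.0  # the 8-second timeout mentioned above


@dataclass
class DeadLetter:
    caller: str
    message: str
    preferences: str = ""


@dataclass
class DeadLetterQueue:
    """In-memory stand-in for the server's dead letter queue."""
    _items: List[DeadLetter] = field(default_factory=list)

    def store(self, letter: DeadLetter) -> None:
        self._items.append(letter)

    def drain(self) -> List[DeadLetter]:
        # Hand over everything that was waiting and empty the queue,
        # as would happen when the agent reconnects.
        items, self._items = self._items, []
        return items


async def ask_agent_or_none(ask_agent: Callable[[str], Awaitable[str]],
                            text: str,
                            timeout: float = AGENT_TIMEOUT_SECONDS) -> Optional[str]:
    """Return the agent's reply, or None if it does not answer in time."""
    try:
        return await asyncio.wait_for(ask_agent(text), timeout)
    except asyncio.TimeoutError:
        return None
```

A caller of `ask_agent_or_none` that receives `None` would switch to the answering machine and `store` the collected message for later delivery.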


Voice Messages (TTS)

Send a pre-recorded message (not a live conversation):

{
  "agentId": "my-agent",
  "to": "+15559876543",
  "text": "Reminder: your appointment is tomorrow at 3 PM."
}

The text is converted to TTS audio and delivered as a phone call.


Call Transfer

{
  "agentId": "my-agent",
  "callSid": "CAxxxxxxxx",
  "to": "+15551234567",
  "announcementText": "Connecting you to a human agent now."
}


Voice Configuration

DEFAULT_VOICE_GREETING="Hello, how can I help you today?"
DEFAULT_VOICE_ID=EXAVITQu4vr4xnSDxMaL
DEFAULT_VOICE_LANGUAGE=en-US

All settings can be overridden per call via tool parameters.
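
One way to picture the override behaviour is a lookup that prefers per-call values and falls back to the environment. In this sketch the parameter names `voiceId` and `language` are assumptions (only `greeting` appears in the examples on this page), and the function itself is hypothetical.

```python
import os
from typing import Dict, Mapping, Optional

# Per-call parameter name -> environment variable holding the default.
# "voiceId" and "language" are assumed names, not confirmed by this page.
ENV_DEFAULTS = {
    "greeting": "DEFAULT_VOICE_GREETING",
    "voiceId": "DEFAULT_VOICE_ID",
    "language": "DEFAULT_VOICE_LANGUAGE",
}


def resolve_voice_settings(overrides: Optional[Mapping[str, str]] = None,
                           env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Merge per-call overrides over environment defaults."""
    overrides = overrides or {}
    settings: Dict[str, str] = {}
    for name, env_var in ENV_DEFAULTS.items():
        # A per-call value wins; otherwise use the configured default.
        value = overrides.get(name) or env.get(env_var)
        if value is not None:
            settings[name] = value
    return settings
```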

