SVC—001 ● LIVE VOICE AI JAN–MAR 2025

WebRTC Voice Agent

GPT-4o Realtime

An enterprise-grade phone agent — not a chatbot. Handles real-time voice conversations over WebRTC, maintains contextual memory across the call, and syncs every interaction bidirectionally into HubSpot CRM. Built for production traffic with sub-800ms end-to-end latency from speech input to voiced response.

95% ACCURACY
<800ms LATENCY
+45% ENGAGEMENT
GPT-4o Realtime · WebRTC · Twilio · ElevenLabs · HubSpot API · AWS Lambda · Redis
// UI PREVIEW — LIVE CALL OPERATIONS DASHBOARD
VOICE AGENT SYSTEM: Live Call Dashboard
Call in progress
CALL STATUS
STATUS: ● ACTIVE
DURATION: 04:32
CHANNEL: WebRTC / Twilio
LATENCY: 612ms
MODEL: gpt-4o-realtime
TTS: ElevenLabs
MEMORY: Loaded (7 turns)
INTENT: product_inquiry
LIVE TRANSCRIPT
AGENT: Hi there, this is Alex from the support team. How can I help you today?
CALLER: Yes, I wanted to ask about the enterprise pricing for your platform.
AGENT: Absolutely. Based on your company size I can walk you through our Team and Enterprise tiers. Would you prefer a demo first?
CALLER: A demo would be great, yes. Next week works.
AGENT: Perfect, I'm scheduling that now. You'll get a confirmation at the email on file.
CRM SYNC LOG
04:12 Contact record updated, intent: enterprise_inquiry
04:18 Lead score raised: 42 → 87
04:29 Meeting scheduled: Demo, Next Tue 3:00 PM
04:30 Webhook fired → HubSpot deal pipeline stage: Qualified
04:31 Email confirmation queued via n8n workflow

The Pipeline

CALLER → Twilio WebRTC → FastAPI → GPT-4o Realtime → ElevenLabs TTS → HubSpot CRM
STEP 01
WebRTC Connect
Caller dials in via Twilio. WebRTC stream established to FastAPI backend over secure WSS channel.
STEP 02
Audio Processing
Raw audio chunks streamed in real-time to GPT-4o Realtime API. No intermediate STT step — native voice understanding.
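A minimal sketch of the chunked streaming in this step, assuming the Realtime API's `input_audio_buffer.append` client event with a base64 `audio` field. The function names and the `chunk_size` default are hypothetical and not taken from the project.

```python
import base64
import json
from typing import Iterator


def chunk_audio(stream: bytes, chunk_size: int = 3200) -> Iterator[bytes]:
    """Split a raw audio buffer into fixed-size chunks (chunk_size is illustrative)."""
    for start in range(0, len(stream), chunk_size):
        yield stream[start:start + chunk_size]


def audio_append_event(chunk: bytes) -> str:
    """Wrap one raw audio chunk as an input_audio_buffer.append event (JSON text)."""
    return json.dumps({
        "type": "input_audio_buffer.append",
        "audio": base64.b64encode(chunk).decode("ascii"),
    })


# Usage: each event string would be sent over the open Realtime connection.
events = [audio_append_event(c) for c in chunk_audio(b"\x00" * 7000)]
```

Sending audio as it arrives, rather than waiting for the end of the utterance, is what lets the model start working before the caller finishes speaking.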
STEP 03
Context & Memory
Redis session store injects prior conversation history. Agent maintains context across the full call duration.
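One way the session store in this step could look, as a sketch only. `SessionMemory`, the `call:{id}:turns` key layout, and the turn and TTL limits are assumptions. The client is expected to expose the redis-py list calls `rpush`, `ltrim`, `lrange`, and `expire`; a small in-memory stand-in is included so the example runs without a Redis server.

```python
import json


def _slice(items, start, end):
    """Mimic Redis list indexing: inclusive end, negative indices from the tail."""
    n = len(items)
    if start < 0:
        start = max(n + start, 0)
    if end < 0:
        end = n + end
    return items[start:end + 1]


class InMemoryRedis:
    """Stand-in for a redis-py client; only the four calls used below."""

    def __init__(self):
        self._lists = {}

    def rpush(self, key, *values):
        self._lists.setdefault(key, []).extend(values)
        return len(self._lists[key])

    def ltrim(self, key, start, end):
        self._lists[key] = _slice(self._lists.get(key, []), start, end)
        return True

    def lrange(self, key, start, end):
        return _slice(self._lists.get(key, []), start, end)

    def expire(self, key, seconds):
        return True  # TTL is not simulated in this stand-in


class SessionMemory:
    """Per-call conversation history, capped at max_turns entries."""

    def __init__(self, client, max_turns=20, ttl_seconds=3600):
        self.client = client
        self.max_turns = max_turns
        self.ttl_seconds = ttl_seconds

    def _key(self, call_id):
        return f"call:{call_id}:turns"  # hypothetical key layout

    def append(self, call_id, role, text):
        key = self._key(call_id)
        self.client.rpush(key, json.dumps({"role": role, "text": text}))
        self.client.ltrim(key, -self.max_turns, -1)  # keep only the newest turns
        self.client.expire(key, self.ttl_seconds)    # let abandoned calls age out

    def history(self, call_id):
        return [json.loads(raw) for raw in self.client.lrange(self._key(call_id), 0, -1)]
```

With a real server, passing `redis.Redis(...)` in place of `InMemoryRedis()` should work unchanged, since `json.loads` accepts the bytes that redis-py returns by default.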
STEP 04
TTS Response
GPT-4o response streamed to ElevenLabs for low-latency neural TTS synthesis. Audio piped back to caller in under 800ms.
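The page does not say how the model output is handed to the TTS service. A common approach, sketched here as an assumption, is to buffer streamed text deltas and forward each complete sentence as soon as it closes, so synthesis can begin before the full reply exists. `sentences_from_deltas` is a hypothetical helper.

```python
import re
from typing import Iterable, Iterator

# Split on whitespace that follows sentence-ending punctuation.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_deltas(deltas: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences from a stream of partial text fragments."""
    buffer = ""
    for delta in deltas:
        buffer += delta
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last is a finished sentence.
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        buffer = parts[-1]
    # Flush whatever remains when the stream ends.
    if buffer.strip():
        yield buffer.strip()


# Usage: each yielded sentence would be sent to the TTS stream in order.
for sentence in sentences_from_deltas(["Perfect. I'm sched", "uling that now."]):
    pass
```

Sentence-level chunks are a trade-off: smaller chunks reduce time to first audio, while larger ones tend to give the synthesizer more context for natural prosody.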
STEP 05
CRM Sync
Post-call: intent, lead score, scheduled meetings, and full transcript written to HubSpot via webhook.
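A sketch of the kind of contact update a post-call sync might issue, assuming HubSpot's CRM v3 contacts endpoint and a private-app bearer token. The property names `last_call_intent` and `call_lead_score` are hypothetical custom properties, and the request is only built here, never sent.

```python
import json
import urllib.request

HUBSPOT_BASE = "https://api.hubapi.com"


def build_contact_update(contact_id: str, token: str, outcome: dict) -> urllib.request.Request:
    """Build (but do not send) a PATCH request carrying post-call outcomes."""
    properties = {
        "last_call_intent": outcome["intent"],          # hypothetical custom property
        "call_lead_score": str(outcome["lead_score"]),  # hypothetical custom property
    }
    body = json.dumps({"properties": properties}).encode("utf-8")
    return urllib.request.Request(
        url=f"{HUBSPOT_BASE}/crm/v3/objects/contacts/{contact_id}",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Usage: values mirror the dashboard preview; IDs and token are placeholders.
request = build_contact_update(
    "123", "example-token", {"intent": "enterprise_inquiry", "lead_score": 87}
)
```

Keeping request construction separate from sending makes the payload easy to inspect in tests and to retry from a queue.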
03 // STACK

Built with

VOICE MODEL
GPT-4o Realtime
Native real-time audio API. Understands speech directly without intermediate STT, enabling natural voice-to-voice flow.
TRANSPORT
WebRTC + Twilio
WebRTC handles peer-to-peer audio streaming. Twilio bridges PSTN callers to the WebRTC infrastructure.
TEXT-TO-SPEECH
ElevenLabs
Neural TTS with custom voice cloning. Streaming output to keep end-to-end latency under 800ms.
MEMORY
Redis
Session-scoped conversation memory. Persists context between turns so the agent never loses track of the call.
CRM
HubSpot API
Bidirectional sync — reads prior contact history before the call, writes outcomes, scores, and meetings after.
INFRASTRUCTURE
AWS Lambda
Serverless backend for webhook handling and async CRM writes. Auto-scales under production traffic spikes.
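A minimal sketch of a webhook handler in the Lambda proxy style, under stated assumptions: the event carries a JSON `body` (optionally base64-encoded), the payload has a `call_id` field, and `enqueue` is a hypothetical hook standing in for whatever queue performs the async CRM write.

```python
import base64
import json


def parse_body(event: dict):
    """Decode the request body from an API Gateway style proxy event."""
    raw = event.get("body") or "{}"
    if event.get("isBase64Encoded"):
        raw = base64.b64decode(raw).decode("utf-8")
    return json.loads(raw)


def handler(event, context, enqueue=None):
    """Validate the webhook payload, hand it off, and acknowledge quickly."""
    try:
        payload = parse_body(event)
    except ValueError:  # covers bad JSON, bad base64, and bad UTF-8
        return {"statusCode": 400, "body": json.dumps({"error": "invalid body"})}

    if not isinstance(payload, dict) or "call_id" not in payload:
        return {"statusCode": 400, "body": json.dumps({"error": "call_id required"})}

    if enqueue is not None:
        enqueue(payload)  # hypothetical hand-off; the CRM write happens elsewhere

    # 202 signals that the work was accepted but is not finished yet.
    return {"statusCode": 202, "body": json.dumps({"accepted": payload["call_id"]})}
```

Returning as soon as the payload is queued keeps the webhook response fast and leaves slow CRM calls to a separate worker.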