New · Voxa Realtime v2: sub-300ms response, every call.

Voice AI agents that actually answer.

Voxa is the production-grade platform for building, deploying, and scaling voice AI agents. Connect any LLM, choose any voice, plug into your phone system — ship in days, not quarters.

Start free → · ▶ Listen to a live agent

All systems operational · SOC 2 · HIPAA · GDPR · Free tier, no credit card

Latency: 218 ms

Intent accuracy: 98.6%

Aria · Sales Qualifier

Inbound · English (US) · GPT-4o

Live · 00:24

Caller

Hi, I'm calling about your enterprise plan. Do you have a minute?

Aria

Of course — happy to help. Could I get your name and the size of your team so I can route you correctly?

Caller

Sure, it's Maya from Northwind. We're around 240 people.

The platform

Everything you need to ship a production voice agent.

Voxa is a complete stack — speech, models, telephony, observability, and orchestration — wired together so you focus on the conversation, not the plumbing.

Realtime, end-to-end.

A streaming pipeline that runs ASR, LLM, and TTS in parallel — interruptions, barge-in, and natural turn-taking handled at the framework level. No bespoke audio engineering required.

turn-detection · barge-in · streaming-tts · vad · endpointing

Model agnostic.

Swap GPT-4o, Claude, Gemini, Llama, or your fine-tune in one line. Bring-your-own-key supported on every tier.

Telephony, built in.

Inbound and outbound calls over Twilio, Vonage, Telnyx, or SIP. Global numbers, transfers, and DTMF handled out of the box.

Function calling that works.

Hit any HTTP endpoint mid-conversation. Type-safe schemas, automatic retries, structured arguments — without prompt-engineering gymnastics.
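As an illustration of what schema-checked tool arguments can look like, the sketch below validates model-produced arguments before any endpoint is called. The `ToolSchema` shape and the `validateArgs` helper are hypothetical and are not taken from the Voxa SDK; only the tool name `crm.lookup` appears in the sample on this page, and its fields here are assumed.

```typescript
// Hypothetical schema shape for illustration; not the Voxa SDK's actual types.
type FieldType = "string" | "number" | "boolean";

interface ToolSchema {
  name: string;
  fields: Record<string, FieldType>;
}

type ToolArgs = Record<string, string | number | boolean>;

type ValidationResult =
  | { ok: true; args: ToolArgs }
  | { ok: false; errors: string[] };

// Check arguments produced by the model against the schema
// before the HTTP endpoint behind the tool is ever called.
function validateArgs(schema: ToolSchema, raw: unknown): ValidationResult {
  if (typeof raw !== "object" || raw === null) {
    return { ok: false, errors: ["arguments must be an object"] };
  }
  const input = raw as Record<string, unknown>;
  const errors: string[] = [];
  const args: ToolArgs = {};
  for (const field of Object.keys(schema.fields)) {
    const expected = schema.fields[field];
    const value = input[field];
    if (typeof value !== expected) {
      errors.push(`${field}: expected ${expected}, got ${typeof value}`);
    } else {
      args[field] = value as string | number | boolean;
    }
  }
  return errors.length > 0 ? { ok: false, errors } : { ok: true, args };
}

// Assumed fields for the crm.lookup tool named in the SDK sample.
const crmLookup: ToolSchema = {
  name: "crm.lookup",
  fields: { company: "string", headcount: "number" },
};

const checked = validateArgs(crmLookup, { company: "Northwind", headcount: 240 });
```

Passing `headcount: "240"` as a string instead of a number returns `ok: false` with a single error, so a malformed call can be rejected or retried before it reaches the CRM.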

Observability by default.

Every call recorded, transcribed, scored, and replayable. Funnel breakdowns, sentiment, drop-off — drill from a metric to the audio in two clicks.

Compliance and routing, global.

HIPAA, SOC 2, GDPR, and PCI workflows ship as policies. Pin data residency to a region; audit every transcript with cryptographic provenance.

HIPAA · SOC 2 Type II · GDPR · PCI-DSS · EU residency · BAA

How it works

From idea to live phone number — in four steps.

The path from prompt to production is short. No glue code, no infra, no orchestration to write yourself.

Configure

Write the prompt. Pick a voice and a model. Define the tools your agent can call.

~5 min

Connect

Attach a phone number, embed the web SDK, or wire it to your CRM via webhook.

~10 min

Test

Run scripted scenarios, score every turn, and iterate against a regression suite.

browser-based

Scale

Ship to thousands of concurrent calls. We handle scaling, failover, and routing.

multi-region

Developer first

A clean SDK.
Real types.
Boring infrastructure.

Voxa exposes a small, opinionated surface — Agents, Tools, and Calls. Everything else (audio buffers, codecs, RTP, retries, jitter) is handled for you.

→ Typed SDKs for TypeScript, Python, and Go
→ First-class WebRTC, WebSocket, and SIP transports
→ Local dev with hot-reloading prompts
→ Webhooks for every lifecycle event

Read the docs →

agent.ts · place an outbound call in 10 lines

import { Voxa } from "@voxa/sdk";
const voxa = new Voxa({ apiKey: process.env.VOXA_KEY });
const call = await voxa.calls.create({
  to: "+14155551212",
  agent: "aria-sales-qualifier",
  voice: "luna-natural",
  model: "gpt-4o-realtime",
  tools: ["crm.lookup", "calendar.book"],
  maxDurationSec: 600,
});
// → call.id, call.status, call.transcriptUrl

Use cases

Built for the conversations that move your business.

Teams across sales, support, healthcare, and operations are running Voxa agents in production today.

Sales

Inbound qualification

Answer every inbound lead in under a second. Qualify, score, and route to the right rep — with the full transcript already in your CRM.

Avg pickup: 0.6s

Support

Tier-1 deflection

Resolve account, billing, and how-to questions without a human. Hand off cleanly with full context when escalation is needed.

Resolution rate: 71%

Operations

Outbound scheduling

Book, confirm, or reschedule appointments at scale. Plays nicely with Google Calendar, Outlook, and any booking API.

Calls / hour: 2,400+

Healthcare

Patient intake

HIPAA-compliant intake calls that capture insurance, history, and chief complaint — written straight into your EHR.

Form completion: 94%

Recruiting

Phone screens

First-round screens that go beyond keyword matching. Structured rubric, calibrated scoring, recording for the hiring manager.

Time-to-screen: −83%

Field

Surveys & research

Adaptive interviews — branching follow-ups, sentiment-aware probing — that get past the surface answer.

Completion lift: +38%

The voice library

120+ voices. Every accent. Every register.

Hand-curated and continuously refreshed. Clone your own, or pick from the studio. All voices stream at <100ms first-byte.

browse all 120+ voices in studio

Integrations

Plug into your stack — or replace it.

Voxa is a hub. Bring your own model, your own carrier, your own CRM. Or use the defaults and ship today.

All integrations →

Models: OpenAI, Anthropic, Gemini, Llama

TTS: ElevenLabs, Cartesia

ASR: Deepgram, Whisper

Telephony: Twilio, Vonage, Telnyx, SIP

CRM: Salesforce, HubSpot

Data: Snowflake

Events: Zapier

In production

Teams ship voice on Voxa in days.

We replaced a six-month roadmap with a Voxa agent that shipped in nine days. Pickup time on inbound went from a minute to under a second.

Maya Anand

VP Engineering · Northwind

Latency was the one number that mattered. Voxa is the first platform that didn't make us choose between speed and quality of the conversation.

Jonas Reyes

Head of Product · Helios

The observability stack alone is worth the migration. We can replay any call, score it against our rubric, and ship a fix the same afternoon.

Priya Trivedi

Director of CX · Octane

Pricing

Pay only for the minutes you talk.

Transparent pricing. No seat licenses. No proof-of-concept fees. Bring your own keys to drop costs further.

Starter

For prototypes and low-volume internal tools.

$0 / mo + usage

Start free

500 minutes / month included
$0.07 / minute after
10 concurrent calls
Community Discord
Hosted dashboard + transcripts
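
To make the Starter arithmetic concrete, here is a small sketch of the overage calculation implied by the numbers above (500 included minutes, $0.07 per minute after). It assumes usage is billed in whole minutes and ignores taxes and bring-your-own-key usage; the function name is illustrative, and amounts are kept in cents to avoid floating-point rounding.

```typescript
// Figures taken from the Starter plan listed above.
const INCLUDED_MINUTES = 500;
const OVERAGE_CENTS_PER_MINUTE = 7; // $0.07

// Assumption: whole-minute billing, with overage charged only beyond the included minutes.
function starterMonthlyCostCents(minutesUsed: number): number {
  const overageMinutes = Math.max(0, minutesUsed - INCLUDED_MINUTES);
  return overageMinutes * OVERAGE_CENTS_PER_MINUTE;
}

// 2,000 minutes in a month: 1,500 overage minutes at 7 cents each.
const exampleCents = starterMonthlyCostCents(2000); // 10500 cents, i.e. $105.00
```

Under these assumptions, a month at or below 500 minutes costs $0, and 2,000 minutes costs $105.00.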

Answers to what you're about to ask.

How fast is "fast" in real terms?

Median voice-to-voice latency is 218ms — measured from the end of the caller's speech to the first audible byte from the agent. p99 is under 480ms. We publish live latency at status.voxa.dev.
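
For readers who want to reproduce median and p99 figures from their own call logs, the sketch below uses the nearest-rank percentile method. How Voxa computes its published numbers is not stated here, so treat the method as an assumption; the sample values are made up for illustration and are not measured data.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("no samples");
  }
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

// Illustrative voice-to-voice latencies in milliseconds (not real measurements).
const latenciesMs = [230, 200, 470, 218, 210];

const medianMs = percentile(latenciesMs, 50); // 218
const p99Ms = percentile(latenciesMs, 99);    // 470
```

With only a handful of samples, p99 is simply the slowest call, so a meaningful tail figure needs a much larger sample.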

Can I bring my own model?

Yes — connect any OpenAI-compatible endpoint, including self-hosted Llama and fine-tunes. Bring-your-own-key is supported on every tier; usage doesn't count against your Voxa minutes.

How does Voxa handle interruptions?

A streaming VAD detects barge-in within ~80ms and the TTS pipeline aborts cleanly. The agent rolls back its turn, listens, and resumes — without the awkward overtalk you hear on other platforms.
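
The barge-in behaviour described above can be sketched as a small turn controller. This is a simplified model, not Voxa's implementation: the `TurnController` class and its method names are hypothetical, and VAD and audio playback are stood in for by plain method calls.

```typescript
// Hypothetical turn controller: models speaking vs. listening and dropping queued TTS on barge-in.
class TurnController {
  state: "listening" | "speaking" = "listening";
  private queue: string[] = [];  // TTS chunks not yet played
  private played: string[] = []; // chunks the caller actually heard

  // The agent begins a turn with a list of TTS chunks to stream.
  startAgentTurn(chunks: string[]): void {
    this.state = "speaking";
    this.queue = chunks.slice();
    this.played = [];
  }

  // Play the next chunk; returns undefined when not speaking or when the turn is done.
  playNext(): string | undefined {
    if (this.state !== "speaking") {
      return undefined;
    }
    const chunk = this.queue.shift();
    if (chunk === undefined) {
      this.state = "listening";
      return undefined;
    }
    this.played.push(chunk);
    return chunk;
  }

  // Called when VAD detects caller speech: abort remaining TTS and go back to listening.
  onCallerSpeech(): { heard: string[]; dropped: string[] } {
    const result = { heard: this.played.slice(), dropped: this.queue.slice() };
    this.queue = [];
    this.state = "listening";
    return result;
  }
}

const turn = new TurnController();
turn.startAgentTurn(["Of course.", "Could I get your name", "and the size of your team?"]);
turn.playNext();                         // caller hears the first chunk
const interrupted = turn.onCallerSpeech(); // caller barges in
```

Tracking what was actually heard, separately from what was queued, is what lets an agent roll back its turn: only the `heard` chunks belong in the conversation history the model sees next.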

Is my data used to train models?

No. Customer data is never used to train base or fine-tuned models. You can opt into your own private fine-tuning pipeline; nothing leaves your tenant by default.

What about HIPAA and PCI?

Voxa offers a HIPAA-compliant deployment with a signed BAA on Growth and Enterprise. PCI-DSS scope is reduced via tokenized DTMF capture and pause-and-resume recording.

Can I export call data?

Every transcript, recording, and event stream is available via API and S3 export. Native sinks for Snowflake, BigQuery, and Datadog are included.

How long until I'm in production?

Most teams are live with their first agent in under a week. Enterprise customers with custom telephony or compliance requirements typically take 2–3 weeks with our solutions team.