
High quality Speech Recognition is now available
We are happy to announce high-quality speech recognition for both transcription of recorded audio calls and real-time recognition scenarios.

The new version of Web SDK will help us to accelerate the development process and includes a lot of new features and improvements.

Developers can now use Promises in their VoxEngine scenarios, and we have also added the Net.httpRequestAsync and Net.sendMailAsync functions.
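
As a rough illustration of the Promise-based style, the sketch below awaits an HTTP request and parses the response. It assumes that Net.httpRequestAsync resolves to an object with `code` and `text` fields. The `Net` object defined here is only a hypothetical stand-in so the snippet runs outside the platform, and `fetchJson` and the example URL are names invented for this sketch.

```javascript
// Hypothetical stand-in for VoxEngine's Net namespace so the sketch can run
// outside the platform. Inside a scenario the built-in Net would be used.
const Net = {
  httpRequestAsync(url) {
    // Assumed result shape: { code, text }.
    return Promise.resolve({
      code: 200,
      text: JSON.stringify({ url, greeting: "hello" }),
    });
  },
};

// Fetch a JSON document and parse it, throwing on a non-200 status code.
async function fetchJson(url) {
  const result = await Net.httpRequestAsync(url);
  if (result.code !== 200) {
    throw new Error(`HTTP request failed with code ${result.code}`);
  }
  return JSON.parse(result.text);
}

// Usage: errors surface through the usual Promise rejection path.
fetchJson("https://example.com/config.json")
  .then((data) => console.log(data.greeting))
  .catch((err) => console.log(err.message));
```

Compared with callback-based requests, this keeps the error handling in one place and lets several requests be chained or awaited in sequence.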

MP3 or Ogg files played at the VoxEngine scenario level with call.startPlayback or with Player will be played in HD quality (48 kHz) on the Web or Mobile SDK side, or on the SIP side if the endpoint supports wideband audio codecs (Speex or Opus).

In HD mode, audio is mixed at 48 kHz; all audio sources with a lower sample rate are resampled to 48 kHz.

We chose 48 kHz as the base sample rate for the HD audio recorder, since WebRTC/Opus can offer this quality. Audio from endpoints with a lower sample rate will be resampled.
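
To make the idea of resampling concrete, here is a minimal sketch that upsamples mono PCM samples to 48 kHz by linear interpolation. It is an illustration only and not a description of how the platform resamples audio. Production resamplers normally add low-pass filtering, and `resampleLinear` is a name made up for this example.

```javascript
// Resample mono PCM samples from one rate to another by linear interpolation.
function resampleLinear(samples, fromRate, toRate) {
  if (samples.length === 0) return new Float32Array(0);
  const outLength = Math.round((samples.length * toRate) / fromRate);
  const out = new Float32Array(outLength);
  const step = fromRate / toRate; // input samples advanced per output sample
  const last = samples.length - 1;
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.min(Math.floor(pos), last);
    const right = Math.min(left + 1, last);
    const frac = pos - left;
    // Blend the two neighbouring input samples.
    out[i] = samples[left] * (1 - frac) + samples[right] * frac;
  }
  return out;
}

// 10 ms of 16 kHz audio (160 samples) becomes 10 ms at 48 kHz (480 samples).
const narrow = new Float32Array(160).fill(0.5);
const wide = resampleLinear(narrow, 16000, 48000);
console.log(wide.length);
```

Resampling every source to a single base rate before mixing means the mixer only ever adds sample streams that share the same time grid.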

Full Featured Instant Messaging

If a call is made in non-P2P mode then its media stream goes via our media servers and we can record it if required.

We started with audio, then added video calls, and now it's time to let our developers use instant messaging and presence, two very important features of the UC stack.

The new version of our Mobile SDK uses the WebRTC engine for audio/video processing and supports all features available in the Web SDK.

Victor Pascual from Quobis invited us to participate in a WebRTC meetup that took place on March 4th in Barcelona. We accepted the invitation, and I'm really happy that we did.

There is now a way to restrict access to the VoxImplant HTTP API so that it is allowed only from certain IP addresses or networks when api_key is used.
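
For readers unfamiliar with address and network restrictions, the sketch below shows how an allow list of IPv4 addresses and CIDR networks is typically evaluated. It is a generic illustration under the assumption of IPv4 entries in CIDR notation, not the platform's actual implementation. `ipv4ToInt` and `isAllowed` are hypothetical helper names, and the addresses come from the reserved documentation ranges.

```javascript
// Convert dotted-quad IPv4 text to an unsigned 32-bit integer.
function ipv4ToInt(ip) {
  const parts = ip.split(".");
  const valid =
    parts.length === 4 &&
    parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  if (!valid) throw new Error(`Invalid IPv4 address: ${ip}`);
  const [a, b, c, d] = parts.map(Number);
  return ((a << 24) | (b << 16) | (c << 8) | d) >>> 0;
}

// True when `ip` equals an allowed address or falls inside an allowed network.
function isAllowed(ip, allowList) {
  const addr = ipv4ToInt(ip);
  return allowList.some((entry) => {
    const [base, bitsText] = entry.split("/");
    // An entry without a prefix length is treated as a single address (/32).
    const bits = bitsText === undefined ? 32 : Number(bitsText);
    const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
    return ((addr & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
  });
}

// Example allow list: one network and one single address.
const allowList = ["192.0.2.0/24", "203.0.113.7"];
console.log(isAllowed("192.0.2.55", allowList));
console.log(isAllowed("198.51.100.1", allowList));
```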

New integrations for Voice AI have arrived: Google's Gemini 2.0 Flash model, featuring seamless voice-to-voice conversation capabilities, and ElevenLabs' low-latency streaming speech synthesis are now available for Voximplant developers.

Today Ultravox announced they are directly integrating Voximplant into their platform to provide SIP capabilities. The integration builds on Voximplant’s deep telephony and Voice AI tooling.

Voximplant has added a WebSocket privacy option that redacts message payloads from logs across all WebSocket-based services (Voice AI connectors and external speech systems) and speech control modules.

Check out the latest useful Voximplant Kit updates — we developed chat analytics, improved call history, added new tools for supervisors, expanded scenario capabilities, and updated the softphone. Below is a brief overview of the essential enhancements.

Voximplant now includes a native Deepgram module that connects any Voximplant call to Deepgram’s Voice Agent API for real-time, speech‑to‑speech conversations. You can stream audio from phone numbers, SIP trunks, WhatsApp, or WebRTC into Deepgram’s unified agent environment—combining STT, LLM reasoning, and TTS—and play responses via Voximplant’s serverless runtime with minimal latency.

Voximplant has new realtime speech generation for voice AI from Inworld, our latest Voice AI text-to-speech (TTS) partner. Together, we combine state-of-the-art TTS with carrier-grade connectivity so you can build voice agents that sound like your brand, not a generic robot.

Voximplant now includes a native Cartesia Line / Agents connector that connects any Voximplant call to a Cartesia Line voice agent for real-time, speech-to-speech conversations—over PSTN, SIP, WebRTC, or WhatsApp Business Calling—without building custom media gateways or WebSocket streaming infrastructure.

Voximplant now includes a native Grok module that connects any Voximplant call to xAI’s Grok Voice Agent API for real-time, speech-to-speech conversations. With a single VoxEngine scenario, you can stream audio from phone numbers, SIP trunks and infrastructure, WhatsApp Business, or WebRTC into Grok, all without building custom media gateways or WebSocket streaming infrastructure.