Amazon Introduces Nova Sonic: A Powerful New Voice AI Model

Tech Trove
4d

Amazon announced Amazon Nova Sonic, a new foundation model that unifies speech understanding and speech generation into a single model to enable more humanlike voice conversations in AI applications. Available via a new API in Amazon Bedrock, the model simplifies the development of voice applications, such as customer service call automation and AI agents, across a broad range of industries, including travel, education, health care, entertainment, and more.

What Makes Nova Sonic Special?

All-in-One Model: Combines speech-to-text, language understanding, and text-to-speech in a single system.

Faster & Easier Development: Developers no longer need to stitch together separate speech-to-text, language, and text-to-speech tools, so voice apps are quicker and simpler to build.

Human-like Conversations: Adapts its spoken replies to how the user talks, adjusting tone, speed, and emotion for more natural interaction.

US and UK English Support: Works with American and British English, covering various accents and speaking styles.

Noise-Friendly: Performs well even in noisy environments.

More Languages Coming Soon: Support for additional languages is on the way.

Real-Time Voice with Smarts

In a live demo, Nova Sonic was used in a customer support call. It not only understood the customer's request to change a phone plan but also pulled data from other systems and gave a clear, helpful response, all in real time. The AI also tracked the conversation’s sentiment and showed live insights to help a support agent assist the customer.

It also handled interruptions smoothly, pausing when the customer spoke over it and then continuing naturally, just like a real person would.

Easy to Start Using

Developers can enable Nova Sonic from the Amazon Bedrock console. It uses a bidirectional streaming API, which allows apps to send and receive audio at the same time. The model ID is amazon.nova-sonic-v1:0.
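
To make "send and receive audio at the same time" concrete, here is a minimal sketch of the bidirectional pattern using Python's asyncio. It does not call Amazon Bedrock: `fake_model`, `send_audio`, `receive_audio`, and `run_session` are hypothetical names, and the in-memory queues stand in for the real network stream. Only the model ID comes from the article.

```python
import asyncio

MODEL_ID = "amazon.nova-sonic-v1:0"  # model ID quoted in the article


async def fake_model(inbound: asyncio.Queue, outbound: asyncio.Queue) -> None:
    """Hypothetical stand-in for the service: one reply chunk per input chunk."""
    while True:
        chunk = await inbound.get()
        if chunk is None:  # sentinel: the caller has finished sending
            await outbound.put(None)
            return
        await outbound.put(b"reply:" + chunk)


async def send_audio(chunks, inbound: asyncio.Queue) -> None:
    """Push microphone chunks upstream without waiting for replies."""
    for chunk in chunks:
        await inbound.put(chunk)
        await asyncio.sleep(0)  # yield so the receiver can run in between
    await inbound.put(None)


async def receive_audio(outbound: asyncio.Queue) -> list:
    """Collect reply chunks as they arrive, independently of sending."""
    received = []
    while True:
        chunk = await outbound.get()
        if chunk is None:
            return received
        received.append(chunk)


async def run_session(chunks) -> list:
    """Run the sender and receiver concurrently over one simulated stream."""
    inbound, outbound = asyncio.Queue(), asyncio.Queue()
    model = asyncio.create_task(fake_model(inbound, outbound))
    _, received = await asyncio.gather(
        send_audio(chunks, inbound), receive_audio(outbound)
    )
    await model
    return received


if __name__ == "__main__":
    print(MODEL_ID, asyncio.run(run_session([b"chunk-1", b"chunk-2"])))
```

The point of the design is that sending and receiving are separate tasks, so a reply can start arriving while the user is still speaking. A real client would swap the queues for the Bedrock stream.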

The system is event-driven, with inputs like voice streaming and tool results and outputs like live transcriptions and spoken replies.
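
An event-driven client can be sketched roughly as follows. The JSON envelope and the event names (`audioInput`, `toolResult`, `textOutput`, `audioOutput`) are placeholders chosen for illustration, not a schema confirmed by the article, and `make_event` and `VoiceSession` are hypothetical helpers. Check the Amazon Bedrock documentation for the actual event format.

```python
import json


def make_event(kind: str, **payload) -> str:
    """Serialize one event as JSON. The envelope shape is an assumption."""
    return json.dumps({"event": {kind: payload}})


class VoiceSession:
    """Routes incoming output events to handlers (hypothetical event names)."""

    def __init__(self) -> None:
        self.transcript: list[str] = []    # live transcription text
        self.audio_chunks: list[str] = []  # encoded spoken-reply chunks
        self.unhandled: list[str] = []     # event types we did not expect
        self._handlers = {
            "textOutput": self._on_text,
            "audioOutput": self._on_audio,
        }

    def _on_text(self, payload: dict) -> None:
        self.transcript.append(payload["content"])

    def _on_audio(self, payload: dict) -> None:
        self.audio_chunks.append(payload["content"])

    def handle(self, raw_event: str) -> None:
        """Decode one event and dispatch it by type."""
        for kind, payload in json.loads(raw_event)["event"].items():
            handler = self._handlers.get(kind)
            if handler is None:
                self.unhandled.append(kind)
            else:
                handler(payload)


if __name__ == "__main__":
    # Input side: events the app would send (voice audio, tool results).
    print(make_event("audioInput", content="<base64 audio>"))
    print(make_event("toolResult", content="plan changed"))

    # Output side: events the app would receive and route.
    session = VoiceSession()
    session.handle(make_event("textOutput", content="Sure, I can help."))
    print(session.transcript)
```

Keeping a table of handlers keyed by event type means a new output type can be supported by adding one entry, and unexpected events are recorded instead of crashing the session.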

Designed for Voice, Built with Safety

Amazon recommends creating short, friendly voice prompts instead of long or detailed text. The model is designed for natural spoken interactions, not reading or displaying information.

Nova Sonic also includes safety features such as content moderation and digital watermarking to help ensure responsible use.

Available Now

Nova Sonic is currently available in the US East (N. Virginia) Region. It offers expressive male and female voices, supports conversations up to 8 minutes long, and has a 32,000-token context window.
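
An app that runs long calls may want to keep an eye on those two limits. The sketch below is one possible way to do that. `SessionBudget` and its safety margins are hypothetical, the per-turn token counts are assumed to be supplied by the caller, and the article does not say how the service behaves when either limit is reached.

```python
from dataclasses import dataclass

MAX_SESSION_SECONDS = 8 * 60  # article: conversations up to 8 minutes
MAX_CONTEXT_TOKENS = 32_000   # article: 32,000-token context limit


@dataclass
class SessionBudget:
    """Tracks time and tokens used in one conversation (hypothetical helper)."""

    elapsed_seconds: float = 0.0
    tokens_used: int = 0

    def record_turn(self, seconds: float, tokens: int) -> None:
        """Add one turn's duration and token count to the running totals."""
        self.elapsed_seconds += seconds
        self.tokens_used += tokens

    def remaining_seconds(self) -> float:
        return max(0.0, MAX_SESSION_SECONDS - self.elapsed_seconds)

    def remaining_tokens(self) -> int:
        return max(0, MAX_CONTEXT_TOKENS - self.tokens_used)

    def should_renew(
        self, margin_seconds: float = 30.0, margin_tokens: int = 2_000
    ) -> bool:
        """True when either limit is within its (assumed) safety margin."""
        return (
            self.remaining_seconds() <= margin_seconds
            or self.remaining_tokens() <= margin_tokens
        )


if __name__ == "__main__":
    budget = SessionBudget()
    budget.record_turn(seconds=100, tokens=1_000)
    print(budget.remaining_seconds(), budget.should_renew())
```

When `should_renew()` returns true, the app could wrap up the current exchange and start a fresh session before a limit cuts the call off mid-sentence.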

This release marks another big step in Amazon’s goal to bring smarter, more human-like voice interactions to apps, websites, and devices.