Microsoft AI School - Using AI to Detect Speech

Allen Oneill
1y


Speech Service in Azure - An Overview

The Speech service in Azure combines speech-to-text, text-to-speech, and speech translation in a single Azure subscription, so you can build speech-enabled applications. The Speech service supports the following APIs:

Speech-to-Text: An API that facilitates speech recognition, in which your application can accept and transcribe audio streams or spoken inputs.
Text-to-Speech: An API that facilitates speech synthesis in which your application can generate speech from text.
Speech Translation: An API that enables your applications to translate spoken input into multiple languages. You can use this service for speech-to-speech and speech-to-text translation.
Speaker Recognition: An API that allows your application to verify and identify individual speakers based on their unique voice characteristics.
Intent Recognition: An API that can be integrated with the Language Understanding service to determine the semantic meaning of a spoken input.

Microsoft AI School features a learning path that helps you develop speech-enabled applications with the Speech service in Azure. After learning to work with the Speech service, you can:

Build voice-enabled apps quickly with the help of the Speech SDK.
Transcribe speech to text.
Produce natural-sounding text-to-speech voices.
Create custom models in Speech Studio and integrate them with your applications.

You can explore the entire learning path here: “Process and Translate Speech with Azure Cognitive Speech Services.” The learning path consists of two modules. Read on for a quick look at each of them.

Prerequisites:

Before starting this learning path, you should:

Be able to navigate the Azure portal and be familiar with the Azure services.
Have some programming experience in C# or Python.

Module 1: Create speech-enabled apps with the Speech service

The first module covers the speech-to-text and text-to-speech APIs, which let you build applications capable of speech recognition and speech synthesis. The units in this module introduce the important concepts of the Speech service and show how to use its API through the supported SDKs (software development kits). The module also includes a hands-on exercise so that you can try the Speech service for yourself.
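
As a rough illustration of what using the speech-to-text API through an SDK can look like, here is a minimal Python sketch. It assumes the azure-cognitiveservices-speech package is installed and that a Speech resource already exists. The environment variable names SPEECH_KEY and SPEECH_REGION and both function names are choices made for this example, not something the learning path prescribes.

```python
import os
from typing import Mapping, Tuple


def load_speech_settings(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Read the resource key and region from a mapping (the environment by default).

    SPEECH_KEY and SPEECH_REGION are hypothetical names used for this sketch.
    """
    key = env.get("SPEECH_KEY")
    region = env.get("SPEECH_REGION")
    if not key or not region:
        raise ValueError("Set SPEECH_KEY and SPEECH_REGION before calling the service")
    return key, region


def recognize_from_file(wav_path: str, language: str = "en-US") -> str:
    """Transcribe a single utterance from a WAV file and return the text."""
    # Imported here so the helper above works even when the SDK is not installed.
    import azure.cognitiveservices.speech as speechsdk

    key, region = load_speech_settings()
    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    speech_config.speech_recognition_language = language
    audio_config = speechsdk.audio.AudioConfig(filename=wav_path)

    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )
    # recognize_once_async listens until the first pause, then returns one result.
    result = recognizer.recognize_once_async().get()

    if result.reason == speechsdk.ResultReason.RecognizedSpeech:
        return result.text
    if result.reason == speechsdk.ResultReason.NoMatch:
        return ""  # audio was received but no speech could be recognized
    raise RuntimeError(f"Recognition did not complete: {result.reason}")
```

Calling recognize_from_file("sample.wav") with a short recording would return the transcript of the first utterance. Longer audio would need continuous recognition, which this sketch does not cover.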

Learning objectives:

In this module, you will learn how to provision an Azure resource for the Speech service, and then how to:

Implement speech recognition using the Speech-to-Text API.
Implement speech synthesis using the Text-to-Speech API.
Configure audio format and voices.
Use Speech Synthesis Markup Language.
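
The last two objectives can be sketched together. The example below builds a small SSML document with the Python standard library and then hands it to the Speech SDK for synthesis. The function names, the voice en-US-JennyNeural, the speaking rate, and the output format are illustrative assumptions for this sketch. The module itself may use different values.

```python
from xml.sax.saxutils import escape, quoteattr

SSML_NAMESPACE = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text: str, voice: str = "en-US-JennyNeural",
               lang: str = "en-US", rate: str = "medium") -> str:
    """Return an SSML document that speaks `text` with the given voice and rate.

    The text and attribute values are escaped so that characters such as
    '&' or '<' do not break the XML.
    """
    return (
        f'<speak version="1.0" xmlns="{SSML_NAMESPACE}" xml:lang={quoteattr(lang)}>'
        f'<voice name={quoteattr(voice)}>'
        f'<prosody rate={quoteattr(rate)}>{escape(text)}</prosody>'
        f'</voice></speak>'
    )


def synthesize_to_file(ssml: str, key: str, region: str, out_path: str) -> None:
    """Send SSML to the Speech service and write the audio to a WAV file."""
    # Imported here so build_ssml can be used without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    # Example audio format: 24 kHz, 16-bit, mono PCM in a RIFF (WAV) container.
    speech_config.set_speech_synthesis_output_format(
        speechsdk.SpeechSynthesisOutputFormat.Riff24Khz16BitMonoPcm
    )
    audio_config = speechsdk.audio.AudioOutputConfig(filename=out_path)
    synthesizer = speechsdk.SpeechSynthesizer(
        speech_config=speech_config, audio_config=audio_config
    )

    result = synthesizer.speak_ssml_async(ssml).get()
    if result.reason != speechsdk.ResultReason.SynthesizingAudioCompleted:
        raise RuntimeError(f"Synthesis did not complete: {result.reason}")
```

Because the voice is named inside the SSML, the same synthesizer can switch voices from one request to the next without being reconfigured.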


Module 2: Translate speech with the Speech service

The second module teaches speech translation using the Speech service in Azure. Speech translation builds on speech recognition: it identifies and transcribes spoken input in a specified language, and then returns translations of that transcription in one or more other languages.

The units in the module introduce the important concepts of the Speech service and show how to use its API through the supported SDKs. After learning the essential concepts, you can try the Speech service in Azure for yourself in a hands-on exercise.

Learning objectives:

In this module, you will learn how to provision Azure resources for speech translation, and then how to:

Generate text translation from speech.
Synthesize spoken translations.
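
To make the first of these objectives concrete, here is a minimal Python sketch of translating speech to text with the Speech SDK. It assumes the azure-cognitiveservices-speech package is installed. The function names, the default languages, and the WAV file input are assumptions made for this example. Synthesizing the translated text as speech is left out to keep the sketch short.

```python
from typing import Dict, List, Sequence


def format_translations(source_text: str, translations: Dict[str, str]) -> List[str]:
    """Turn a {language code: text} mapping into printable lines, sorted by code."""
    lines = [f"source: {source_text}"]
    for code in sorted(translations):
        lines.append(f"{code}: {translations[code]}")
    return lines


def translate_from_file(wav_path: str, key: str, region: str,
                        source_language: str = "en-US",
                        target_languages: Sequence[str] = ("fr", "de")) -> List[str]:
    """Recognize one utterance from a WAV file and translate it to each target."""
    # Imported here so format_translations works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    translation_config = speechsdk.translation.SpeechTranslationConfig(
        subscription=key, region=region
    )
    translation_config.speech_recognition_language = source_language
    for code in target_languages:
        translation_config.add_target_language(code)

    audio_config = speechsdk.audio.AudioConfig(filename=wav_path)
    recognizer = speechsdk.translation.TranslationRecognizer(
        translation_config=translation_config, audio_config=audio_config
    )

    result = recognizer.recognize_once_async().get()
    if result.reason != speechsdk.ResultReason.TranslatedSpeech:
        raise RuntimeError(f"Translation did not complete: {result.reason}")
    # result.translations maps each target language code to the translated text.
    return format_translations(result.text, dict(result.translations))
```

One recognition pass yields the source transcript and all requested translations together, which matches the description of translation as building on speech recognition.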


Conclusion

If you want to learn the concepts behind speech translation in Azure and practice them through examples and hands-on exercises, this learning path is a good roadmap. You will learn how to build applications using the Speech service in Azure, which offers industry-leading speech capabilities such as speech translation, speaker recognition, speech-to-text, and text-to-speech.