Description
This article provides some elementary information about how to perform text-to-speech conversion using the Speech SDK 5.1. At the time of writing, Speech SDK 5.1 is the latest release in the speech product line from Microsoft. It can be used from various programming languages.
Introduction
Speech is one of the most natural ways to interact, and interacting with computers is no different. If an application can be controlled solely by voice commands, the opportunities are enormous. Even though the idea of using speech as an input mechanism is not new, few applications use speech as an input. In other words, speech is still a big opportunity that is yet to be explored.
Microsoft Speech SDK is one of the many tools that enable a developer to add speech capability to an application. The Speech SDK can be used from C#, C++, VB or any other COM-compliant language.
Broadly, speech can be divided into two paradigms: text-to-speech conversion and speech recognition. In this article I shall focus on text-to-speech conversion.
Converting text to speech using the Speech SDK consists of a few simple steps. The following code shows the important pieces. It is an implementation of a button click event handler.
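// Assumes: using SpeechLib; using System.Threading; and a COM reference to the Microsoft Speech Object Library.
private void button1_Click(object sender, System.EventArgs e)
{
    // Create the text-to-speech voice object.
    SpVoice objSpeech = new SpVoice();
    // Queue the text and return immediately (asynchronous mode).
    objSpeech.Speak(textBox1.Text, SpeechVoiceSpeakFlags.SVSFlagsAsync);
    // Block until the voice has finished speaking.
    objSpeech.WaitUntilDone(Timeout.Infinite);
}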
SpVoice is the class that is used for text-to-speech conversion. The Speak method takes the string to be spoken along with a set of flags. The flag that I have used, SVSFlagsAsync, tells the TTS engine to perform the conversion asynchronously, so control returns from Speak immediately. The WaitUntilDone call that follows then blocks until the voice has finished speaking; with Timeout.Infinite it waits for as long as that takes.
Microsoft Speech SDK comes with a few default text-to-speech engines. A developer can select a text-to-speech engine using the Text To Speech property pane of the Speech item in the Control Panel. Below is a screenshot of the speech property pane.
As shown in the above screenshot, there are various voice profiles to choose from. The voices are either male or female. The Preview Voice button lets you listen to the selected voice.
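The same choice can also be made in code rather than through the Control Panel. The following is a minimal sketch, not a definitive implementation. It assumes the project references the Microsoft Speech Object Library (the SpeechLib interop assembly) and that at least one voice is installed. The class name ListVoices and the spoken sentence are made up for illustration.

```csharp
using System;
using SpeechLib; // interop assembly for the Microsoft Speech Object Library (assumed to be referenced)

// Hypothetical console program: lists the installed voices and speaks with the first one.
class ListVoices
{
    static void Main()
    {
        SpVoice voice = new SpVoice();

        // Empty attribute strings ask for every installed voice.
        ISpeechObjectTokens tokens = voice.GetVoices("", "");

        for (int i = 0; i < tokens.Count; i++)
        {
            SpObjectToken token = tokens.Item(i);
            // GetDescription(0) returns the voice name for the default locale.
            Console.WriteLine("{0}: {1}", i, token.GetDescription(0));
        }

        if (tokens.Count > 0)
        {
            // Select the first voice for this SpVoice instance only;
            // the Control Panel default is left unchanged.
            voice.Voice = tokens.Item(0);
            voice.Speak("This is the selected voice.", SpeechVoiceSpeakFlags.SVSFDefault);
        }
    }
}
```

Setting the Voice property affects only this SpVoice object, so an application can offer its own voice picker without touching the system-wide setting.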
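The SpVoice class has a property called AudioOutputStream, which redirects the spoken output to a stream. Assigning it a file stream (for example an SpFileStream) stores the speech in an audio file instead of playing it through the speakers.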
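As an illustration of AudioOutputStream, the sketch below writes the spoken text to a WAV file. It assumes the same SpeechLib reference as the button click example. The class name SpeakToFile, the file name hello.wav and the sentence are placeholders, not part of the SDK.

```csharp
using System;
using SpeechLib; // interop assembly for the Microsoft Speech Object Library (assumed to be referenced)

// Hypothetical console program: renders a sentence into a WAV file.
class SpeakToFile
{
    static void Main()
    {
        SpVoice voice = new SpVoice();
        SpFileStream stream = new SpFileStream();

        // Create (or overwrite) the target file; "hello.wav" is a placeholder name.
        stream.Open("hello.wav", SpeechStreamFileMode.SSFMCreateForWrite, false);
        try
        {
            // Send the audio to the file instead of the default audio device.
            voice.AudioOutputStream = stream;

            // Speak synchronously so all audio is written before the stream is closed.
            voice.Speak("Hello from the Speech SDK.", SpeechVoiceSpeakFlags.SVSFDefault);
        }
        finally
        {
            stream.Close();
        }
    }
}
```

A synchronous Speak call is used here on purpose: with SVSFlagsAsync the stream could be closed before the engine has finished writing, unless WaitUntilDone is called first.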
Summary
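This article gave an introduction to text-to-speech conversion using the Speech SDK 5.1: creating an SpVoice object, speaking text asynchronously, choosing a voice, and redirecting the output through AudioOutputStream.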