Getting started with Text to Speech
The IBM Watson® Text to Speech service converts written text to natural-sounding speech to provide speech-synthesis capabilities for applications. This curl-based tutorial can help you get started quickly with the service. The examples show you how to call the service's POST and GET /v1/synthesize methods to request an audio stream.
The tutorial uses the curl command-line utility to demonstrate REST API calls. For more information about curl, see Using curl with Watson examples.
Before you begin
IBM Cloud
This tutorial uses an API key to authenticate. In production, use an IAM token. For more information, see Authenticating to IBM Cloud.
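The production recommendation above can be sketched as follows. This is a minimal illustration, not the documented procedure for this service: it assumes the IBM Cloud IAM token endpoint at https://iam.cloud.ibm.com/identity/token and an access_token field in its JSON response. The function name get_iam_token is hypothetical, and the sed expression is a simple stand-in for a real JSON parser such as jq.

```shell
# get_iam_token APIKEY
# Exchange an IBM Cloud API key for a short-lived IAM access token.
# Assumptions: the IAM endpoint URL below and the "access_token" response field.
get_iam_token() {
  curl --silent --show-error --fail -X POST \
    --header "Content-Type: application/x-www-form-urlencoded" \
    --data-urlencode "grant_type=urn:ibm:params:oauth:grant-type:apikey" \
    --data-urlencode "apikey=$1" \
    "https://iam.cloud.ibm.com/identity/token" |
    # Print only the value of the access_token field.
    sed -n 's/.*"access_token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

With the token stored in a shell variable, a request would pass --header "Authorization: Bearer $token" in place of -u "apikey:{apikey}". IAM tokens expire, so a long-running application would need to request a new one periodically.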
IBM Cloud Pak for Data
The Text to Speech for IBM Cloud Pak for Data must be installed and configured before beginning this tutorial. For more information, see Watson Speech services on Cloud Pak for Data.
- Create an instance of the service by using the web client, the API, or the command-line interface. For more information about creating a service instance, see Creating a Watson Speech services instance.
- Follow the instructions in Creating a Watson Speech services instance to obtain a Bearer token for the instance. This tutorial uses a Bearer token to authenticate to the service.
Synthesize text in US English
The following command uses the POST /v1/synthesize method to synthesize US English input to audio. The request uses the voice en-US_MichaelV3Voice. It produces audio in the WAV format.
You can use a browser or other tools to play the audio files that are produced by the examples in this tutorial. For more information, see Playing an audio file.
- Issue the following command to synthesize the string "hello world". The request produces a WAV file that is named hello_world.wav.

IBM Cloud
- Replace {apikey} and {url} with the API key and URL for your service instance.

curl -X POST -u "apikey:{apikey}" \
  --header "Content-Type: application/json" \
  --header "Accept: audio/wav" \
  --data "{\"text\":\"hello world\"}" \
  --output hello_world.wav \
  "{url}/v1/synthesize?voice=en-US_MichaelV3Voice"

IBM Cloud Pak for Data
- Replace {token} and {url} with the access token and URL for your service instance.

curl -X POST \
  --header "Authorization: Bearer {token}" \
  --header "Content-Type: application/json" \
  --header "Accept: audio/wav" \
  --data "{\"text\":\"hello world\"}" \
  --output hello_world.wav \
  "{url}/v1/synthesize?voice=en-US_MichaelV3Voice"
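The POST request in this step can be wrapped in a small reusable function. The following is a minimal sketch for the IBM Cloud form of the command, not an official client: the names synthesize_wav, json_escape, TTS_APIKEY, and TTS_URL are hypothetical. It adds curl's --fail option so that an HTTP error makes curl exit with a nonzero status rather than saving an error response as if it were audio, and it escapes quotation marks and backslashes in the input text so that the JSON body stays valid.

```shell
# json_escape TEXT
# Escape backslashes and double quotes so TEXT can sit inside a JSON string.
# (Control characters such as newlines are not handled in this sketch.)
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# synthesize_wav TEXT OUTFILE [VOICE]
# Assumes TTS_APIKEY and TTS_URL (hypothetical names) hold the API key and
# service URL that the tutorial writes as {apikey} and {url}.
synthesize_wav() {
  curl -X POST -u "apikey:${TTS_APIKEY}" \
    --fail --silent --show-error \
    --header "Content-Type: application/json" \
    --header "Accept: audio/wav" \
    --data "{\"text\":\"$(json_escape "$1")\"}" \
    --output "$2" \
    "${TTS_URL}/v1/synthesize?voice=${3:-en-US_MichaelV3Voice}"
}
```

For example, synthesize_wav "hello world" hello_world.wav issues the same request as the IBM Cloud command in this step, and a third argument selects a different voice.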
Use a different voice and audio format
The following command again uses the POST /v1/synthesize method to synthesize the same US English input to audio. But this request uses the voice en-US_AllisonV3Voice and produces audio in the service's default Ogg format.
- Issue the following command to synthesize the string "hello world" but with a different voice. The command omits the Accept header, so the service returns audio in its default format. The request produces an Ogg file that is named hello_world.ogg.

IBM Cloud
- Replace {apikey} and {url} with the API key and URL for your service instance.

curl -X POST -u "apikey:{apikey}" \
  --header "Content-Type: application/json" \
  --data "{\"text\":\"hello world\"}" \
  --output hello_world.ogg \
  "{url}/v1/synthesize?voice=en-US_AllisonV3Voice"

IBM Cloud Pak for Data
- Replace {token} and {url} with the access token and URL for your service instance.

curl -X POST \
  --header "Authorization: Bearer {token}" \
  --header "Content-Type: application/json" \
  --data "{\"text\":\"hello world\"}" \
  --output hello_world.ogg \
  "{url}/v1/synthesize?voice=en-US_AllisonV3Voice"
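The name passed to --output only names the file; the audio format is decided by the request. A quick way to confirm what was actually returned is to look at the first four bytes of the file: WAV files begin with the ASCII characters RIFF, and Ogg files begin with OggS. The following sketch shows one way to do that. The function name audio_container is hypothetical, and the check covers only these two containers.

```shell
# audio_container FILE
# Report the container format of FILE from its first four bytes.
# Prints "wav", "ogg", or "unknown" (for example, a JSON error body).
audio_container() {
  case "$(head -c 4 "$1")" in
    RIFF) echo wav ;;
    OggS) echo ogg ;;
    *)    echo unknown ;;
  esac
}
```

Running audio_container hello_world.ogg after this step is one way to check that the file holds Ogg audio rather than WAV audio or an error message.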
Synthesize text in Spanish
The following command uses the GET /v1/synthesize method to synthesize Spanish input to an audio file. The GET method includes three query parameters: accept to specify the audio format, text to specify the input text for the audio, and voice to specify a Spanish voice. Because accept and text are passed as query parameters, their values are URL-encoded: audio/wav becomes audio%2Fwav, and the space in hola mundo becomes %20.
- Issue the following command to synthesize the string "hola mundo" and produce a WAV file that is named hola_mundo.wav.

IBM Cloud
- Replace {apikey} and {url} with the API key and URL for your service instance.

curl -X GET -u "apikey:{apikey}" \
  --output hola_mundo.wav \
  "{url}/v1/synthesize?accept=audio%2Fwav&text=hola%20mundo&voice=es-ES_EnriqueV3Voice"

IBM Cloud Pak for Data
- Replace {token} and {url} with the access token and URL for your service instance.

curl -X GET \
  --header "Authorization: Bearer {token}" \
  --output hola_mundo.wav \
  "{url}/v1/synthesize?accept=audio%2Fwav&text=hola%20mundo&voice=es-ES_EnriqueV3Voice"
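Encoding the query string by hand becomes error-prone once the text contains punctuation or accented characters. As an alternative sketch for the IBM Cloud form of the command, curl can do the encoding itself: with --get, each --data-urlencode value is percent-encoded and appended to the URL as a query parameter. The names synthesize_get, TTS_APIKEY, and TTS_URL are hypothetical.

```shell
# synthesize_get TEXT OUTFILE [VOICE]
# Send GET /v1/synthesize and let curl percent-encode the query parameters.
# Assumes TTS_APIKEY and TTS_URL (hypothetical names) hold the API key and
# service URL that the tutorial writes as {apikey} and {url}.
synthesize_get() {
  curl --get -u "apikey:${TTS_APIKEY}" \
    --fail --silent --show-error \
    --data-urlencode "accept=audio/wav" \
    --data-urlencode "text=$1" \
    --data-urlencode "voice=${3:-es-ES_EnriqueV3Voice}" \
    --output "$2" \
    "${TTS_URL}/v1/synthesize"
}
```

For example, synthesize_get "hola mundo" hola_mundo.wav sends the same three parameters as the command in this step without any manual encoding.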
Next steps
- To try an example application that accepts text and generates speech with different voices, see the Text to Speech demo.
- For more information about the service's interfaces and features, see Service features.
- For more information about all methods of the service's interfaces, see the API & SDK reference.