Vovsoft Speech to Text Converter requires the IBM Cloud Speech to Text API, which can convert up to 500 minutes of audio per month for free.
In order to get your API key and API URL, please follow these steps:
Enter your API key and URL into the Settings panel inside "Vovsoft Speech to Text Converter". The software is now ready to convert audio to text.
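
To illustrate what the API key and API URL are used for, here is a minimal Python sketch that builds (but does not send) a request to the IBM Cloud Speech to Text `/v1/recognize` endpoint. It is not how the Vovsoft application is implemented; the function name `build_recognize_request` and the placeholder key and URL are assumptions made for illustration only.

```python
import base64
import urllib.request


def build_recognize_request(api_key, api_url, audio_bytes,
                            model="en-US_BroadbandModel",
                            content_type="audio/wav"):
    """Build (without sending) a POST request to the /v1/recognize endpoint.

    Hypothetical helper for illustration; not part of any Vovsoft or IBM SDK.
    """
    # IBM Cloud uses HTTP Basic auth with the literal user name "apikey"
    # and the API key as the password.
    token = base64.b64encode(f"apikey:{api_key}".encode("utf-8")).decode("ascii")
    # The API URL is the service instance URL; the method path is appended to it.
    endpoint = f"{api_url.rstrip('/')}/v1/recognize?model={model}"
    return urllib.request.Request(
        endpoint,
        data=audio_bytes,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": content_type,
        },
    )


# Placeholder values; substitute the key and URL shown in your IBM Cloud account.
request = build_recognize_request("YOUR_API_KEY",
                                  "https://example.invalid/instances/YOUR_INSTANCE",
                                  b"")
print(request.get_method(), request.full_url)
```

Because the endpoint is derived directly from the API URL, a mistyped or truncated URL produces a request to the wrong address, which is why both values should be copied exactly.
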
English, Arabic, Chinese (Mandarin), Czech, Dutch, French, German, Hindi (Indian), Italian, Japanese, Korean, Portuguese (Brazilian) and Spanish are supported.
For most languages, the IBM Cloud service supports broadband, narrowband, telephony and multimedia models:
Choosing the correct model is important. Use the model that matches the sampling rate (and language) of your audio. The service automatically adjusts the sampling rate of your audio to match the model that you specify.
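
The rule above can be sketched in Python as follows. The sketch assumes the usual IBM convention that broadband models are intended for audio sampled at 16 kHz or higher and narrowband models for audio sampled at 8 kHz, and that model identifiers follow the `<language>_BroadbandModel` / `<language>_NarrowbandModel` pattern. The function names `pick_model` and `wav_sample_rate` are hypothetical, and not every language offers both models.

```python
import wave


def pick_model(sample_rate_hz, language="en-US"):
    """Return a model identifier that matches the audio's sampling rate."""
    if sample_rate_hz >= 16000:
        # 16 kHz and above: broadband model.
        return f"{language}_BroadbandModel"
    if sample_rate_hz >= 8000:
        # 8 kHz up to (but not including) 16 kHz: narrowband model.
        return f"{language}_NarrowbandModel"
    raise ValueError(f"sampling rate {sample_rate_hz} Hz is below 8 kHz")


def wav_sample_rate(path):
    """Read the sampling rate (in Hz) from the header of a WAV file."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


print(pick_model(48000))  # en-US_BroadbandModel
print(pick_model(8000))   # en-US_NarrowbandModel
```
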
Approximate conversion times are listed in the table below. Actual times vary depending on the content of the file, its quality, the language model, the load on the AI servers and your computer's upload speed.
| Audio Length | Audio Quality | Language Model | Approximate Conversion Time |
|---|---|---|---|
| 5 minutes | 48 kHz Stereo | English (Broadband) | 1 minute and 20 seconds |
| 5 minutes | 8 kHz Mono | English (Narrowband) | 1 minute and 30 seconds |
| 30 minutes | 48 kHz Stereo | English (Broadband) | 9 minutes |
| 30 minutes | 8 kHz Mono | English (Narrowband) | 10 minutes |
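
As a rough reading of the table, the short Python sketch below computes the ratio of conversion time to audio length for each row. It uses only the figures listed above; the helper name `conversion_ratio` is hypothetical, and the ratios are not a guarantee for other files.

```python
# (language model, audio length in seconds, conversion time in seconds),
# copied from the table above.
rows = [
    ("English (Broadband)", 5 * 60, 80),
    ("English (Narrowband)", 5 * 60, 90),
    ("English (Broadband)", 30 * 60, 9 * 60),
    ("English (Narrowband)", 30 * 60, 10 * 60),
]


def conversion_ratio(audio_seconds, conversion_seconds):
    """Conversion time expressed as a fraction of the audio length."""
    return conversion_seconds / audio_seconds


for model, audio_s, conversion_s in rows:
    ratio = conversion_ratio(audio_s, conversion_s)
    print(f"{model}, {audio_s // 60} min of audio: ratio {ratio:.2f}")
```

For these four samples the conversion takes roughly a quarter to a third of the audio's playing time.
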
HTTP/1.1 503 Service Unavailable
Your URL is wrong. Please enter the exact "API Key" and "API URL" that IBM Cloud supplied to you.
Error reading data: (12152)
Your audio is too long. Please try converting a shorter audio file.
"Please wait" hangs, nothing happens
The file size of your audio is too large. Please try to upload a smaller file. Converting stereo to mono may help.
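
For uncompressed WAV audio, converting stereo to mono halves the file size. The following is a minimal Python sketch of that conversion using only the standard library. It assumes a 16-bit PCM stereo WAV file; the function name `stereo_to_mono` is hypothetical, and other formats (MP3, 24-bit WAV and so on) would need a dedicated audio tool instead.

```python
import array
import sys
import wave


def stereo_to_mono(src_path, dst_path):
    """Average the left and right channels of a 16-bit stereo WAV file."""
    with wave.open(src_path, "rb") as src:
        if src.getnchannels() != 2 or src.getsampwidth() != 2:
            raise ValueError("expected a 16-bit stereo PCM WAV file")
        rate = src.getframerate()
        samples = array.array("h")
        samples.frombytes(src.readframes(src.getnframes()))

    # WAV samples are little-endian; swap on big-endian machines.
    if sys.byteorder == "big":
        samples.byteswap()

    # Samples are interleaved: left, right, left, right, ...
    left, right = samples[0::2], samples[1::2]
    mono = array.array("h", ((l + r) // 2 for l, r in zip(left, right)))

    if sys.byteorder == "big":
        mono.byteswap()

    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(rate)  # the sampling rate is left unchanged
        dst.writeframes(mono.tobytes())
```

Calling `stereo_to_mono("input.wav", "input_mono.wav")` (file names are placeholders) writes a mono copy that can then be uploaded in place of the original.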