Back in October 2017, WaveNet, a deep neural network for generating raw audio waveforms built by Alphabet's DeepMind, began producing better and more natural-sounding speech for Google Assistant in English and Japanese. Google today announced that it is bringing this technology to Google Cloud Platform with Cloud Text-to-Speech, open to any developer or business that needs voice synthesis on tap, whether for an app, a website, or a virtual assistant.

The company says it can be used in a variety of ways: powering voice response systems for call centers (IVRs), enabling real-time natural language conversations, letting IoT devices talk back to you, and converting text into spoken formats such as audiobooks. There are 32 basic voices to choose from, across 12 languages. Cloud Text-to-Speech correctly pronounces complex text such as names, dates, times, and addresses for authentic-sounding speech. It also lets you customize pitch, speaking rate, and volume gain, and it supports a variety of audio formats, including MP3 and WAV.

The improved WaveNet model generates audio 1,000 times faster than the original model. Fidelity has also been increased to 24,000 samples per second, and the resolution has been bumped up from 8 to 16 bits, all of which producing ...
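To make the customization options mentioned above (pitch, speaking rate, volume gain, audio format) more concrete, here is a minimal Python sketch that only builds a JSON request body; it does not call the service. The field names (`input`, `voice`, `audioConfig`, `speakingRate`, `pitch`, `volumeGainDb`, `audioEncoding`) and the numeric limits reflect my understanding of the public Cloud Text-to-Speech REST documentation, not anything stated in this post, so treat them as assumptions and check the current API reference. The helper name `build_synthesis_request` is hypothetical.

```python
import json

# Assumed limits (not from this post); verify against the current API reference.
SPEAKING_RATE_RANGE = (0.25, 4.0)      # 1.0 is normal speed
PITCH_RANGE = (-20.0, 20.0)            # semitones relative to the default
VOLUME_GAIN_DB_RANGE = (-96.0, 16.0)   # dB relative to the default volume


def _check(name, value, bounds):
    """Return value unchanged if it lies within bounds, else raise ValueError."""
    low, high = bounds
    if not low <= value <= high:
        raise ValueError(f"{name}={value} is outside [{low}, {high}]")
    return value


def build_synthesis_request(text, language_code="en-US", audio_encoding="MP3",
                            speaking_rate=1.0, pitch=0.0, volume_gain_db=0.0):
    """Build a request body as a plain dict (hypothetical helper, no network I/O)."""
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code},
        "audioConfig": {
            "audioEncoding": audio_encoding,
            "speakingRate": _check("speaking_rate", speaking_rate, SPEAKING_RATE_RANGE),
            "pitch": _check("pitch", pitch, PITCH_RANGE),
            "volumeGainDb": _check("volume_gain_db", volume_gain_db, VOLUME_GAIN_DB_RANGE),
        },
    }


if __name__ == "__main__":
    body = build_synthesis_request("Your order ships on March 28.",
                                   speaking_rate=1.1, pitch=-2.0)
    print(json.dumps(body, indent=2))
```

Validating the values locally before sending anything keeps out-of-range settings from turning into a rejected request; actually submitting the body would additionally require credentials and an HTTP client, which this sketch leaves out.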