Supported formats are raw audio and FLAC format, while MP3 and AAC are not accepted. The file to recognise can be provided both by including the audio signal into the HTTP request payload (encoded with Base64) or by giving the URI of the file (currently, only Google Storage can be used). Optionally, it can be requested to return multiple alternatives in addition to the best-matching, each one with the estimated accuracy. The batch processing is very straightforward just by providing the audio file to process and describing its format the API returns the best-matching text, together with the recognition accuracy. The API, still in alpha, exposes a RESTful interface that can be accessed via common POST HTTP requests. An Outline of the Google Cloud Speech API Now that such technology will be accessible as a cloud service to developers, it will allow any application to integrate speech-to-text recognition, representing a valuable alternative to the common Nuance technology (used by Apple’s Siri and Samsung’s S-Voice, for instance) and challenging other solutions such as the IBM Watson speech-to-text and the Microsoft Bing Speech API. ![]() Speech-to-text features are used in a multitude of use cases including voice-controlled smart assistants on mobile devices, home automation, audio transcription, and automatic classification of phone calls. The neural network is updated as new speech samples are collected by Google, so that new terms are learned and the recognition accuracy keeps on increasing. The capability to convert voice to text is based on deep neural networks, state-of-the-art machine learning algorithms recently demonstrated to be particularly effective for pattern detection in video and audio signals. This speech recognition technology has been developed and already used by several Google products for some time, such as the Google search engine where there is the option to make voice search. Google recently opened its brand new Cloud Speech API – announced at the NEXT event in San Francisco – for a limited preview. Discover the Strengths and Weaknesses of Google Cloud Speech API in this Special Report by Cloud Academy’s Roberto Turrin
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |