Currently, there is no such thing as a single Microsoft speech service.
But Microsoft is taking the first steps toward creating a single speech application programming interface (API) and software development kit (SDK) that will work across its products and services, including Windows, Office, Cortana, Xbox and the HoloLens.
Microsoft disclosed this move last week in a rather understated way at its Build 2018 conference. (This Day 3 Build session on the "Cognitive Services Speech SDK"[1] covers some of the details.)
Microsoft has some ambitious goals for its coming unified Speech Service, which falls under its Microsoft Cognitive Services umbrella. (Cognitive services are Azure APIs that developers can use to add various AI capabilities to their own apps and services.)
The new unified Speech Service "unites several Azure speech services that were previously available separately[2]: Bing Speech (comprising speech recognition and text to speech), Custom Speech, and Speech Translation. Like its precursors, the Speech service is powered by the technologies used in other Microsoft products, including Cortana and Microsoft Office," according to Microsoft.
Microsoft is aiming to have the common speech API and SDK[3] "run on all modern platforms" and "support all modern programming languages." Microsoft wants the service to be accessible to developers at all levels, from novice to expert, and to work online, offline, in hybrid situations and in batch mode, officials said. The new API and SDK will provide speech-to-text, speech-to-intent, speech translation and custom keyword-spotter invocation. They will work with both single-shot spoken commands and continuous ones. Microsoft is committing to handle all 28 spoken languages in the single unified Speech SDK.
"We don't have all that today, but this (Speech preview) is a good first step," said Rob Chambers during last week's