AI - Azure AI services Blog

Easily add voice commands to your apps with Custom Commands

Vishesh Oberoi
Jul 08, 2020

Voice-enabled assistants, which let users search, ask questions, complete tasks, and much more, are gaining momentum and becoming part of consumers’ daily lives. Voice enables more seamless, natural interfaces and a more intuitive way of interacting with technology. In our current environment, voice and contactless experiences will play an increasingly important role; the United States alone has already seen a 20 percent increase in preference for contactless operations (McKinsey 2020).

 

We saw some of this vision at Microsoft Build 2020 earlier this year, where CTO Kevin Scott discussed emerging trends that are reshaping software development, including the convergence of the physical and digital worlds. One example centered on voice tech for food delivery, in which Boston Dynamics’ Spot robot completed curbside deliveries using voice interaction.

 

 

We are committed to enabling developers and designers to build innovative voice-enabled solutions. To make it easier to build voice-commanding applications, today we’re excited to announce the general availability of Custom Commands. Custom Commands is a capability of Speech in Azure Cognitive Services that streamlines the process of creating task-oriented voice applications. It provides a unified authoring experience with lower complexity, so you can focus on building the best solution for your voice-commanding scenarios.

 

Voice applications such as voice assistants listen to users and take an action in response. They transcribe the user's speech, act on the text using natural language processing, and respond by voice using text-to-speech. Custom Commands brings together the best of Speech and Language in Azure Cognitive Services: Speech to Text for speech recognition, Language Understanding for capturing spoken entities with speech adaptation, and Text to Speech for voice responses. Together they let you add voice capabilities to your apps iteratively, with a low-code authoring experience.

 

Custom Commands is best suited for task completion or command-and-control scenarios that have a well-defined set of variables. In addition to the voice-activated delivery example with Boston Dynamics’ Spot, Custom Commands supports solutions in a variety of verticals including hospitality, automotive and retail. For example, you can build in-room voice-controlled experiences for your guests, enable in-vehicle communication and entertainment systems, or manage store inventory with an ambient smart speaker.

 

 

Building Voice Assistants

 

Speech in Azure Cognitive Services provides solutions for building voice assistants that are tailored for your use case. Custom Commands streamlines the process for creating voice-enabled apps for simple task completion (with a fixed vocabulary and defined set of variables).

 

For some scenarios, you may also need a solution that handles more complex conversational interactions. For flexible voice assistants designed for open-ended conversational scenarios, Direct Line Speech enables you to build a robust solution that is optimized for voice-in, voice-out interaction with bots.

 

Here’s a sample reference architecture for an end-to-end voice assistant supported by Custom Commands:

[Figure: reference architecture for an end-to-end voice assistant supported by Custom Commands]

 

Customization and Extensibility

 

With Custom Commands, our goal is to simplify the process of creating a unique voice-first experience that reflects your brand. You can configure multiple commands for commanding or task completion, add parameters and conditions that must be satisfied before a task completes, and configure interaction rules that handle confirmation prompts or one-step corrections to help disambiguate.
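To make the idea of parameters, completion conditions, and confirmation prompts concrete, here is a minimal sketch in plain C#. It is not the Custom Commands schema or any Speech SDK API: the `VoiceCommand` class, its members, and the prompt wording are all hypothetical, and in a real app this logic is authored in the Custom Commands portal rather than written by hand.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of a task-oriented command (illustration only,
// not the Custom Commands schema).
public sealed class VoiceCommand
{
    private readonly List<string> _required;
    private readonly Dictionary<string, string> _values = new Dictionary<string, string>();
    private readonly bool _needsConfirmation;
    private bool _confirmed;

    public VoiceCommand(string name, IEnumerable<string> requiredParameters, bool needsConfirmation)
    {
        Name = name;
        _required = requiredParameters.ToList();
        _needsConfirmation = needsConfirmation;
    }

    public string Name { get; }

    // Setting a parameter again overwrites the old value (a one-step correction)
    // and clears any earlier confirmation, so the user confirms the new value.
    public void SetParameter(string parameter, string value)
    {
        _values[parameter] = value;
        _confirmed = false;
    }

    public void Confirm() => _confirmed = true;

    // Returns the next prompt to speak, or an empty string when the
    // command has everything it needs and can complete.
    public string NextPrompt()
    {
        var missing = _required.FirstOrDefault(p => !_values.ContainsKey(p));
        if (missing != null) return $"What is the {missing}?";
        if (_needsConfirmation && !_confirmed) return $"Should I go ahead with {Name}?";
        return "";
    }
}
```

For example, a `VoiceCommand("SetTemperature", new[] { "room", "temperature" }, true)` would first ask for the room, then the temperature, then for confirmation; changing the temperature afterwards would trigger the confirmation prompt again.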

 

Publish your app and integrate it with any client app using the Speech SDK. Our documentation walks through integration with the Speech SDK for C#; the SDK is also available in multiple languages on various platforms if you prefer to build your own client.

 

Integrating your app with the Speech SDK takes only a few lines. Start by specifying the applicationId, subscriptionKey and region.

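using Microsoft.CognitiveServices.Speech.Dialog;

// Your application id
const string customCommandsApplicationId = "YourApplicationId";
// Your subscription key
const string customSubscriptionKey = "YourSpeechSubscriptionKey";
// The subscription service region.
const string region = "YourServiceRegion";

// Build the Custom Commands configuration, then create the connector used by the following snippets.
var customCommandsConfig = CustomCommandsConfig.FromSubscription(customCommandsApplicationId, customSubscriptionKey, region);
var connector = new DialogServiceConnector(customCommandsConfig);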

Then configure your client app to receive activity from the Custom Commands app.

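// Implement event handlers (illustrative body: log the activity JSON sent by the Custom Commands app)
connector.ActivityReceived += (sender, activityReceivedEventArgs) =>
    Console.WriteLine(activityReceivedEventArgs.Activity);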
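
The activity arrives in the handler as a JSON string, so a client typically parses it before acting on it. The following is a minimal sketch of that step using `System.Text.Json`. The `ActivityParsing` helper is hypothetical, and the `type` and `text` property names are assumptions about the payload shape; the actual contents depend on how your Custom Commands app is configured.

```csharp
using System.Text.Json;

// Hypothetical helper: summarizes an activity JSON string for logging or display.
public static class ActivityParsing
{
    // Returns "type: text" when both properties are present, the type alone
    // when there is no text, and "unknown" when the payload is not a JSON object.
    // Throws JsonException if the input is not valid JSON.
    public static string Describe(string activityJson)
    {
        using JsonDocument doc = JsonDocument.Parse(activityJson);
        JsonElement root = doc.RootElement;
        if (root.ValueKind != JsonValueKind.Object) return "unknown";

        // "type" and "text" are assumed property names, not a documented contract.
        string type = ReadString(root, "type", "unknown");
        string text = ReadString(root, "text", "");
        return text.Length > 0 ? $"{type}: {text}" : type;
    }

    private static string ReadString(JsonElement obj, string name, string fallback)
    {
        if (obj.TryGetProperty(name, out JsonElement value) && value.ValueKind == JsonValueKind.String)
        {
            return value.GetString() ?? fallback;
        }
        return fallback;
    }
}
```

Inside the `ActivityReceived` handler, you could pass `activityReceivedEventArgs.Activity` to `ActivityParsing.Describe` and branch on the result.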

Once you’ve published your app, consider adding a Custom Keyword to your app. Custom Keyword allows your product to be voice activated with a word or short phrase (for example, “Hey Cortana” is the keyword for the Cortana assistant).

 

Keywords generated using Custom Keyword can be easily integrated with your device or application via the Speech SDK. Note that audio only starts streaming to the cloud (for verification that the user said the keyword) after the keyword has been detected locally on the user’s device.

// Start listening for keyword (KeywordRecognitionModel lives in Microsoft.CognitiveServices.Speech)
var model = KeywordRecognitionModel.FromFile("YourKeywordModelFileName");
await connector.StartKeywordRecognitionAsync(model);

 

Finally, bring your app to life with natural-sounding voices using Text to Speech. You can either use one of our 100+ out-of-the-box voices, or create a custom voice for your brand.

 

In addition, we provide comprehensive support for your development workflow, including the ability to import/export your app and integrate with continuous deployment pipelines.

 

We’re excited to see what you’ll build with Custom Commands.

 

 

Get started today

 

Try Custom Commands and check out our demos at https://speech.microsoft.com/customcommands

 

Learn more with our documentation: Custom Commands documentation

Follow the Quickstart: Create a voice assistant using Custom Commands

Check out easy-to-deploy samples: Voice Assistants GitHub repository

See the video tutorial: Building Voice Assistants using Custom Commands

Updated Jul 08, 2020