How to build a voice-enabled grocery chatbot with Azure AI
Published Jan 25 2021 | Microsoft

Chatbots have become increasingly popular in providing useful and engaging experiences for customers and employees. Azure services allow you to quickly create bots, add intelligence to them using AI, and customize them for complex scenarios.

In this blog, we’ll walk through an exercise which you can complete in under two hours, to get started using Azure AI Services. This intelligent grocery bot app can help you manage your shopping list using voice commands. We’ll provide high level guidance and sample code to get you started, and we encourage you to play around with the code and get creative with your solution!

Features of the application:

[Image: iPhone view of the grocery list app]

  • Add or delete grocery items by dictating them to Alexa.
  • Easily access the grocery list through an app.
  • Check off items using voice commands; for example, “Alexa, remove Apples from my grocery list."
  • Ask Alexa to read the items you have in your grocery list.
  • Automatically organize items by category to help save time at the store.
  • Access the app from any laptop or through the web app, and sync changes across laptop and phone.

Prerequisites:

  • An Azure subscription with access to the Azure portal.
  • An Amazon developer account, for creating the Alexa skill.
  • Python 3 and the Bot Framework Emulator, for running and testing the bot locally.

Key components:

  • Azure Bot Service and the Microsoft Bot Framework
  • Azure Cognitive Services: Language Understanding (LUIS)
  • Azure Active Directory, for authentication
  • An Amazon Alexa skill, for voice input
  • A database to store the grocery list

Solution Architecture

[Image: Solution architecture diagram]

App Architecture Description: 

  • The user accesses the chatbot by invoking it as an Alexa skill.
  • User is authenticated with Azure Active Directory.
  • User interacts with the chatbot powered by Azure Bot Service; for example, user requests bot to add grocery items to a list.
  • Azure Cognitive Services process the natural language request to understand what the user wants to do. (Note: if you want to give your bot its own voice, you can choose from over 200 voices and 54 languages/locales. Try the demo to hear the different natural-sounding voices.)
  • The bot adds or removes content in the database, as sketched below.
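To make that last step concrete, here is a minimal sketch of persisting list changes, assuming Azure Cosmos DB as the store. The account URL, key, database/container names, and item shape are illustrative assumptions, not prescribed by the architecture:

import os

from azure.cosmos import CosmosClient

# Illustrative assumptions: account URL/key and database/container names.
client = CosmosClient(os.environ["COSMOS_URL"], credential=os.environ["COSMOS_KEY"])
container = client.get_database_client("grocerydb").get_container_client("items")

def add_item(user_id: str, name: str, category: str) -> None:
    # Upsert keeps the call idempotent if the same item is dictated twice.
    container.upsert_item({
        "id": f"{user_id}:{name.lower()}",
        "userId": user_id,  # assumed partition key (/userId)
        "name": name,
        "category": category,
        "purchased": False,
    })

def remove_item(user_id: str, name: str) -> None:
    container.delete_item(item=f"{user_id}:{name.lower()}", partition_key=user_id)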

Another visual of the flow of data within the solution architecture is shown below.

[Image: Data flow within the solution architecture]

Implementation

Below is a high-level overview of the steps involved in creating the app, along with sample code snippets for illustration.

We’ll start by creating an Azure Bot Service instance and adding speech capabilities to the bot using the Microsoft Bot Framework and the Alexa skill. Bot Framework, along with Azure Bot Service, provides the tools required to build, test, deploy, and manage the end-to-end bot development workflow. In this example, we are integrating Azure Bot Service with Alexa, which can process speech inputs for our voice-based chatbot. However, for chatbots deployed across multiple channels, and for more advanced scenarios, we recommend using Azure’s Speech service to enable voice-based scenarios. Try the demo to listen to the over 200 high-quality voices available across 54 languages and locales.
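As a taste of the Speech service, here is a minimal sketch of speech synthesis with the Speech SDK (azure-cognitiveservices-speech); the key, region, and voice name are placeholder assumptions:

import azure.cognitiveservices.speech as speechsdk

# Placeholder assumptions: your Speech resource key/region and a chosen neural voice.
speech_config = speechsdk.SpeechConfig(subscription="<your-speech-key>", region="<your-region>")
speech_config.speech_synthesis_voice_name = "en-US-JennyNeural"

# With no audio config supplied, output goes to the default speaker.
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)
result = synthesizer.speak_text_async("You have three items on your grocery list.").get()
if result.reason != speechsdk.ResultReason.SynthesizingAudioCompleted:
    print("Speech synthesis failed:", result.reason)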

  1. The first step is to log in to the Azure portal and follow the steps here to create an Azure Bot Service resource and a Web App Bot. To add voice capability to the bot, click on Channels to add Alexa (see the snapshot below) and note the Alexa Service Endpoint URI.

 [Image: Azure Bot Service channels]

  2. Next, we need to log in to the Alexa Developer Console and create an Amazon Alexa skill. After creating the skill, we are presented with the interaction model. Replace the contents of the JSON Editor with the example interaction model below.

{
  "interactionModel": {
    "languageModel": {
      "invocationName": "get grocery list",
      "intents": [
        {
          "name": "AMAZON.FallbackIntent",
          "samples": []
        },
        {
          "name": "AMAZON.CancelIntent",
          "samples": []
        },
        {
          "name": "AMAZON.HelpIntent",
          "samples": []
        },
        {
          "name": "AMAZON.StopIntent",
          "samples": []
        },
        {
          "name": "AMAZON.NavigateHomeIntent",
          "samples": []
        },
        {
          "name": "GetGroceryItemsIntent",
          "slots": [
            {
              "name": "name",
              "type": "AMAZON.US_FIRST_NAME"
            }
          ],
          "samples": [
            "Get grocery items in the list",
            "Do I have bread in my list"
          ]
        }
      ],
      "types": []
    }
  }
}

  3. Next, we’ll integrate the Alexa skill with our Azure bot. We need two pieces of information to do this: the Alexa Skill ID and the Alexa Service Endpoint URI. First, get the Skill ID either from the URL in the Alexa portal, or by going to the Alexa Developer Console and clicking “View Skill ID”; it should be a value like ‘amzn1.ask.skill.<GUID>’. Then, get the Alexa Service Endpoint URI from the Channels page of our Azure Web App Bot in the Azure portal by clicking on Alexa and copying the URI. Then integrate as shown:

 

  • Amazon Developer Console: after building the Alexa skill, click on Endpoint, paste the Alexa Service Endpoint URI copied from the Azure portal, and save the endpoints.
    [Image: Endpoint settings in the Amazon Developer Console]
  • Azure portal: go to the Channels page of the Azure bot, click on Alexa, and paste the Alexa Skill ID copied from the Alexa Developer Console.
    [Image: Alexa configuration settings in Azure Bot Service]

 

  4. Now, we’ll download the bot and run it locally for testing with the Bot Framework Emulator. Click on “Build” in the Azure Web App Bot to download the source code locally. Modify app.py as below:
    # Copyright (c) Microsoft Corporation. All rights reserved.
    # Licensed under the MIT License.
    
    from http import HTTPStatus
    
    from aiohttp import web
    from aiohttp.web import Request, Response, json_response
    from botbuilder.core import (
        BotFrameworkAdapterSettings,
        ConversationState,
        MemoryStorage,
        UserState,
    )
    from botbuilder.core.integration import aiohttp_error_middleware
    from botbuilder.schema import Activity
    
    from config import DefaultConfig
    from dialogs import MainDialog, groceryDialog
    from bots import DialogAndWelcomeBot
    
    # The LUIS recognizer used below; this module name is an assumption and
    # should match wherever IntelligentGrocery is defined in your project.
    from intelligent_grocery_recognizer import IntelligentGrocery
    
    from adapter_with_error_handler import AdapterWithErrorHandler
    
    CONFIG = DefaultConfig()
    
    # Create adapter settings.
    SETTINGS = BotFrameworkAdapterSettings(CONFIG.APP_ID, CONFIG.APP_PASSWORD)
    
    # Create MemoryStorage, UserState and ConversationState
    MEMORY = MemoryStorage()
    USER_STATE = UserState(MEMORY)
    CONVERSATION_STATE = ConversationState(MEMORY)
    
    # Create adapter.
    # See https://aka.ms/about-bot-adapter to learn more about how bots work.
    ADAPTER = AdapterWithErrorHandler(SETTINGS, CONVERSATION_STATE)
    
    # Create dialogs and Bot
    RECOGNIZER = IntelligentGrocery(CONFIG)
    grocery_DIALOG = groceryDialog()
    DIALOG = MainDialog(RECOGNIZER, grocery_DIALOG)
    BOT = DialogAndWelcomeBot(CONVERSATION_STATE, USER_STATE, DIALOG)
    
    # Listen for incoming requests on /api/messages.
    async def messages(req: Request) -> Response:
        # Main bot message handler.
        if "application/json" in req.headers["Content-Type"]:
            body = await req.json()
        else:
            return Response(status=HTTPStatus.UNSUPPORTED_MEDIA_TYPE)
    
        activity = Activity().deserialize(body)
        auth_header = req.headers["Authorization"] if "Authorization" in req.headers else ""
    
        response = await ADAPTER.process_activity(activity, auth_header, BOT.on_turn)
        if response:
            return json_response(data=response.body, status=response.status)
        return Response(status=HTTPStatus.OK)
    
    APP = web.Application(middlewares=[aiohttp_error_middleware])
    APP.router.add_post("/api/messages", messages)
    
    if __name__ == "__main__":
        try:
            web.run_app(APP, host="localhost", port=CONFIG.PORT)
        except Exception as error:
            raise error
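    app.py reads its port and credentials from config.py. If the downloaded source does not already include one, a minimal sketch looks like this (port 3978 and the environment variable names follow the Bot Framework Python samples):
    
    # config.py: minimal bot configuration, following the Bot Framework
    # Python samples; adjust values to match your deployment.
    import os
    
    class DefaultConfig:
        PORT = 3978  # default port used by the Bot Framework Python samples
        APP_ID = os.environ.get("MicrosoftAppId", "")
        APP_PASSWORD = os.environ.get("MicrosoftAppPassword", "")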
  5. Next, we’ll run and test the bot with the Bot Framework Emulator. From the terminal, navigate to the code folder and run pip install -r requirements.txt to install the packages the bot needs. Once the packages are installed, run python app.py to start the bot. The bot is ready to test, as shown below:
    [Image: Testing the bot in the Bot Framework Emulator]

    Open the Bot Framework Emulator and connect to the bot using the port number the app prints at startup; the endpoint URL is typically http://localhost:3978/api/messages.
    [Image: Bot Framework Emulator view]

 

  6. Now we’re ready to add natural language understanding so the bot can understand user intent. Here, we’ll use Azure’s Language Understanding Cognitive Service (LUIS) to map user input to an “intent” and extract “entities” from the sentence. In the illustration below, the sentence “add milk and eggs to the list” is sent as a text string to the LUIS endpoint, and LUIS returns the JSON seen on the right.
    [Image: Language Understanding utterances diagram]
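    For illustration, a LUIS v3 prediction response for that utterance has roughly the following shape (the scores and entity values here are made up):
    
    {
      "query": "add milk and eggs to the list",
      "prediction": {
        "topIntent": "Add",
        "intents": {
          "Add": { "score": 0.97 },
          "Show": { "score": 0.02 }
        },
        "entities": {
          "ListType": ["list"]
        }
      }
    }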

 

  7. Use the below template to create a LUIS JSON model file, in which intents and entities are specified manually. After creating the “IntelligentGrocery” app in the LUIS portal under “Import New App”, upload the JSON file with the intents and entities below. (The example entries shown belong in the model file’s “utterances” array.)

{
  "text": "access the groceries list",
  "intent": "Show",
  "entities": [
    {
      "entity": "ListType",
      "startPos": 11,
      "endPos": 19,
      "children": []
    }
  ]
},
{
  "text": "add bread to the grocery list",
  "intent": "Add",
  "entities": [
    {
      "entity": "ListType",
      "startPos": 17,
      "endPos": 23,
      "children": []
    }
  ]
}

The above sample intents are for adding items and accessing the items in the grocery list. Now it’s your turn: add additional intents to perform the tasks below, using the LUIS portal. Learn more about how to create intents here.

Intents

Name      Description
CheckOff  Mark the grocery items as purchased.
Confirm   Confirm the previous action.
Delete    Delete items from the grocery list.

 

Once the intents and entities are added, we will need to train and publish the model so the LUIS app can recognize utterances pertaining to these grocery list actions.
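Training and publishing can be done entirely in the portal; if you prefer to script it, here is a minimal sketch using the azure-cognitiveservices-language-luis authoring client (the endpoint, key, app ID, and version below are placeholders):

import time

from azure.cognitiveservices.language.luis.authoring import LUISAuthoringClient
from msrest.authentication import CognitiveServicesCredentials

# Placeholders: substitute your authoring endpoint, key, app ID, and version.
client = LUISAuthoringClient(
    "https://<your-authoring-resource>.cognitiveservices.azure.com/",
    CognitiveServicesCredentials("<your-authoring-key>"),
)
app_id = "<your-app-id>"
version = "0.1"

# Start training, then poll until every model in the version has finished.
client.train.train_version(app_id, version)
while any(m.details.status in ("Queued", "InProgress")
          for m in client.train.get_status(app_id, version)):
    time.sleep(2)

# Publish the trained version to the production slot.
client.apps.publish(app_id, version_id=version, is_staging=False)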

[Image: Language Understanding (LUIS) portal]

 

  8. After the model has been published in the LUIS portal, click ‘Access your endpoint URLs’ and copy the primary key, example query, and endpoint URL for the prediction resource.
    [Image: Language Understanding endpoint]

    [Image: Language Understanding (LUIS) prediction view]

Navigate to the Settings page in the LUIS portal to retrieve the App ID.
[Image: Application settings]

 

  9. Finally, test your Language Understanding model. The endpoint URL will be in the below format, with your own custom subdomain, and with your app ID and endpoint key in place of APP-ID and KEY_ID. Go to the end of the URL and enter a query; for example, “get me all the items from the grocery list”. The JSON result identifies the top-scoring intent and its confidence score, which is a good test of whether LUIS predicts the intent you expect.

https://YOUR-CUSTOM-SUBDOMAIN.api.cognitive.microsoft.com/luis/prediction/v3.0/apps/APP-ID/slots/pro...
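You can also exercise the published endpoint from a short script. Below is a minimal sketch using the requests package against the LUIS v3 prediction API (subdomain, app ID, and key are placeholders):

import requests

# Placeholders: substitute your prediction resource's subdomain, app ID, and key.
SUBDOMAIN = "<your-custom-subdomain>"
APP_ID = "<your-app-id>"
PREDICTION_KEY = "<your-prediction-key>"

url = (f"https://{SUBDOMAIN}.api.cognitive.microsoft.com"
       f"/luis/prediction/v3.0/apps/{APP_ID}/slots/production/predict")
params = {
    "subscription-key": PREDICTION_KEY,
    "query": "get me all the items from the grocery list",
    "show-all-intents": "true",
}

prediction = requests.get(url, params=params).json()["prediction"]
print(prediction["topIntent"], prediction["intents"][prediction["topIntent"]]["score"])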

 

Additional Ideas

We’ve now seen how to build a voice bot leveraging Azure services to automate a common task. We hope it gives you a good starting point towards building bots for other scenarios as well. Try out some of the ideas below to continue building upon your bot and exploring additional Azure AI services.

  • Add Google Home assistant as an additional channel to receive voice commands.
  • Add a PictureBot extension to your bot and add pictures of your grocery items. You will need to create intents that trigger actions the bot can take, and entities that those actions require. For example, an intent for the PictureBot may be “SearchPics”. This could trigger Azure Cognitive Search to look for photos, using a “facet” entity to know what to search for. See what other functionality you can come up with!
  • Use Azure QnA Maker to enable your bot to answer FAQs from a knowledge base. Add a bit of personality using the chit-chat feature.
  • Integrate Azure Personalizer with your voice chatbot to enable the bot to recommend a list of products to the user, providing a personalized experience.
  • Include the Azure Speech service to give your bot a custom, high-quality voice, with 200+ Text to Speech options across 54 locales/languages, as well as customizable Speech to Text capabilities to process voice inputs.
  • Try building this bot using Bot Framework Composer, a visual authoring canvas.