Train a simple Recommendation Engine using the new Azure AI Studio
Published Apr 30 2024 12:00 AM

Hi, everyone! I am Paschal Alaemezie, a Gold Microsoft Learn Student Ambassador. I am a student at the Federal University of Technology, Owerri (FUTO). I am interested in Artificial Intelligence, Software Engineering, and Emerging technologies, and how to apply the knowledge from these technologies in writing and building cool solutions to the challenges we face. Feel free to connect with me on LinkedIn and GitHub or follow me on X (Twitter).

 

In my previous article, I wrote about Recommendation Engines and gave a walkthrough on how to train a simple recommendation engine using the Azure Machine Learning Designer via the Azure portal. In this article, I will give a walkthrough on how to replicate this training using Azure Machine Learning Designer via the new Azure AI Studio.

 

The new Azure AI Studio is a comprehensive platform designed to facilitate the development, management, and deployment of AI applications. It offers a user-friendly interface with drag-and-drop capabilities for model creation, alongside advanced features for model management and scalability. The platform supports automated machine learning to optimize model selection and tuning. It is suitable for creating custom AI solutions, including chatbots and other AI-driven applications, with a focus on collaboration, efficiency, and responsible AI practices. Azure AI Studio is available in public preview, providing a glimpse into the future of AI development tools.

 

An Azure subscription is required to carry out the activities in this article. If you are a student, you can use your university or school email to sign up for a free Azure for Students account and start building on the Azure cloud with a free $100 Azure credit.

 

Activity 1: Create a New Training Pipeline

Step 1: Setting up your Azure AI Studio workspace

  1. Open your web browser and go to ai.azure.com to open the new Azure AI Studio.

palaemezie_0-1713956744085.png

 

  1. In Azure AI Studio, click Build to open the Build environment. Then click the + New project button to open the Create a project environment.

palaemezie_1-1713956744094.png

 

Step 2: Creating your project 

  1. For the Project details section:
  2.  At Hub name, key in your preferred name for your project’s hub in the input box provided.
  3. At Subscription, select your existing subscription from the drop-down menu.
  4. Select your Resource group. If you have any existing resource group, select it from the drop-down menu. Otherwise, click on Create new to create a new resource group, and click OK after that.
  5. At Location, select your location from the drop-down menu. Then, click the Next button at the bottom of the screen to go to the Review and finish section.

palaemezie_2-1713956744099.png

 

  1. In the Review and finish section, click the Create a project button at the bottom of the screen to provision your workspace on Azure AI Studio.

palaemezie_3-1713956744106.png

 

  1. Your provisioned workspace will display the window below. Go to All Azure AI at the upper right of the screen and select Azure Machine Learning Studio from the drop-down menu.

 

palaemezie_4-1713956744112.png

 

 

  1. In the Azure Machine Learning studio, select Designer from the navigation pane on the left-hand side. This will open the Designer environment where you can select a new pipeline if there is no existing pipeline.

palaemezie_5-1713956744137.png

 

  1. In the Designer environment, select the Classic prebuilt component. Then click on the Create a new pipeline using classic prebuilt components. This will open a visual pipeline authoring editor.

palaemezie_6-1713956744147.png

 

Step 3: Add Sample Datasets

  1. In the left navigation pane of the Authoring editor, click the Asset library and go to the Component section. Under Component, click on Sample data.

palaemezie_7-1713956744153.png

 

  1. Under Sample data, scroll down to the Movie Ratings and IMDB Movie Titles datasets. Drag and drop both datasets onto the canvas.

palaemezie_8-1713956744159.png

 

Step 4: Join the two datasets on Movie ID

  1. Close the Sample data drop-down menu. From the Data Transformation section in the left navigation, select the Join Data prebuilt module, and drag and drop the selected module onto the canvas.
    1. Connect the output of the Movie Ratings module to the first input of the Join Data module.
    2. Connect the output of the IMDB Movie Titles module to the second input of the Join Data module.

palaemezie_9-1713956744162.png

 

  1. Select the Join Data module. Click the navigation button at the upper right of the canvas to open the Join Data module window.

palaemezie_10-1713956744164.png

 

  1. Select the Edit column link to open the Join key columns for the left dataset editor. Select the MovieId column in the Enter column name field and click Save.

palaemezie_11-1713956744167.png

 

  1. Select the Edit column link to open the Join key columns for the right dataset editor. Select the Movie ID column in the Enter column name field and click Save. Then, close the Join Data window.

palaemezie_12-1713956744170.png
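If you are curious about what this join does to the data, here is a minimal sketch in plain Python. It assumes an inner join (only rows with a match on both sides are kept), and the rows are made-up values that only borrow the column names from the two sample datasets. The `inner_join` helper is hypothetical and is not part of the Designer.

```python
# Toy rows: column names follow the article, the values are invented.
ratings = [
    {"UserId": 1, "MovieId": 10, "Rating": 8},
    {"UserId": 2, "MovieId": 10, "Rating": 6},
    {"UserId": 1, "MovieId": 99, "Rating": 7},  # no matching title
]
titles = [
    {"Movie ID": 10, "Movie Name": "Example Movie A"},
    {"Movie ID": 11, "Movie Name": "Example Movie B"},
]


def inner_join(left, right, left_key, right_key):
    """Return one merged row per (left, right) pair whose key values match."""
    # Index the right-hand rows by key so each lookup is a dictionary access.
    index = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)

    joined = []
    for row in left:
        for match in index.get(row[left_key], []):
            joined.append({**row, **match})
    return joined


# Left key is MovieId (Movie Ratings), right key is Movie ID (IMDB Movie Titles).
joined_rows = inner_join(ratings, titles, "MovieId", "Movie ID")
for row in joined_rows:
    print(row)
```

The rating for movie 99 drops out because no title matches it, which is the behaviour you would expect from an inner join.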

 

Step 5: Select Columns UserId, Movie Name, and Rating using a Python script

  1. From the Python Language section in the left navigation, select the Execute Python Script prebuilt module. Drag and drop the selected module onto the canvas. Then, connect the Join Data output to the input of the Execute Python Script module.

palaemezie_13-1713956744174.png

 

  1. Select Edit code to open the Python script editor, clear the existing code, and then enter the following lines of code to select the UserId, Movie Name, and Rating columns from the joined dataset. Indent only the second and third lines of your code, since Python uses indentation to mark the body of a function.

palaemezie_14-1713956744180.png
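Since the script itself appears only in the screenshot, here is a minimal sketch of what such a script could look like. The Execute Python Script module calls an entry-point function named `azureml_main` and passes the connected dataset in as a pandas DataFrame. The function body is my assumption based on the step description, and the sample rows at the bottom are invented for a local check.

```python
import pandas as pd


def azureml_main(dataframe1=None, dataframe2=None):
    # Keep only the three columns the recommender needs.
    df = dataframe1[["UserId", "Movie Name", "Rating"]]
    # The trailing comma returns a one-element tuple of DataFrames.
    return df,


# Local check with made-up rows; in the Designer, only the function is needed.
sample = pd.DataFrame(
    {
        "UserId": [1, 2],
        "MovieId": [10, 10],
        "Rating": [8, 6],
        "Movie ID": [10, 10],
        "Movie Name": ["Example Movie A", "Example Movie A"],
    }
)
(selected,) = azureml_main(sample)
print(list(selected.columns))
```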

 

Step 6: Remove duplicate rows with the same Movie Name and UserId

  1.  From the Data Transformation section in the left navigation pane, select the Remove Duplicate Rows prebuilt module from the drop-down menu, and drag and drop the selected module onto the canvas.
    1. Connect the first output of the Execute Python Script to the input of the Remove Duplicate Rows module.

palaemezie_15-1713956744182.png

 

    1. Click the navigation button at the upper right of the canvas to open the Remove Duplicate Rows module window. Then, select the Edit column link to open the Select column editor.

palaemezie_16-1713956744185.png

 

    1. Enter the following list of columns to be included in the output dataset: Movie Name, UserId. Then, click Save.

palaemezie_17-1713956744186.png
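To make the effect of this step concrete, here is a small sketch in plain Python. It assumes that the first row seen for each Movie Name and UserId combination is the one that is kept. The `remove_duplicate_rows` helper is hypothetical and the rows are made up.

```python
def remove_duplicate_rows(rows, key_columns):
    """Keep the first row seen for each combination of key column values."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


rows = [
    {"UserId": 1, "Movie Name": "Example Movie A", "Rating": 8},
    {"UserId": 1, "Movie Name": "Example Movie A", "Rating": 5},  # same user and movie
    {"UserId": 2, "Movie Name": "Example Movie A", "Rating": 6},
]
deduplicated = remove_duplicate_rows(rows, ["Movie Name", "UserId"])
print(len(deduplicated))
```

Only the key columns decide what counts as a duplicate, so the second row is dropped even though its Rating differs.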

 

Step 7: Split the dataset into a training set (0.5) and a test set (0.5)

  1. From the Data Transformation section in the left navigation, select the Split Data prebuilt module and drag and drop the selected module onto the canvas. Then, connect the output of the Remove Duplicate Rows module to the input of the Split Data module.

 

  1. Click the navigation button at the upper right of the canvas to open the Split Data module window. Ensure that the Fraction of rows in the first output dataset is set to 0.5.

 

palaemezie_18-1713956744188.png
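As a rough illustration of a 0.5 split, here is a sketch in plain Python. It assumes the rows are shuffled before being cut. The `split_rows` helper and its `seed` parameter are hypothetical names used only for this example.

```python
import random


def split_rows(rows, fraction=0.5, seed=0):
    """Shuffle a copy of the rows, then cut it at the given fraction."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * fraction)
    # First output: training set. Second output: test set.
    return shuffled[:cut], shuffled[cut:]


rows = list(range(10))  # stand-in for ten dataset rows
train_rows, test_rows = split_rows(rows, fraction=0.5)
print(len(train_rows), len(test_rows))
```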

 

Step 8: Initialize Recommendation Module

  1.  From the Recommendation section in the left navigation pane, select the Train SVD Recommender prebuilt module and drag and drop the selected module onto the canvas. Then, connect the first output of the Split Data module to the input of the Train SVD Recommender module.

palaemezie_19-1713956744190.png

 

    1. Click the navigation button at the upper right of the canvas to open the Train SVD Recommender module window. Set Number of factors to 200. This option specifies the number of factors to use with the recommender.
    2. Set Number of recommendation algorithm iterations to 30. This number indicates how many times the algorithm should process the input data. The default value is 30.
    3. Set Learning rate to 0.001. The learning rate defines the step size for learning.

palaemezie_20-1713956744192.png
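To give a feel for what the three settings control, here is a simplified sketch of matrix factorization trained with stochastic gradient descent in plain Python. It is not the module's actual implementation: it leaves out bias terms and regularization, and the function names and toy ratings are my own. The defaults mirror the values above, while the demo uses smaller numbers so that it runs quickly.

```python
import random


def train_svd(ratings, n_factors=200, n_iters=30, learning_rate=0.001, seed=0):
    """Fit user and item factor vectors to (user, item, rating) triples with SGD."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    # Small random starting vectors, one per user and one per item.
    user_f = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    item_f = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}

    for _ in range(n_iters):  # one pass over the data per iteration
        for u, i, r in ratings:
            pu, qi = user_f[u], item_f[i]
            err = r - sum(a * b for a, b in zip(pu, qi))
            for f in range(n_factors):
                pu_f, qi_f = pu[f], qi[f]
                # The learning rate scales each step along the error gradient.
                pu[f] = pu_f + learning_rate * err * qi_f
                qi[f] = qi_f + learning_rate * err * pu_f
    return user_f, item_f


def predict(user_f, item_f, user, item):
    """Predicted rating: dot product of the user and item factor vectors."""
    return sum(a * b for a, b in zip(user_f[user], item_f[item]))


def squared_error(user_f, item_f, ratings):
    return sum((r - predict(user_f, item_f, u, i)) ** 2 for u, i, r in ratings)


toy = [(1, "A", 4), (1, "B", 3), (2, "A", 5), (2, "C", 4), (3, "B", 2)]
untrained = train_svd(toy, n_factors=4, n_iters=0)
trained = train_svd(toy, n_factors=4, n_iters=200, learning_rate=0.01)
error_before = squared_error(*untrained, toy)
error_after = squared_error(*trained, toy)
print(error_before, error_after)
```

More factors give each user and item a longer vector, more iterations mean more passes over the ratings, and the learning rate sets how far each update moves.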

 

Step 9: Select Columns UserId, Movie Name from the test set

  1. From the Data Transformation section in the left navigation pane, select the Select Columns in Dataset prebuilt module and drag and drop the selected module onto the canvas. Then, connect the second output of the Split Data module to the input of the Select Columns in Dataset module.

palaemezie_21-1713956744195.png

 

  1. Click the navigation button at the upper right of the canvas to open the Select Columns in Dataset module window. Select the Edit column link to open the Select columns editor.

palaemezie_22-1713956744196.png

 

  1. Enter the following list of columns to be included in the output dataset: UserId, Movie Name. Then, click Save.

palaemezie_23-1713956744198.png

 

Step 10: Configure the Score SVD Recommender

  1. From the Recommendation section in the left navigation pane, select the Score SVD Recommender prebuilt module and drag and drop the selected module onto the canvas.
    1. Connect the output of the Train SVD Recommender module to the first input of the Score SVD Recommender module, which is the Trained SVD recommendation input.
    2. Connect the output of the Select Columns in Dataset module to the second input of the Score SVD Recommender module, which is the Dataset to score input.

palaemezie_24-1713956744202.png

 

    1. Open the Score SVD Recommender module on the canvas by clicking on the navigation button at the upper right of the canvas. Set the Recommender prediction kind to Rating Prediction. For this option, no other parameters are required.

palaemezie_25-1713956744203.png

 

Step 11: Set up the Evaluate Recommender Module

  1. From the Recommendation section in the left navigation pane, select the Evaluate Recommender prebuilt module and drag and drop the selected module onto the canvas.
    1. Connect the output of the Score SVD Recommender module to the second input of the Evaluate Recommender module, which is the Scored dataset input.
    2. Connect the second output of the Split Data module (test set) to the first input of the Evaluate Recommender module, which is the Test dataset input.

palaemezie_26-1713956744207.png

 

Activity 2: Submit Training Pipeline

  1. In the Authoring editor, ensure that you have AutoSave enabled. Then click on Configure & Submit at the upper right-hand side of your screen.

 

 

  1. For the Set up pipeline job window: In the Basics section, click the Create new button under the Experiment name. Type your new experiment name and click the Next button at the bottom of the screen.

palaemezie_28-1713956744212.png

 

  1. In the Inputs & outputs section, click the Next button at the bottom of the screen.

palaemezie_29-1713956744214.png

 

  1. In the Runtime settings section, skip the Default compute. Go to Select compute type and select Compute instance from the drop-down menu. Under Select Azure ML compute instance, click on Create Azure ML compute instance. The Create compute instance window will open in another environment.

 

palaemezie_30-1713956744220.png

 

 

  1. In the Create compute instance window, type in your compute name under the Compute name tab. Then, select the CPU button under the Virtual machine type.

palaemezie_31-1713956744225.png

 

  1. While authoring this article, I had to select my virtual machine first to enable the Compute name tab. You may or may not encounter this issue. I selected the Standard_D2_v2 virtual machine for this training. After that, click the Review + Create button at the end of the screen to go back to the Runtime settings window.

palaemezie_32-1713956744229.png

  1. Back in the Runtime settings window, at Select Azure ML compute instance, select the compute instance that you have created. Here, I selected the movie instance from the drop-down menu. Note that your newly created compute instance will take some time to be provisioned and appear in your drop-down menu. Go to the Advanced settings and ensure that the Continue on step failure box is checked. Then, click the Review + Submit button at the end of the screen.

palaemezie_33-1713956744233.png

  1. At the Review + Submit section, ensure that your provided details are correct. Then, click the Submit button at the end of the screen.

palaemezie_34-1713956744237.png

 

 

Activity 3: Visualize Scoring Results

Step 1: Once your pipeline has been submitted and your model training has completed, go to Jobs under Assets in the left navigation pane and click on the name of your completed pipeline.

palaemezie_35-1713956744253.png

 

Step 2: Visualize the Scored dataset

  1. Go to the Score SVD Recommender module on the canvas and right-click on it. Select Preview data and click on Scored dataset.

palaemezie_36-1713956744279.png

 

  1. Observe the predicted values under the column Rating.

palaemezie_37-1713956744296.png

 

Step 3: Visualize the Evaluation Results

  1. Go to the Evaluate Recommender module on the canvas and right-click on it. Select Preview data and click on Metric.

palaemezie_38-1713956744313.png

 

  1. Evaluate the model performance by reviewing the various evaluation metrics, such as Mean Absolute Error, Root Mean Squared Error, etc.

palaemezie_39-1713956744323.png
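If you would like to see how the two metrics named above are calculated, here is a short sketch in plain Python. The actual and predicted ratings are made-up numbers, not output from this pipeline.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average size of the prediction errors, ignoring their sign."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared error; large misses weigh more."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


actual = [8, 6, 7, 9]      # invented true ratings
predicted = [7, 6, 9, 9]   # invented predicted ratings
mae = mean_absolute_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
print(mae, rmse)
```

For both metrics, lower values mean the predicted ratings are closer to the actual ratings.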

 

 Next step

Congratulations on making it this far. Stay tuned for my next blog on the amazing solutions you can build using the Azure AI Studio.

 

For enthusiasts and professionals alike, you can leverage these resources to stay informed and inspired as you embark on your AI journey:

Last update: Apr 24 2024 10:58 AM