Predicting Stock Price using Azure Automated Machine Learning (AutoML) in a few clicks
Published Sep 20 2022 05:53 AM
Microsoft

In a typical machine learning process, we first go through data featurization activities, including handling missing and imbalanced data. We then select an algorithm appropriate to the business problem we are trying to solve with the machine learning model, followed by a hyperparameter tuning exercise to choose an optimal set of parameters for the model. The process is complex and time consuming, and it often requires data science and data analytics expertise even before we can decide whether the exercise is a reasonable business project to take on. This is where Automated Machine Learning can help in a big way.

 

Automated machine learning, also referred to as automated ML or AutoML, is the process of automating the time-consuming, iterative tasks of machine learning model development. It allows data scientists, analysts, and developers to build ML models with high scale, efficiency, and productivity, all while sustaining model quality.

 

automatedml.png

 

In this walkthrough, we will explore how easy it is to take historical stock price data and predict the stock price with Azure Automated Machine Learning (AutoML), following a low-code/no-code approach that takes only a few clicks and does not require much data science knowledge.

 

Step 1: Create Data Asset

To simplify the discussion, we will assume the stock data from Yahoo Finance is already downloaded to our local machine. The details of data ingestion are discussed in LearnAI/data-generator.md at main · msdpalam/LearnAI (github.com).

We will also assume that an Azure ML workspace has already been created.
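
Before uploading, it can help to confirm that the downloaded file has the columns the rest of the walkthrough relies on. The following is a minimal sketch and not part of the original workflow. It assumes the column names that appear in the scoring sample in Step 5 (Datetime, Open, High, Low, Adj Close, Volume) plus a Close column as the prediction target; the helper name missing_columns is hypothetical, and an in-memory sample stands in for the real file.

```python
import csv
import io

# Assumed column names: taken from the scoring sample in Step 5,
# plus "Close" as the (assumed) target column.
EXPECTED_COLUMNS = ["Datetime", "Open", "High", "Low", "Close", "Adj Close", "Volume"]


def missing_columns(csv_file, expected=EXPECTED_COLUMNS):
    """Return the expected column names that are absent from the CSV header."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [name for name in expected if name not in present]


# In-memory stand-in for the downloaded file; the values are illustrative only.
sample = io.StringIO(
    "Datetime,Open,High,Low,Close,Adj Close,Volume\n"
    "2022-08-01 09:30:00-04:00,277.82,277.94,277.07,277.68,277.68,731044\n"
)
print(missing_columns(sample))  # an empty list means every expected column is present
```

For a real file, pass open("stock_data.csv", newline="") in place of the in-memory sample.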

  1. Sign in to Azure Machine Learning studio.
  2. Select your subscription and workspace.
  3. From the Assets section, select Data

create data asset.png

  4. In the dialog box that appears, provide a name for your dataset and keep the Dataset type as Tabular, since our stock data is in tabular format.

basicinfo.png

 

  5. Click Browse in the dialog box that appears and select Browse files.

browsefiles.png

  6. Navigate to the folder where you downloaded the stock data and select the CSV file:

 

stockdata.png

  7. Here is how it looks after you select the data file:

browsefiles.png

Note: The file will be uploaded to the selected data store.

Click Next

  8. The dialog box that appears shows the settings and a preview of the data asset.

settings and preview.png

Click Next

  9. The next screen shows the schema for the data asset.

schema.png

 

  10. On the next screen, confirm the details of the data asset.

confirmdetails.png

Click Create.

 

Please note the data asset name and its location (the blob store where it was uploaded).

createddataassetdetails.png

If you click on workspaceblobstore, you can find the stock_data.csv file in the following location of your default blob data store:

datastorelocation.png       

 

Step 2: Create New Automated ML Job

Now that we have the data asset, let us create an Automated ML job in Azure ML Studio.

  1. Sign in to Azure Machine Learning studio.
  2. Select your subscription and the workspace you created.
  3. In the left pane, select Automated ML under the Author section.
  4. Select +New automated ML job.

createautomljob1.png

 

  5. In the dialog box that appears, select the data asset we created in Step 1, as shown below:

selectdataasset.png

If you click on the data asset link, it will show you a preview.

Click Next

  6. In the Configure Job dialog box:
    1. Note that the data asset is already selected.
    2. Create a new experiment or select an existing experiment name for the job.
    3. Select the Target Column from the drop-down list of columns.
    4. Select the Compute Type for the job (the default is compute cluster).
    5. Select an existing Compute Cluster (if you do not already have one, create a cluster by following the article Create compute clusters - Azure Machine Learning | Microsoft Learn).

configurejobautoml.png

Click Next

  7. In the Select task and settings dialog:
    a. Select Regression, since predicting a stock price means predicting a continuous numeric value.

algorithmselection.png

 

    b. Under View additional configuration settings (click the link to open the dialog), select the Primary metric (the default is Normalized root mean squared error), and make sure the explain model check box is selected. If you want to explore all the models, select the Use all supported models check box; you can then use the drop-down menu to block the models you do not want to try.

 

additionalsettings.png

 

    c. You can also click the View featurization settings link to explore the current featurization settings and adjust them as needed.

 

featurization.png

Click Next

    d. For the Hyperparameter option, keep the default.

 

hyperparameter.png

Click Finish

  8. A new AutoML job is created in the Not Started state and soon moves to the Running state.

MeerAlam_1-1663517069450.png

  9. Since we configured the Training job time to 0.5 hours, the job should finish within the next half hour.
  10. Once the job completes, we see that it did indeed finish in about 27 minutes.

jobcompletes.png
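
The default primary metric named above, normalized root mean squared error, is the root mean squared error divided by the range of the observed target values, which makes scores comparable across targets of different scales. A minimal sketch of that calculation follows. The function name and the sample prices are made up for illustration, and AutoML computes the metric for you during training.

```python
import math


def normalized_rmse(actual, predicted):
    """RMSE divided by the range (max - min) of the actual values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and the same length")
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    value_range = max(actual) - min(actual)
    if value_range == 0:
        raise ValueError("actual values have zero range; the metric is undefined")
    return math.sqrt(mse) / value_range


# Illustrative closing prices, not taken from the walkthrough's data.
actual = [277.0, 278.0, 279.0, 281.0]
predicted = [277.5, 277.5, 279.5, 280.5]
print(normalized_rmse(actual, predicted))  # 0.125 (RMSE 0.5 over a range of 4.0)
```

Lower values are better: a score of 0.125 here means the typical error is about one eighth of the spread of the actual prices.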

 

Step 3: Explore the best model

Upon completion of the AutoML job, we can explore the models trained and look for the best model to deploy as an endpoint to provide that model’s inference capabilities. Let us review the models.

  1. Click on the Models tab from the completed job dialog

exploremodel.png

 

  2. We can see that the VotingEnsemble algorithm produced the best model, based on the normalized root mean squared error metric. It is important to note that AutoML automatically tried a range of algorithms, based on the configuration, within a very short period of time (about 27 minutes) and without us writing a single line of code.

bestmodel.png
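
For regression, a voting ensemble combines the predictions of several trained models by taking their weighted average. The sketch below illustrates the idea only. The model names, predictions, and weights are hypothetical, and AutoML chooses the ensemble members and their weights itself.

```python
def voting_ensemble_predict(predictions, weights):
    """Weighted average of per-model predictions for a single data point."""
    if set(predictions) != set(weights):
        raise ValueError("predictions and weights must cover the same models")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(predictions[name] * weights[name] for name in predictions)
    return weighted_sum / total_weight


# Hypothetical base models and weights, for illustration only.
predictions = {"model_a": 277.0, "model_b": 278.0, "model_c": 280.0}
weights = {"model_a": 0.5, "model_b": 0.25, "model_c": 0.25}
print(voting_ensemble_predict(predictions, weights))  # 278.0
```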

 

Step 4: Deploy the best model for inferencing to an endpoint.

An endpoint is an HTTPS endpoint that clients can call to receive the inferencing (scoring) output of a trained model. A single endpoint can contain multiple deployments, and a deployment is the set of resources required for hosting the model that does the actual inferencing.

  1. Click on the VotingEnsemble Algorithm link from the Models dialog, to investigate the details of the model

bestmodeloverview.png

votingensemblealgo.png

While we could explore the model's Metrics, Experiment details, training Duration, Inputs, and so on, to complete our exercise we will deploy the model and score a new set of data against it.

  2. Click on Deploy and select Deploy to a web service. Another deployment option is described in How to deploy an AutoML model to an online endpoint. For details on online endpoints, refer to What are Azure Machine Learning endpoints?
 

deploymodel1.png

 

  3. We will choose Container Instance as the web service deployment option for this example; Azure Kubernetes Service is another deployment option that is often used.

deploymodel2.png

MeerAlam_9-1663517575791.png

  4. Click Deploy to trigger the model deployment.

 

modeldeploymentstarted.png

 

  5. Once the deployment completes, the deployment state will show Healthy, and both a REST endpoint and a Swagger URI will be created. The Swagger URI shows how we can interact with the endpoint; we will skip exploring it in this walkthrough.

 

deploymentsuccessful.png

Step 5: Scoring (inferencing) against the deployed model.

Scoring (or inferencing) is the process of feeding live data points to a machine learning model to calculate an output. Now that the ML model is deployed, there are multiple ways to get predictions from it. The simplest option is to use the Test feature for the endpoint in the ML Studio UI. Simply click on the Test tab and provide the test data to see the results (the predicted close of the MSFT stock).

 

inferencing.png

Please note that when you first click on the Test tab, it will only show the sample data format, as below:

 

{
  "Inputs": {
    "data": [
      {
        "Datetime": "2000-01-01T00:00:00.000Z",
        "Open": 0.0,
        "High": 0.0,
        "Low": 0.0,
        "Adj Close": 0.0,
        "Volume": 0
      }
    ]
  },
  "GlobalParameters": 0.0
}

 

In this test, we provided one data point for the MSFT stock price, shown below, taken from the first row of our stock_data.csv file. We could also provide multiple data points by adding more entries to the data array in the same JSON format.

 

{
  "Inputs": {
    "data": [
      {
        "Datetime": "2022-08-01 09:30:00-04:00",
        "Open": 277.82000732421875,
        "High": 277.9399108886719,
        "Low": 277.07000732421875,
        "Adj Close": 277.67999267578125,
        "Volume": 731044
      }
    ]
  },
  "GlobalParameters": 0.0
}
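
The same request can also be sent from code instead of the Test tab. The sketch below builds the JSON body in the format shown above and posts it using only Python's standard library. The scoring URI is a placeholder to be replaced with the REST endpoint shown on the deployment page, the helper names build_payload and score are hypothetical, and the optional key applies only if authentication was enabled for the web service.

```python
import json
import urllib.request


def build_payload(rows):
    """Wrap one or more data points in the request format the Test tab displays."""
    return {"Inputs": {"data": list(rows)}, "GlobalParameters": 0.0}


def score(scoring_uri, payload, api_key=None):
    """POST the payload to the endpoint and return the decoded JSON response."""
    headers = {"Content-Type": "application/json"}
    if api_key:  # only needed when key-based authentication is enabled
        headers["Authorization"] = f"Bearer {api_key}"
    request = urllib.request.Request(
        scoring_uri,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# The data point used in the walkthrough's test.
row = {
    "Datetime": "2022-08-01 09:30:00-04:00",
    "Open": 277.82000732421875,
    "High": 277.9399108886719,
    "Low": 277.07000732421875,
    "Adj Close": 277.67999267578125,
    "Volume": 731044,
}
payload = build_payload([row])
print(json.dumps(payload, indent=2))

# Placeholder URI (an assumption): replace it with your endpoint, then uncomment.
# result = score("https://<your-endpoint>/score", payload)
```

To score several data points in one call, pass a longer list of rows to build_payload.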

 

 

As we have seen, AutoML makes it very easy to take a prepared dataset, train a range of candidate algorithms, and create and deploy a machine learning model that predicts the stock price, without writing a single line of code.

 

In part 2, we will walk through the stock price prediction exercise using Azure Machine Learning designer, another low-code/no-code approach to performing machine learning tasks with Azure Machine Learning.

Last update:
‎Sep 20 2022 05:48 AM