Guest post by Ami Zou, Microsoft Student Partner at University College London studying Computer Science, Mathematics and Economics.
In my spare time, I love learning new technologies and going to hackathons. Our hackathon project Pantrylogs, which uses Artificial Intelligence, was selected as one of the 10 Microsoft Imagine Cup UK finalists. I’m interested in learning more about AI, Data Science, and Machine Learning to improve the performance of our application.
In this article, I would love to share my experience of using Azure Machine Learning Studio with you. Follow the steps, and within half an hour, you will have a working Machine Learning experiment :D
Azure Machine Learning Studio is a very powerful browser-based, visual drag-and-drop authoring environment.
I love using it because it is very simple. We don’t have to write any code; we just drag and drop modules to build our ideas. There are many different modules that cover all your needs for machine learning, and there are also Python, R, and other programming language modules where you can put customised code to make the algorithm work the way you want.
As students, we get FREE Azure membership. Yes, free! It costs us nothing to start a Machine Learning experiment: we can use up to 100 modules per experiment and get $100 of free credit for any Azure product (see http://aka.ms/azure4students).
Simply register with Azure and get started with Machine Learning :D.
Let’s build a simple ML experiment based on car data together to see how Azure ML Studio works.
There are two parts to the experiment: first, we will create a training experiment to analyse the car data and train the model; second, we will publish it as a predictive experiment and use Linear Regression to predict the price of a car based on features such as make, number of doors and bhp.
Here is a snapshot of our final predictive experiment:
You can see we predict the price of an Audi to be £20,000 based on loads of car data against the real price £23,000. We know the model is accurate because Audi is overpriced :)
Ready? Let’s have a closer look:
Before starting the lab, please download the car data file Car prices.csv from GitHub: https://github.com/martinkearn/AI-Services-Workshop/blob/master/MachineLearning/Car%20prices.cs...
1.1 - Create an experiment and load data
Firstly, we need to create a new blank experiment and upload our car data:
This should be what it looks like: a blank experiment named ‘Car Price Prediction’ with Car prices.csv in My Datasets.
1.2 - Add data set
As the starting point in our experiment, we need to add the data.
No code is needed: ML Studio uses a drag-and-drop authoring environment. Drag modules from the left-side navigation and drop them onto the canvas. ‘Stitch’ modules together by connecting their input/output ports (the small circles on the top and bottom of the modules); ML Studio will automatically draw a line between them.
Now in our experiment,
(Step 1 and 2)
When you finish, the visualisation should look like this:
1.3 - Clean Data by Removing Rows
Raw data often contains unnecessary parts and missing values, so we need to clean it to turn it into uniform, ‘prepared’ data for our machine learning experiment.
We will use the ‘Clean Missing Data’ module to remove rows with missing values and produce a clean dataset:
(Step 2)
(Step 3) (Step 4)
(Step 4)
(Step 5)
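For readers who like to see the idea in code, here is a minimal Python sketch of what removing rows with missing values means. It is not what ML Studio runs internally; the sample rows and the helper name `remove_rows_with_missing_values` are made up for illustration, and the real Car prices.csv has more columns.

```python
import csv
import io

# Hypothetical sample rows (assumption): the real Car prices.csv has more
# columns and rows than this.
SAMPLE_CSV = """make,fuel,doors,bhp,price
audi,diesel,four,150,23000
bmw,petrol,,180,30000
ford,petrol,two,,
vw,diesel,four,110,18000
"""


def remove_rows_with_missing_values(rows):
    """Keep only the rows in which every field has a non-empty value."""
    return [row for row in rows if all(value.strip() for value in row.values())]


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
clean_rows = remove_rows_with_missing_values(rows)
print(len(rows), "rows before cleaning,", len(clean_rows), "rows after")
```

In this sample, the bmw row (no doors value) and the ford row (no bhp or price) are dropped, which mirrors the idea of removing the entire row in the module.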
1.4 - Split Data
The way machine learning works is that we use some actual data to train the algorithm, and then test the algorithm by comparing its output (in our case, the predicted car price) with the actual data (in our case, the actual car price).
Therefore we have to reserve some actual data for testing. Here let’s make it 75% for training and 25% for testing but you can surely modify that:
(Step 2)
(Step 3) (Step 4)
Now the left output port of the Split Data module represents a random 75% of the data and the right output port represents a random 25%.
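The 75/25 split can be sketched in a few lines of Python. This is only an illustration of the idea, not the Split Data module itself; the function name `split_rows`, the fixed seed and the stand-in rows are assumptions.

```python
import random


def split_rows(rows, train_fraction=0.75, seed=42):
    """Shuffle a copy of the rows, then cut it into a training and a test part."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Stand-in for 200 rows of car data (assumption: any list of rows works).
rows = list(range(200))
train_rows, test_rows = split_rows(rows)
print(len(train_rows), "training rows,", len(test_rows), "test rows")
```

Changing `train_fraction` plays the same role as changing the fraction of rows sent to the first output of the module.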
1.5 - Add Linear Regression
There are many machine learning algorithms, such as Linear Regression, Classification and Regression Trees, Naive Bayes and K-Nearest Neighbors (see ‘Top 10 Machine Learning Algorithms’ in the resources at the end). For our task of predicting a single numeric value, the price, Linear Regression is a suitable algorithm. We just need to add the ‘Linear Regression’ module to the experiment:
Here is what it should look like:
1.6 - Train the model on Price
Now comes the most important part: using Linear Regression to train the model on the price field. The algorithm learns which factors in the data affect the price, and then uses those factors to predict the price. The output, the predicted price, is called a ‘Scored Label’.
(Step 2)
(Step 3)
(Step 5)
Now we're using the Linear Regression algorithm to train on price using 75% of the data set, reserving the remaining 25% for testing the predictions:
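To get a feel for what training on price means, here is a simplified sketch of linear regression with a single numeric feature (bhp), fitted by ordinary least squares. The experiment in ML Studio uses all the columns, so this is a deliberately reduced illustration; the data points and function names are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Return the predicted label (here: price) for one feature value."""
    return slope * x + intercept


# Hypothetical training data (assumption): bhp values and matching prices.
bhp = [100, 120, 150, 200]
price = [15000, 17000, 20000, 25000]

slope, intercept = fit_line(bhp, price)
print("predicted price for 170 bhp:", predict(170, slope, intercept))
```

The fitted slope and intercept play the role of the trained model: once they are learned from the training rows, predicting a price is just a multiplication and an addition.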
1.7 - Score the Model
Finally, let’s test the performance of our model by comparing it against the remaining 25% of data to see how accurate the price prediction is.
(Step 2 and 3)
(Step 5)
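One way to put a number on how accurate the predictions are is to compare each actual price with its scored label and summarise the errors. The sketch below computes two common summaries, mean absolute error and root mean squared error; the prices are hypothetical and the function names are my own, not something ML Studio exposes.

```python
import math


def mean_absolute_error(actual, scored):
    """Average size of the prediction errors, in the same unit as the price."""
    return sum(abs(a - s) for a, s in zip(actual, scored)) / len(actual)


def root_mean_squared_error(actual, scored):
    """Like the mean absolute error, but large errors are penalised more."""
    return math.sqrt(sum((a - s) ** 2 for a, s in zip(actual, scored)) / len(actual))


# Hypothetical test rows (assumption): actual prices and their scored labels.
actual_prices = [23000, 18000, 30000]
scored_labels = [20000, 19000, 29000]

mae = mean_absolute_error(actual_prices, scored_labels)
rmse = root_mean_squared_error(actual_prices, scored_labels)
print("MAE:", round(mae, 2), "RMSE:", round(rmse, 2))
```

The smaller these numbers are on the reserved 25% of the data, the closer the predicted prices are to the real ones.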
Yay! Now we have a functional training experiment! Let’s jump to the second part -- converting the training experiment to a predictive experiment and using some new data to test the API :D
Let’s convert our training experiment to a ‘predictive experiment’ so we can use it to score new data:
(Step 1)
(Step 2)
(Step 3 and 4)
Here is what it looks like when it completes: the experiment is now deployed and there is a screen containing the endpoint, key and some test interfaces.
2.2 - Test the Web Service
Now it is time to use our deployed predictive experiment to test some new car data, get new predicted prices, and see how good our model is!
(Step 2: Click the ‘Test’ hyperlink, not the blue ‘Test’ button)
○ make = audi
○ fuel = diesel
○ doors = four
○ body = hatchback
○ drive = fwd
○ weight = 1900
○ engine-size = 150
○ bhp = 150
○ mpg = 55
○ price = 23000
(Step 3)
(Step 4 and 5)
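The same test can also be sent from code, using the endpoint and key shown on the web service screen. The sketch below only builds the HTTP request and does not send it. The URL and key are placeholders, and the JSON shape (`Inputs`, `input1`, `ColumnNames`, `Values`, `GlobalParameters`) is an assumption: check the API help page of your own web service for the exact format it expects.

```python
import json
import urllib.request

# Placeholders (assumption): copy the real values from your web service screen.
ENDPOINT_URL = "https://example.invalid/your-request-response-endpoint"
API_KEY = "YOUR_API_KEY"

# The same test values as entered in the test form above.
CAR = {
    "make": "audi", "fuel": "diesel", "doors": "four", "body": "hatchback",
    "drive": "fwd", "weight": "1900", "engine-size": "150", "bhp": "150",
    "mpg": "55", "price": "23000",
}


def build_request(car, url=ENDPOINT_URL, api_key=API_KEY):
    """Build (but do not send) a POST request carrying one row of car data."""
    body = {
        # Assumed request shape; your service's API help page is the authority.
        "Inputs": {
            "input1": {
                "ColumnNames": list(car.keys()),
                "Values": [list(car.values())],
            }
        },
        "GlobalParameters": {},
    }
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
    }
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


request = build_request(CAR)
payload = json.loads(request.data.decode("utf-8"))
print(json.dumps(payload, indent=2))
```

With a real endpoint and key filled in, passing the request to `urllib.request.urlopen` would send it and return the service's response.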
Congrats! Now we have a fully functional predictive experiment! Test it with some other new data or modify the model.
I like Azure because it is so easy to use and we get free student membership. Compared to other ML resources such as Google ML Kit, we don’t have to write any code; we just drag and drop the modules in Azure ML Studio. Our free student membership allows us to use up to 100 modules per experiment and includes 10GB of storage, while Amazon ML on AWS charges per hour. Of course, if we want to go into production we will have to pay for an Azure subscription, but the free membership is more than enough for study purposes. Interestingly, high-level ML APIs for enterprise producers, such as HPE Haven OnDemand, are hosted on Azure.
Azure ML Studio is very powerful. For instance, with our car dataset, there are many other things we can do with the training model. We can normalise the data to put the numeric values on a common scale (for example, values between 0 and 1). We can pick many different algorithms, such as Clustering and Classification, from ‘Machine Learning > Initialize Model’ to suit our needs for the model. There are also dedicated modules for data analysis programming languages such as R and Python.
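As a small illustration of normalising values to the range between 0 and 1, here is a sketch of min-max scaling in plain Python. It shows one common normalisation method, not necessarily the one a given ML Studio module applies; the function name and the bhp values are hypothetical.

```python
def min_max_normalise(values):
    """Rescale numbers so the smallest becomes 0.0 and the largest becomes 1.0."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # All values are identical, so there is no range to scale over.
        return [0.0 for _ in values]
    return [(value - lowest) / (highest - lowest) for value in values]


# Hypothetical bhp column (assumption).
bhp = [100, 150, 200]
normalised_bhp = min_max_normalise(bhp)
print(normalised_bhp)
```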
I also love it because there are loads of resources and supportive communities. You can easily find tutorials and examples, and Microsoft Developer Networks has many Machine Learning related forums.
And because it’s free! Azure student membership includes free access to many other interesting and useful products such as Microsoft IoT Hub, SQL Database, and Cognitive Services, which I use a lot for Pantrylogs. You can really play around with it and learn something new each time. It is always exciting to experiment with new technologies, isn’t it?
Now go explore Azure Machine Learning Studio and learn more about data and machine learning :D
- Microsoft Azure Machine Learning Studio: https://studio.azureml.net
- GitHub Machine Learning Lab: https://github.com/martinkearn/AI-Services-Workshop/blob/master/MachineLearning/MachineLearning...
- Azure Machine Learning Real-World Examples: https://aischool.microsoft.com/learning-paths/2qon88L7GIWEeUuEaas6wK
- Microsoft Docs: https://docs.microsoft.com/en-us/
- Top 10 Machine Learning Algorithms: https://towardsdatascience.com/a-tour-of-the-top-10-algorithms-for-machine-learning-newbies-dde...
- Basic Machine Learning Tools and Frameworks for Data Scientists and Developers: https://www.computerworlduk.com/galleries/data/machine-learning-tools-harness-artificial-intell...
- Microsoft Developer Networks: https://social.msdn.microsoft.com/Forums/en-US/home?brandIgnore=True
- Microsoft ML Resources: https://docs.microsoft.com/en-us/azure/machine-learning/