The study and use of Artificial Intelligence (AI) are on the rise. Tools that enable AI are becoming more readily available, simpler to use, and easier to implement. What's more, the definition of AI itself has been broken down into ingredients that, when combined in a recipe (or process), can produce multiple desired outcomes. One of the most important ingredients in most recipes is Machine Learning. In essence, Machine Learning is a way of teaching computers to make more accurate predictions from the data they are given. These predictions can also make apps and devices smarter by providing recommendations based on that data.
In the pursuit of making roads safer, Toyota Canada has been capturing data from mechanics in all 300 of its dealerships on the vehicles they repair. In the past, the repair data was extracted manually from Toyota Canada's service application and stored in on-premises databases for later analysis. While parts of the analytics process were automated, it took over 6 months to work through the reams of data and provide a part replacement recommendation.
Toyota Canada wanted to reduce the process time and so approached Microsoft to collaborate in a Machine Learning Hackfest to come up with a solution. While we are unable to detail the exact process Toyota Canada and Microsoft completed during the Hackfest itself, this post walks through the steps of a similar exercise to build a better understanding of the Machine Learning process. The step-by-step walkthrough below sets up a price prediction for specific vehicles. Let's get started.
To begin this exercise, navigate to https://studio.azureml.net and select Sign up here. Next, choose between the free and paid options. NOTE: If you have already completed a Machine Learning experiment previously, select Sign In and simply enter your credentials. You are ready to begin the exercise once you are able to access the Microsoft Azure Machine Learning Studio.
Next you'll need to acquire data to analyze. Machine Learning Studio includes many sample datasets to choose from, or you can import your own dataset from almost any source. In keeping with the automotive theme, this exercise uses the Automobile price data (Raw) sample dataset that's included in your workspace. This dataset includes entries for various individual automobiles, including information such as make, model, technical specifications, and price. NOTE: All data used in this exercise is fictitious and does not represent the current automotive market. Let's now capture the dataset for this experiment.
Preprocessing the dataset is needed to ensure missing values are addressed prior to running the prediction exercise. As noted in the newly added automotive dataset, the normalized-losses column is missing many values and will have to be excluded to provide a better prediction.
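For readers who want to see the same preprocessing idea outside of Studio, here is a minimal sketch in plain Python. It is not what Studio runs internally. The rows are made-up stand-ins for the dataset, the "?" missing-value marker is an assumption, and the helper names `drop_column` and `drop_rows_with_missing` are hypothetical. Dropping any row that still has a missing value is one possible way to address the remaining gaps, not the only one.

```python
# Assumption: missing values are marked with "?" in the raw data.
MISSING = "?"

# Made-up rows for illustration only; not taken from the real dataset.
rows = [
    {"make": "toyota", "normalized-losses": "?", "horsepower": "62", "price": "5348"},
    {"make": "audi", "normalized-losses": "164", "horsepower": "102", "price": "13950"},
    {"make": "bmw", "normalized-losses": "?", "horsepower": "?", "price": "16430"},
]


def drop_column(rows, column):
    """Return copies of the rows without the given column."""
    return [{k: v for k, v in row.items() if k != column} for row in rows]


def drop_rows_with_missing(rows, missing=MISSING):
    """Keep only the rows in which every remaining value is present."""
    return [row for row in rows if missing not in row.values()]


# First exclude the sparse column, then remove rows that are still incomplete.
cleaned = drop_rows_with_missing(drop_column(rows, "normalized-losses"))

for row in cleaned:
    print(row)
```

Excluding the sparse column first matters here: if incomplete rows were dropped first, every row missing only its normalized-losses value would be lost as well.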
In Machine Learning, features are individual measurable properties of something you're interested in. In the Automobile price dataset, each row represents one car, and each column is a feature of that vehicle. Experimentation and knowledge about the problem you want to solve are needed to find a good set of features for a predictive model. This experiment will build a model that uses a subset of the features in the automotive dataset. These features include:
make, body-style, wheel-base, engine-size, horsepower, peak-rpm, highway-mpg, price
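As a rough illustration of what selecting this subset means, the sketch below keeps only the listed columns from a row of data. The column names come from the list above; the sample row, its extra columns, and the helper name `select_columns` are assumptions made for the example.

```python
# The feature subset named in this post (price is the value to predict).
FEATURES = [
    "make", "body-style", "wheel-base", "engine-size",
    "horsepower", "peak-rpm", "highway-mpg", "price",
]


def select_columns(rows, columns):
    """Return new rows that contain only the requested columns, in order."""
    return [{name: row[name] for name in columns} for row in rows]


# A made-up row with two extra columns that the model will not use.
sample_rows = [
    {
        "make": "toyota", "fuel-type": "gas", "num-of-doors": "four",
        "body-style": "sedan", "wheel-base": 95.7, "engine-size": 92,
        "horsepower": 62, "peak-rpm": 4800, "highway-mpg": 35, "price": 6338,
    },
]

selected = select_columns(sample_rows, FEATURES)
print(selected[0])
```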
Step 5: Selecting and Applying a Learning Algorithm
With the data now prepared, training and testing of a predictive model can commence. The data will be used to train the model and then to test it by reviewing its price predictions. This experiment uses a regression machine learning algorithm. Regression predicts a number, which comes in handy when predicting prices. More specifically, this experiment will use a simple linear regression model. The same dataset serves both training and testing: it is split into separate training and testing datasets.
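To make the split-then-train idea concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not Studio's implementation: it uses a single feature (engine size) fitted with closed-form least squares, a 75/25 split, and made-up data points that lie exactly on a line. The helper names `split_rows` and `fit_line` are hypothetical.

```python
import random


def split_rows(pairs, train_fraction=0.75, seed=0):
    """Shuffle a copy of the data and split it into training and testing parts."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_line(pairs):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up (engine-size, price) pairs on the line price = 100 * size + 500.
data = [(size, 100.0 * size + 500.0) for size in range(90, 130, 5)]

train, test = split_rows(data)
slope, intercept = fit_line(train)

# Predict a price for each held-out car using the fitted line.
predictions = [slope * size + intercept for size, _ in test]
print(len(train), len(test), round(slope, 3), round(intercept, 3))
```

Fixing the shuffle seed keeps the split reproducible, which makes it easier to compare results when you later change the algorithm or its settings.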
The experiment can now score the remaining 25 percent of the data to see how well the model performs after being trained on the other 75 percent.
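Scoring boils down to comparing predicted prices with actual prices on the held-out data. The sketch below computes three common regression metrics: mean absolute error, root mean squared error, and the coefficient of determination. The choice of metrics is an assumption for illustration, the prices are made up, and `evaluate` is a hypothetical helper name.

```python
import math


def evaluate(actual, predicted):
    """Compare predictions with actual values using common regression metrics."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_actual = sum(actual) / n
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    ss_residual = sum(e * e for e in errors)
    r2 = 1 - ss_residual / ss_total
    return {"mae": mae, "rmse": rmse, "r2": r2}


# Made-up actual and predicted prices for four held-out cars.
actual = [10000.0, 12000.0, 14000.0, 16000.0]
predicted = [10500.0, 11500.0, 14500.0, 15500.0]

metrics = evaluate(actual, predicted)
print(metrics)
```

Lower error values and a coefficient of determination closer to 1 indicate predictions that track the actual prices more closely.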
Congratulations, you have now completed your first machine learning experiment. Next steps would be to try to improve the prediction and then deploy it as a predictive web service. Experiment further by adding multiple machine learning algorithms, modifying the properties of the Linear Regression algorithm, or trying a different algorithm altogether.