Starting your Kaggle challenge using Azure Machine Learning Services

Published Jun 20 2022 03:59 AM
Microsoft

Dataset

In this tutorial, we will load the "AMEX data - integer dtypes - parquet format" dataset, which is available on Kaggle.

Setting up your Kaggle API key
You will need your Kaggle username and an API key. To create a key:

  • Go to your Kaggle account → Settings → Account → Create a new API token.
  • A kaggle.json file will be downloaded and it will contain your username and API key. Keep this somewhere safe!

Connect Kaggle data in Azure

Go to the Azure Portal.

Click Create a resource and search for Machine Learning.


Click Create and follow the steps until the deployment is complete, then launch the Azure Machine Learning studio.

Click Notebooks in the sidebar, then create a new notebook.

 

Create a compute resource (if one does not already exist).

Once the compute has been created, open the terminal.

 

In the terminal window, install the Kaggle package by running pip install kaggle.
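If you would rather run the install step from a notebook cell than from the terminal, a minimal sketch might look like the following. The helper names are hypothetical and not part of the original tutorial, and the sketch assumes you want the package installed into the same interpreter that runs the notebook kernel.

```python
import importlib.util
import subprocess
import sys


def is_installed(module_name="kaggle"):
    """Return True if the module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def pip_install_command(package="kaggle"):
    """Build a pip command that targets the current Python interpreter."""
    return [sys.executable, "-m", "pip", "install", package]


if __name__ == "__main__":
    # Only install when the package is missing from this environment.
    if not is_installed("kaggle"):
        subprocess.run(pip_install_command("kaggle"), check=True)
```

Using sys.executable avoids installing into a different Python environment than the one the notebook is actually using.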

Set your Kaggle username and API key (from the kaggle.json file) as the environment variables KAGGLE_USERNAME and KAGGLE_KEY in the terminal.
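As an alternative to typing the values by hand, the sketch below reads them from the downloaded kaggle.json file and sets the two environment variables the Kaggle API looks for. The function names and the file location are assumptions for illustration. Note that this only affects the current Python process and anything it launches, not a separately opened terminal.

```python
import json
import os


def load_kaggle_credentials(path):
    """Read the username and key fields from a kaggle.json file."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    missing = {"username", "key"} - set(data)
    if missing:
        raise KeyError(f"kaggle.json is missing fields: {sorted(missing)}")
    return data["username"], data["key"]


def export_kaggle_credentials(path, env=os.environ):
    """Copy the credentials into KAGGLE_USERNAME and KAGGLE_KEY."""
    username, key = load_kaggle_credentials(path)
    env["KAGGLE_USERNAME"] = username
    env["KAGGLE_KEY"] = key
    return env


# Example (assumed location; adjust to wherever you uploaded the file):
# export_kaggle_credentials("kaggle.json")
```

Avoid printing the key or committing kaggle.json to a shared folder or repository.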

You are now ready to download your Kaggle dataset from the command line in the terminal. The API command to do so is available on the Kaggle dataset page itself: click the three dots next to New Notebook and select "Copy API command".

Next, paste the command (Ctrl+V) into the terminal window and run it.

Note: This might take a while, as the file is approximately 4 GB in size.

Voilà! The dataset is downloaded (as a zip file) into your current working directory in your Azure workspace.

Alternatively, you can specify the folder where the files should be downloaded using optional arguments in the API call (for more information, see the Kaggle API documentation).
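One way to pass a target folder is the Kaggle CLI's -p option. The sketch below builds and runs such a command from Python. The helper names and the dataset identifier are placeholders, not taken from the original post; use the exact identifier from the API command you copied on the dataset page.

```python
import subprocess


def build_download_command(dataset, download_dir=None, unzip=False):
    """Build a `kaggle datasets download` command as an argument list."""
    cmd = ["kaggle", "datasets", "download", "-d", dataset]
    if download_dir is not None:
        # -p sets the folder the files are downloaded into.
        cmd += ["-p", str(download_dir)]
    if unzip:
        # --unzip extracts the archive once the download finishes.
        cmd.append("--unzip")
    return cmd


def download_dataset(dataset, download_dir=None, unzip=False):
    """Run the command; requires the kaggle package and credentials."""
    subprocess.run(build_download_command(dataset, download_dir, unzip), check=True)


# Example with a placeholder identifier (replace with the real one):
# download_dataset("<owner>/<dataset-slug>", download_dir="data")
```

Building the command as a list rather than a single string avoids shell quoting problems if the folder name contains spaces.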

 

Next, unzip the downloaded file from a cell in your notebook.
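The original screenshot of the notebook code is not reproduced here, so the following is only a sketch of one way to do the unzip step with Python's standard zipfile module. The function name and the example file name are assumptions; by default the archive is extracted into a new folder named after the zip file.

```python
import zipfile
from pathlib import Path


def unzip_dataset(zip_path, target_dir=None):
    """Extract a zip archive and return the names of the extracted entries."""
    zip_path = Path(zip_path)
    # Default target: a folder next to the archive, named without ".zip".
    target = Path(target_dir) if target_dir is not None else zip_path.with_suffix("")
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return sorted(p.name for p in target.iterdir())


# Example (hypothetical file name; use the name of your downloaded zip):
# print(unzip_dataset("my-dataset.zip"))
```

For a download of several gigabytes, make sure the compute instance has enough free disk space for both the archive and the extracted files.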

And there you have it: all your data is unzipped into a new folder in your current working directory on Azure.

Last update: Jun 29 2022 11:03 PM