Using VS Code to enhance your machine learning experience
Published Jul 08 2020 03:30 PM
Microsoft
Hey AML community! The VS Code team is excited to present new capabilities we've added to the Azure Machine Learning (AML) extension. From version 0.6.12 onwards we've introduced UI changes and ways to help you manage Datastores, Datasets, and Compute instances all from directly within your favourite editor!

 

We're guessing many of you may be reading about this extension for the first time - don't worry, we're here to explain!
 
The extension is a companion tool to the AML service. It provides a guided experience to help you create and manage your AML resources from directly within VS Code. The extension aims to streamline tasks such as running experiments, creating compute targets, and managing environments, without requiring the context-switch from the editor to the browser. With an easy-to-navigate tree view you can work across all your workspaces and interact with your core AML assets using single-click commands.
 
If you'd like to learn more and experiment with the extension you can install it here and try the getting started docs here!
 
Datastore Integration
One of the new features we've released is support for Datastore registration. The extension currently supports the Azure Blob Storage and Azure File Share datastore types. We've designed a set of streamlined input options to enable faster registration, such as automatic retrieval of your Account Key credentials to authenticate against the storage account.
 
Register a Blob or File-based datastore in a highly streamlined manner

 

Dataset Integration
The extension also supports creating Tabular and File datasets from local files or web URLs.
 
Create a Tabular or File Dataset via the extension tree view

 

Once you've created a Tabular dataset, you can use the extension to preview your data from directly within the editor. In the case of Parquet data, the extension may require a profile run before previewing.

 

Preview tabular dataset and filter rows.

 

Via the extension, you can use your datasets during training without having to write extra AML SDK code. Right before submitting, you're shown a partial run configuration which abstracts the complexities of referencing your datasets through an estimator. In the configuration, you just need to input the script parameter and attach mechanism you want to use for File datasets, and the named input you'd like for Tabular datasets.
 
"datasets": {
    // file dataset input
    "mnist-ds": {
        "version": 1,
        "scriptParam": "--data-folder",
        "attachMechanism": "Mount"
    },
    // tabular dataset input
    "titanic-ds": {
        "version": 1,
        "namedInput": "titanic_ds"
    }
}
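
To make the File dataset entry a little more concrete, here is a minimal sketch of what the receiving side of a training script might look like. It assumes the path to the attached "mnist-ds" data arrives through the "--data-folder" script parameter shown in the configuration; the function names parse_args and list_data_files are hypothetical and not part of the extension or the AML SDK. The Tabular dataset's named input is read through the SDK and is not covered here.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the arguments a (hypothetical) training script receives at run time."""
    parser = argparse.ArgumentParser(description="Hypothetical training script")
    # "--data-folder" mirrors the scriptParam set for the "mnist-ds" File dataset.
    # Assumption: the value passed in is the local path where the data is attached.
    parser.add_argument(
        "--data-folder",
        dest="data_folder",
        required=True,
        help="Path where the File dataset is made available to the script",
    )
    return parser.parse_args(argv)


def list_data_files(data_folder):
    """Return the sorted relative paths of every file under the dataset folder."""
    found = []
    for root, _dirs, files in os.walk(data_folder):
        for name in files:
            full_path = os.path.join(root, name)
            found.append(os.path.relpath(full_path, data_folder))
    return sorted(found)
```

In a real script you would call parse_args() with no arguments so that it reads the command line, then pass args.data_folder to your own data-loading code.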

 

Compute Instance Integration

Creating and managing compute instances has never been easier! You can view all your workspace's compute instances and start/stop/restart them through commands in the tree. With a small number of clicks, you can create an SSH-enabled compute instance from directly within VS Code. Upon creating an SSH-enabled compute instance, you can follow our in-editor documentation to easily connect to your compute via the VS Code Remote SSH extension.
 
Manage compute instances and connect to them via SSH

 

UI Changes
Something we've been hearing for a long time is that the extension UI differs from the Azure ML Studio. In the screenshots above you may have already noticed the more consistent design in the extension tree view. We've updated each node with Studio-equivalent icons and have renamed and reordered nodes where appropriate.
 
Feedback
As mentioned throughout the blog post, many of the newly released features are in their preliminary phases and we're actively working to support a broader set of scenarios that are consistent with the Azure ML Studio and SDK experiences. Here are some of the scenarios we're actively working on:
  1. Running your Notebooks in VS Code directly on an AML compute instance.
  2. Building and working in Docker containers from an AML environment.
  3. Creating datasets from an existing blob or file-based datastore.
  4. Using AML environments when deploying an endpoint.
If there's anything that you would like us to prioritize, please feel free to let us know on GitHub!

If you're an existing user of the extension and would like to provide feedback, please feel free to do so via our survey.
Last update: Jul 08 2020 02:52 PM