Introduction to Machine Learning Notebooks in Microsoft Sentinel
Published Sep 14 2022 08:28 AM
Microsoft

This article will be constantly updated with new notebook content and resources related to big data analytics for security in Microsoft Sentinel.

With special thanks to @Chi_Nguyen and @Ed_Gardner.

 

Introduction

 

It has never been harder to keep hybrid environments secure.

 

According to the Microsoft Digital Defense Report, Microsoft’s security research teams are observing cybercrime growing in both volume and complexity across all sectors of critical infrastructure, from targeted ransomware attacks to a rising number of password and email phishing campaigns. The 2022 Cost of Insider Threats report found that insider threat incidents have risen by over 44% in the last two years, with associated costs rising by a third to more than $15.38M per incident per year. The report also concluded that the average time taken to contain an incident has increased by about 10%, from 77 days to 85 days. Advanced tools, techniques, and processes allow threat actor groups to counter obsolete defences and scale their attack campaigns to a broad range of victims, from government organisations to for-profit enterprises.

 

Machine Learning in Microsoft Sentinel Notebooks

 

Machine Learning has been deployed in security products at Microsoft to enable our security researchers and customers to rapidly and efficiently surface incidents that pose a threat to our customers’ assets, and to remediate these incidents quickly to prevent further costs. Microsoft Sentinel, Azure’s cloud-native SIEM and SOAR solution, applies embedded ML algorithms to tens of trillions of signals and threat intel feeds to build intelligent security analytics that discover sophisticated attacks occurring in our customers’ workspaces. For example:

 

  • Fusion correlation engine, based on scalable ML models, correlates signals from various products to detect advanced multistage attacks.
  • Customizable ML Anomalies, which are built-in anomaly templates with configurable parameters, provide signals identifying unusual behaviour, to enhance existing detections.

 

In our efforts to democratize Machine Learning for Security Operations, we provide customers with the tools and templates to leverage data science and machine learning for advanced hunting investigations in their cloud environments, using Microsoft Sentinel Notebooks. Sentinel Notebooks have traditionally been used by SOC analysts to hunt for and detect specific security scenarios using heuristics and domain expertise. Notebooks are also a common tool for developing, experimenting with, and validating ML models, and can leverage the integration with Azure Machine Learning and Azure Synapse to power large scale analytic....

 

Here is a list of Sentinel Notebooks, each applied to a common attack-vector scenario, that your organisation can use to start integrating ML into its SecOps journey:

 

  1. Detect Network Beaconing via Intra-Request Time-delta patterns in Microsoft Sentinel – To better detect network beaconing activity originating from the victim’s infrastructure, the algorithm uses an intra-request time delta pattern, written in KQL, that defenders can apply over various network data sources.


Detect Network Beaconing Activity - Blog Post Snapshot
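
To make the time-delta idea concrete, here is a minimal sketch in Python rather than the KQL used in the notebook. It is an illustration under stated assumptions, not the notebook's implementation: the function name `beacon_candidates`, the `(timestamp, source, destination)` event shape, and the `min_events` and `min_share` thresholds are all hypothetical. It groups requests by source and destination, computes the gap between consecutive requests, and flags pairs where a single gap dominates.

```python
from collections import Counter, defaultdict


def beacon_candidates(events, min_events=4, min_share=0.8):
    """Flag (source, destination) pairs whose request gaps are highly regular.

    events: iterable of (timestamp_seconds, source_ip, destination_ip).
    min_events and min_share are illustrative thresholds, not tuned values.
    """
    stamps_by_pair = defaultdict(list)
    for timestamp, source, destination in events:
        stamps_by_pair[(source, destination)].append(timestamp)

    candidates = []
    for pair, stamps in stamps_by_pair.items():
        if len(stamps) < min_events:
            continue  # too few requests to judge regularity
        stamps.sort()
        # Time delta between each request and the one before it.
        deltas = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
        most_common_delta, count = Counter(deltas).most_common(1)[0]
        share = count / len(deltas)
        if share >= min_share:
            candidates.append((pair, most_common_delta, share))
    return candidates


# Hypothetical sample: one host calling out every 60 seconds, one browsing irregularly.
sample_events = [(t, "10.0.0.5", "203.0.113.7") for t in range(0, 360, 60)]
sample_events += [(t, "10.0.0.8", "198.51.100.2") for t in (0, 5, 47, 300, 310)]

for pair, delta, share in beacon_candidates(sample_events):
    print(pair, delta, share)
```

Real beacons often add jitter, so a production version would likely bucket the deltas into ranges before counting them instead of requiring exact matches.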

 

 

  2. Hunting for Low and Slow Password Sprays using Machine Learning – With the low and slow variant of password spray attacks increasing in popularity, the algorithm uses Bayesian modelling and clustering to fingerprint features of a malicious sign-in attempt.


Hunting for Low and Slow Password Sprays - Blog Post Snapshot
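
The notebook's actual model is not reproduced here. As a much simpler illustration of the Bayesian part only, the sketch below estimates each source's sign-in failure rate with a Beta-Binomial posterior mean, so that sources with very few attempts are pulled towards the prior instead of being flagged on thin evidence. Everything in it is an assumption made for illustration: the function names, the `(failures, attempts, distinct_accounts)` input shape, the uniform Beta(1, 1) prior, and the thresholds. The clustering step is omitted.

```python
def posterior_failure_rate(failures, attempts, prior_alpha=1.0, prior_beta=1.0):
    """Posterior mean of the failure probability under a Beta prior.

    With a Beta(alpha, beta) prior and binomial sign-in outcomes, the posterior
    is Beta(alpha + failures, beta + successes); this returns its mean.
    """
    return (prior_alpha + failures) / (prior_alpha + prior_beta + attempts)


def spray_candidates(stats_by_source, min_rate=0.8, min_accounts=10):
    """Return sources that fail often across many distinct accounts.

    stats_by_source: {source: (failures, attempts, distinct_accounts)}.
    min_rate and min_accounts are illustrative thresholds, not tuned values.
    """
    flagged = []
    for source, (failures, attempts, accounts) in stats_by_source.items():
        rate = posterior_failure_rate(failures, attempts)
        if rate >= min_rate and accounts >= min_accounts:
            flagged.append(source)
    return flagged


# Hypothetical sign-in summaries per source IP.
sample_stats = {
    "203.0.113.9": (48, 50, 45),   # many failures spread over many accounts
    "198.51.100.4": (2, 3, 1),     # a user mistyping a password
    "192.0.2.1": (3, 40, 2),       # mostly successful sign-ins
}
print(spray_candidates(sample_stats))
```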

 

 

  3. Detect Masqueraded Process Name Anomalies using an ML notebook – To find cases where a legitimate process name has been manipulated, the algorithm uses a modified edit distance logic to measure the deviation between the legitimate and malicious process names.


Detect Masqueraded Process Name Anomalies - Blog Post Snapshot
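
The notebook uses a modified edit distance; the sketch below substitutes the plain Levenshtein distance to show the general idea. The helper names, the `max_distance` cut-off, and the sample process names are assumptions for illustration only. A process name that is close to, but not identical to, a known legitimate name is reported as a possible masquerade.

```python
def levenshtein(a, b):
    """Classic Levenshtein edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + substitution_cost,   # substitution
            ))
        previous = current
    return previous[-1]


def masquerade_candidates(observed, legitimate, max_distance=2):
    """Return (observed_name, legitimate_name, distance) for near misses.

    Exact matches are skipped; only names within max_distance edits of a
    legitimate name are reported. Comparison is case-insensitive.
    """
    known_names = sorted({name.lower() for name in legitimate})
    hits = []
    for name in observed:
        lowered = name.lower()
        if lowered in known_names:
            continue  # identical to a legitimate name, nothing to flag
        for known in known_names:
            distance = levenshtein(lowered, known)
            if distance <= max_distance:
                hits.append((name, known, distance))
    return hits


# Hypothetical process names seen on a host.
observed_names = ["svch0st.exe", "svchost.exe", "notepad.exe"]
legitimate_names = ["svchost.exe", "lsass.exe"]
print(masquerade_candidates(observed_names, legitimate_names))
```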

 

 

Looking Forward

 

Sentinel Notebooks that employ ML for security scenarios are part of Microsoft Security’s broader goal of empowering our customers to leverage big data analytics tools, processes, and architecture in their environments. We aim to build a centralised platform, integrated with the necessary tools, for our partners and customers to build machine learning models and algorithms for their specific security use cases and business objectives. We believe that integrating ML with hunting, investigation, and detection scenarios has the potential to better detect and stop attacks on our customers’ digital infrastructure, improve the ratio of high-fidelity attack signals to noisier signals, and allow our customers to better protect their environments.

 

Frequently Asked Questions

 

Q1. Can I use the same notebook in Azure Synapse, Azure Machine Learning (AML), and Azure Databricks (ADB)?

 

A. It is possible to clone the notebook to your Azure Synapse and AML workspace. ADB is no longer supported by Sentinel Notebooks.


Notebooks - Landing Page (ADB is no longer supported by Sentinel)

 

Q2. What compute do I need to run these notebooks? How can I create and manage them?

 

A. Currently, all our Notebooks can be configured to run on Standard CPU compute instances in AML or Synapse, with the default number of compute nodes. For more information about setting up and managing compute instances in AML, including their costs, please check out Create and manage a compute instance - Azure Machine Learning | Microsoft Docs.

 

In Azure Synapse, an Apache Spark pool defines the requirements of the compute instance to be configured. Check out Apache Spark pool concepts - Azure Synapse Analytics | Microsoft Docs for more information.

 


Configure compute instances in AML

 


Configure an Apache Spark pool in Azure Synapse

 

Q3. How do I load my data to Azure Synapse/AML?

 

A. AML lets you bring data from your source location, either on-premises or cloud-based, into the AML workspace to train and deploy your models. Check out Data access - Azure Machine Learning | Microsoft Docs to start using the v2 SDK, and Secure data access in the cloud v1 - Azure Machine Learning | Microsoft Docs for the v1 SDK.

 

Azure Synapse allows you to create linked services with Azure storage platforms that can be used to ingest, transform, and model your data. Here is some documentation you can use to get started with loading data into Synapse:

- Check out Quickstart: to load data into dedicated SQL pool using the copy activity - Azure Synapse Analytics |... to learn how to load data from an Azure SQL DB into Synapse

- Check out Load data into Azure Synapse Analytics - Azure Data Factory & Azure Synapse | Microsoft Docs to learn how to load data from Azure Data Factory or a Synapse Pipeline into Synapse

 

Q4. I have pre-trained models. How do I load them into storage and use them in my Sentinel notebook?

 

A. Get started with loading and registering your pre-trained models in your AML workspace using the following documentation: Register and work with models - Azure Machine Learning | Microsoft Docs.

Last update: Sep 15 2022 01:27 AM