Behind the Scenes: The ML Approach for Detecting Advanced Multistage Attacks with Sentinel Fusion

Published Mar 02 2022 09:06 AM
Microsoft

Dan Mace, Principal Applied ML Engineer

Haijun Zhai, Senior Applied ML Engineer

Sylvie Liu, Senior Program Manager

 

 

As the volume and types of security events continue to grow, so have the sophistication and velocity of attackers. As a result, SOC analysts are overwhelmed by the high volume of alerts and incidents and are unable to protect their assets effectively.

 

Using graph-based ML techniques, Microsoft Sentinel Fusion combs through millions of events and identifies high-fidelity advanced multistage attacks. In this blog, we'll take you behind the scenes to show you the ML approaches used in Fusion.

 

From Anomalous Signals to High Fidelity Incidents: How Fusion Works End to End

Fusion operates a series of patented machine learning algorithms to look for advanced attacks from millions of anomalous signals. The process includes graph forming, attack pattern matching, expansion, scoring, and incident creation.

 

FusionFlow1.1.png

 

Anomalous signals: Fusion correlates signals from across your entire enterprise (multiple clouds, on-premises, and the edge), including anomalies, alerts from Microsoft products, and alerts from scheduled analytics rules, both built-in and those created by your security analysts. This helps you automatically detect sophisticated multistage attacks.

 

Graph forming: Fusion builds and continually updates a hyperconnected graph on large-scale data sets, typically millions of anomalous signals in a customer workspace. In the graph, the nodes represent the entities and the activities, and the edges represent the relationships between the nodes. The activities are the alerts and anomalies from different sources. The entities can be IP addresses, accounts, cloud resources, virtual machines, etc.
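To make the graph structure concrete, here is a minimal sketch in Python. It is an illustration only, not Fusion's implementation: the signal records, field names, and the `build_graph` helper are all hypothetical, and a real workspace would hold millions of signals rather than three.

```python
from collections import defaultdict

# Hypothetical signal records: each activity (alert or anomaly) lists the entities it touches.
signals = [
    {"id": "alert-1", "kind": "alert", "entities": ["ip:10.0.0.5", "account:alice"]},
    {"id": "anomaly-1", "kind": "anomaly", "entities": ["account:alice", "vm:web-01"]},
    {"id": "alert-2", "kind": "alert", "entities": ["vm:web-01"]},
]

def build_graph(signals):
    """Return an undirected adjacency map linking activity nodes to entity nodes."""
    graph = defaultdict(set)
    for signal in signals:
        activity = f"{signal['kind']}:{signal['id']}"
        graph[activity]  # touch the key so activities without entities still become nodes
        for entity in signal["entities"]:
            graph[activity].add(entity)
            graph[entity].add(activity)
    return graph

graph = build_graph(signals)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

In this toy graph, `account:alice` links an alert to an anomaly, which is the kind of shared-entity relationship that later steps can follow.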

 

FullGraph-1.1.png

 

Figure 1: Graph formed from a Microsoft Sentinel workspace

 

Attack pattern matching: Fusion keeps a large set of attack patterns in a knowledge pool, including known attack patterns and ML generated emerging attack patterns. The known attack patterns are derived from past true positive incidents and security research. We will deep dive into how ML generates the emerging attack patterns in the next section of the blog.

 

An attack pattern consists of activities (nodes), entities (nodes), and their relationships (edges). In this step, Fusion constantly takes attack patterns from the knowledge pool and identifies matches in the hyperconnected graph. Those identified matches are called subgraphs. This step reduces the millions of anomalous signals to a smaller set of subgraphs representing possible attacks. In the example below, three attack patterns are matched in the graph. There are 4 nodes and 3 edges in the top subgraph.
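A deliberately simplified sketch of pattern matching follows. It assumes a pattern is just an ordered list of activity types in which consecutive steps must share an entity. Fusion's real patterns and matching algorithm are not described at this level of detail, so the data, the pattern format, and `match_pattern` are hypothetical.

```python
# Hypothetical activities: each has a type and the set of entities it involves.
activities = {
    "a1": {"type": "suspicious_signin", "entities": {"account:alice"}},
    "a2": {"type": "mailbox_rule", "entities": {"account:alice"}},
    "a3": {"type": "suspicious_signin", "entities": {"account:bob"}},
    "a4": {"type": "data_exfiltration", "entities": {"account:alice", "ip:203.0.113.7"}},
}

# Assumed pattern format: ordered activity types; consecutive steps must share an entity.
pattern = ["suspicious_signin", "mailbox_rule", "data_exfiltration"]

def match_pattern(activities, pattern):
    """Return every chain of activity ids that realizes the pattern."""
    matches = []

    def extend(chain):
        if len(chain) == len(pattern):
            matches.append(list(chain))
            return
        wanted = pattern[len(chain)]
        for act_id, act in activities.items():
            if act["type"] != wanted or act_id in chain:
                continue
            # The first step has no predecessor; later steps need a shared entity.
            if chain and not (activities[chain[-1]]["entities"] & act["entities"]):
                continue
            chain.append(act_id)
            extend(chain)
            chain.pop()

    extend([])
    return matches

matches = match_pattern(activities, pattern)
print(matches)
```

Only the chain through `account:alice` matches. The sign-in for `account:bob` is dropped because no later step shares an entity with it, which is how matching narrows a large graph down to a few candidate subgraphs.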

 

GraphFigure1.1.png

Figure 2: Simplified graph shows nodes and edges from attack pattern matching

 

Expansion: During the expansion phase, Fusion expands the matched attack patterns to discover additional activities and entities that are relevant.

 

GraphFigure2.1.png

Figure 3: Simplified graph shows nodes and edges from attack pattern matching and expansion

 

How does Fusion determine the relevance and know how far to expand?

  • Calculate relevance: in the graph, the edges represent the relationships between the nodes. Fusion first uses an ML algorithm to assign a weight to each edge in the full graph. The weight determines the relevance of the nodes, taking into consideration information such as time range, kill chain intent, severity, and entity type.
  • Run probabilistic random walk: a probabilistic kill chain model is then applied to determine viable attack paths in the graph from the matched patterns. The model runs multiple times to simulate different attack paths. In the example below, A and B represent the nodes in a matched attack pattern and D, E, F, G represent the relevant activities and entities. In the real world, the subgraphs and attack paths are much more complicated, and tracing them manually can be time consuming for security analysts.

Expansion1.png

Figure 4: Expansion - probabilistic random walk

 

  • Aggregate weights and apply threshold: after simulating the different attack paths, Fusion aggregates weights across multiple runs. The algorithm first applies a threshold at the subgraph level to drop the subgraphs that represent unlikely attack paths. It then applies a threshold for each edge to determine how far to expand the graph.

Expansion2.png

Figure 5: Expansion - aggregate weights and apply threshold
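The three expansion steps can be sketched roughly as follows. This is an assumption-laden toy rather than Fusion's model: the edge weights are hand-picked instead of learned, the walk is a plain weighted random walk standing in for the probabilistic kill chain model, and only the per-edge threshold is shown (the subgraph-level threshold is omitted). All names and numbers are hypothetical.

```python
import random
from collections import Counter

# Hypothetical edge weights (relevance in [0, 1]); A and B are the matched pattern nodes.
weights = {
    ("A", "B"): 0.9, ("A", "D"): 0.8, ("B", "E"): 0.7,
    ("D", "F"): 0.6, ("E", "G"): 0.05, ("B", "H"): 0.02,
}

def neighbors(node):
    """Yield (neighbor, weight) pairs, treating edges as undirected."""
    for (u, v), w in weights.items():
        if u == node:
            yield v, w
        elif v == node:
            yield u, w

def random_walk(start, steps, rng):
    """Walk `steps` hops, choosing each hop with probability proportional to edge weight."""
    path, node = [], start
    for _ in range(steps):
        options = list(neighbors(node))
        nxt = rng.choices([n for n, _ in options], weights=[w for _, w in options])[0]
        path.append(tuple(sorted((node, nxt))))
        node = nxt
    return path

def expand(seeds, runs=2000, steps=3, edge_threshold=0.1, seed=0):
    """Return edges whose share of runs traversing them meets the threshold."""
    rng = random.Random(seed)
    visits = Counter()
    for i in range(runs):
        # Count each edge at most once per run, alternating over the seed nodes.
        for edge in set(random_walk(seeds[i % len(seeds)], steps, rng)):
            visits[edge] += 1
    return {edge: count / runs for edge, count in visits.items() if count / runs >= edge_threshold}

kept = expand(["A", "B"])
for edge, share in sorted(kept.items()):
    print(edge, round(share, 3))
```

Low-weight edges such as B-H and E-G are rarely traversed, so they fall below the threshold and the expansion stops before reaching them. Strongly weighted edges near the matched pattern survive.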

 

Scoring and incident creation: Once the subgraphs representing possible attacks are identified, Fusion applies a round of scoring and triggers incidents that include the most relevant alerts, anomalies, and entities, further reducing alert volume and speeding up investigation.

 

In this step, Fusion uses k-nearest neighbors (KNN) to calculate the kill chain reachability of an attack and identify the nodes that have the highest relevance in a real attack. In the example below, all the colored nodes (orange, yellow, green) are relevant to an attack. After the scoring round, Fusion only surfaces the nodes that have the highest relevance (orange and yellow colored nodes) in an incident. This way the security analysts only need to investigate a focused set of the most relevant activities and entities to quickly understand an attack.
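As a rough illustration of KNN-style scoring, the sketch below rates each candidate node by the share of its nearest labeled neighbors that belonged to confirmed attacks. How Fusion actually defines features, distance, and kill chain reachability is not stated here, so the two features, the labeled history, and the 0.5 cutoff are all assumptions made for the example.

```python
import math

# Hypothetical labeled history: (feature vector, 1 if the node belonged to a confirmed attack).
# Illustrative features: (edge weight to the matched pattern, normalized severity).
history = [
    ((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.7, 0.7), 1),
    ((0.2, 0.1), 0), ((0.1, 0.3), 0), ((0.3, 0.2), 0),
]

def knn_score(features, history, k=3):
    """Return the share of the k nearest labeled examples that were attack-related."""
    nearest = sorted(history, key=lambda item: math.dist(features, item[0]))[:k]
    return sum(label for _, label in nearest) / k

# Hypothetical candidate nodes produced by the expansion step.
candidates = {"D": (0.85, 0.9), "F": (0.75, 0.6), "G": (0.15, 0.2)}
scores = {node: knn_score(vec, history) for node, vec in candidates.items()}

# Surface only the nodes that look most like past attack nodes.
surfaced = [node for node, score in scores.items() if score >= 0.5]
print(scores, surfaced)
```

Nodes D and F sit close to past attack nodes and are surfaced, while G resembles benign history and is left out of the incident.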

 

GraphFigure3.1.png

Figure 6: Simplified graph shows nodes and edges from attack pattern matching, expansion and scoring

 

The example in Figure 7 shows a possible attack that progressed from initial access in the cloud to endpoint execution, and then to consistent beaconing from an internal IP address to a suspicious external IP address and possible command and control, all in roughly 24 hours. The Fusion ML algorithms detected this attack by correlating an anomaly (Anomalous Azure AD sign-in sessions) with alerts from custom scheduled rules, Azure Defender, and Microsoft Defender for Endpoint.

 

Incident.png

Figure 7: Fusion incident in Microsoft Sentinel workspace

 

Detecting Emerging Attack Patterns: Emerging TTP Discovery with Auto-Fusion

Now that you’ve seen how Fusion works end to end, we’ll expand on how Fusion discovers the emerging attack patterns that contribute to the knowledge pool.

 

FusionFlow-AF-1.1.png

Detecting attack patterns as they evolve is a challenging task, since attackers are constantly updating their techniques and approaches. To address this, we developed Auto-Fusion algorithms that constantly learn and update our knowledge of evolving adversary behaviors. This data-driven approach allows Fusion to continually refresh its understanding of the threat landscape, adding and updating attack patterns all the time, even while you’re sleeping!

 

We often use TTPs (Tactics, Techniques, and Procedures) to identify patterns of adversary behavior; they characterize what the attackers are doing and how they are doing it. Detecting emerging attack patterns consists of a series of processing and inference steps to identify emerging TTPs.

 

The process is similar to how security analysts manually correlate signals, associate them with kill chain stages, and apply their security knowledge and analysis to identify possible threats, except that Fusion runs consistently at cloud scale and never gets tired.

 

AutoFusion.png

Figure 8: Overview of Auto-Fusion emerging TTP discovery

 

Subgraph building: In the Fusion pipeline we discussed in the first section, step one is to form a graph based on millions of anomalous signals. Auto-Fusion then builds subgraphs (also millions of them) from the full graph based on the connectivity and associations determined from their entities and kill chain information. These subgraphs represent possible chains of security events that are happening in your environment.
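One simple way to picture subgraph building is to group activities that are transitively connected through shared entities. The sketch below does only that. It ignores the kill chain information mentioned above, and the data and the `build_subgraphs` helper are hypothetical rather than Auto-Fusion's actual procedure.

```python
from collections import defaultdict

# Hypothetical activities with the entities they touch.
activity_entities = {
    "alert-1": {"account:alice", "ip:10.0.0.5"},
    "anomaly-1": {"account:alice"},
    "alert-2": {"vm:web-01"},
    "alert-3": {"vm:web-01", "ip:198.51.100.9"},
}

def build_subgraphs(activity_entities):
    """Group activities that are transitively linked through shared entities."""
    by_entity = defaultdict(set)
    for activity, entities in activity_entities.items():
        for entity in entities:
            by_entity[entity].add(activity)

    seen, subgraphs = set(), []
    for start in activity_entities:
        if start in seen:
            continue
        # Depth-first traversal collecting one connected component.
        component, stack = set(), [start]
        while stack:
            activity = stack.pop()
            if activity in seen:
                continue
            seen.add(activity)
            component.add(activity)
            for entity in activity_entities[activity]:
                stack.extend(by_entity[entity] - seen)
        subgraphs.append(component)
    return subgraphs

subgraphs = build_subgraphs(activity_entities)
print(subgraphs)
```

Here the four activities fall into two chains: one around `account:alice` and one around `vm:web-01`.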

 

Identify common patterns with association pruning: Within these chains of security events, there are always common recurring patterns that we can observe. To identify these common patterns, we run statistical association tests on the subgraphs based on general characteristics including timing, kill chain stages, entity type, alert type, severity, provider, etc. These common patterns usually contain multiple entity types and activities. As illustrated in Figure 8 above, the four common patterns in the association pruning step are observed from the subgraphs on the left side. The nodes represent activities (blue) and entity types (green).
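The exact statistical tests are not specified, so the following sketch uses a Pearson chi-square test on a 2x2 co-occurrence table as one plausible stand-in. The subgraph contents, the activity type names, and the `is_common_pattern` helper are hypothetical. The cutoff of 3.84 is the standard critical value for one degree of freedom at the 0.05 level.

```python
# Hypothetical subgraphs, each reduced to the set of activity types it contains.
subgraphs = (
    [{"signin_anomaly", "mailbox_rule"}] * 8
    + [{"signin_anomaly"}] * 2
    + [{"mailbox_rule"}] * 2
    + [{"port_scan"}] * 8
)

def cooccurrence_table(subgraphs, a, b):
    """Count subgraphs containing both, only a, only b, or neither activity type."""
    both = sum(1 for s in subgraphs if a in s and b in s)
    only_a = sum(1 for s in subgraphs if a in s and b not in s)
    only_b = sum(1 for s in subgraphs if b in s and a not in s)
    return both, only_a, only_b, len(subgraphs) - both - only_a - only_b

def chi_square_2x2(both, only_a, only_b, neither):
    """Pearson chi-square statistic for a 2x2 co-occurrence table."""
    n = both + only_a + only_b + neither
    denom = (both + only_a) * (only_b + neither) * (both + only_b) * (only_a + neither)
    if denom == 0:
        return 0.0
    return n * (both * neither - only_a * only_b) ** 2 / denom

def is_common_pattern(subgraphs, a, b, critical=3.84):
    """Keep a pair only if it co-occurs more often than chance (positive association)."""
    both, only_a, only_b, neither = cooccurrence_table(subgraphs, a, b)
    positive = both * neither > only_a * only_b
    return positive and chi_square_2x2(both, only_a, only_b, neither) >= critical

print(is_common_pattern(subgraphs, "signin_anomaly", "mailbox_rule"))
print(is_common_pattern(subgraphs, "signin_anomaly", "port_scan"))
```

The direction check matters: the second pair also has a large chi-square statistic, but only because the two activity types avoid each other, so it is not kept as a common pattern. With counts this small a chi-square approximation is crude; the example is meant to show the shape of the test, not a recommended sample size.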

 

Surface meaningful TTPs with attack inference: However, not all the common patterns represent meaningful TTPs. After the first association pass, we apply counterfactual inference to identify only the subset of common patterns that are likely to be explained by intentional activities undertaken by an attacker (candidate TTPs), rather than correlations occurring by chance. This step is important for removing noise and background effects from the common pattern pool.

 

Identify new attack patterns with attack scoring: Before considering the set of candidate TTPs as emerging attack patterns, Auto-Fusion leverages a trained supervised model to score the candidate TTPs and identify the ones that have valid security implications. The ML labels are created from the constant learnings of known attack patterns, IoCs, past incidents, analyst feedback and research from Microsoft internal security teams. Using this collective information as labels, we train a classifier/scorer to determine which of the new candidate TTPs share characteristics with known attack patterns, and select those as new attack patterns.
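The model family behind the classifier/scorer is not disclosed, so the sketch below uses a small logistic regression trained with batch gradient descent purely as an example of supervised scoring. The two features, the training rows, and the labels are invented for illustration.

```python
import math

# Hypothetical training rows: (features, label), where label 1 means the candidate
# resembled a known attack pattern. Illustrative features:
# (share of kill chain stages covered, mean alert severity), both scaled to [0, 1].
training = [
    ((0.8, 0.9), 1), ((0.6, 0.8), 1), ((1.0, 0.7), 1),
    ((0.2, 0.2), 0), ((0.4, 0.1), 0), ((0.2, 0.4), 0),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, epochs=2000, lr=0.5):
    """Fit logistic regression weights with plain batch gradient descent."""
    weights, bias = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for features, label in data:
            error = sigmoid(sum(w * x for w, x in zip(weights, features)) + bias) - label
            grad_w = [g + error * x for g, x in zip(grad_w, features)]
            grad_b += error
        weights = [w - lr * g / len(data) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(data)
    return weights, bias

def score(features, model):
    """Return the estimated probability that a candidate TTP is attack-like."""
    weights, bias = model
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)

model = train(training)
print(round(score((0.9, 0.9), model), 3), round(score((0.1, 0.1), model), 3))
```

Candidates that score above a chosen cutoff would be promoted to the knowledge pool as new attack patterns. Where to set that cutoff is a precision and recall trade-off that this sketch leaves open.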

 

These new attack patterns identified by Auto-Fusion ML algorithms, along with the known attack patterns in the knowledge pool, are used in Fusion for detecting advanced attacks in your environments.  

 

Conclusion

Fusion uses multiple patented ML algorithms to detect advanced multistage attacks by correlating signals from endpoints, network, and multi-clouds – basically all the assets monitored in your Microsoft Sentinel workspace. It relieves SOC analysts of tedious, time-consuming work that carries a high cognitive load.

 

The Auto-Fusion ML algorithms also constantly learn from existing attacks and apply analysis based on how security analysts think to help you keep up with threats that are on the horizon and stay one step ahead of attackers.

 

To learn more about the latest Fusion feature releases, check out Microsoft Sentinel Fusion. If you have additional feedback on your product experience, share your insights through “Guides & Feedback” in Sentinel. Our team would love to hear from you!

 

For more information:

Last update:
‎Mar 02 2022 01:35 PM