Microsoft Sentinel Support for Ingestion-Time Data Transformations
Published Mar 07 2022 08:00 AM
Microsoft

Note: Thank you to @Javier Soriano and @Edi Lahav for co-writing this blog and for their assistance with this preview.

Log Analytics has recently announced two new features: ingestion-time transformations and Data Collection Rule (DCR)-based custom logs. This is a major milestone not only for Log Analytics but also for Microsoft Sentinel, because it enables a wide range of scenarios such as filtering, masking, enrichment, and parsing. These scenarios let Sentinel customers optimize storage costs, improve their security analytics, and benefit from better performance and ease of use.

 

The following diagram shows the new data flows for Sentinel's data connectors with the new ingestion-time transformations and DCR-based custom logs features:

[Diagram: data flows for Microsoft Sentinel data connectors with ingestion-time transformations and DCR-based custom logs]

 

As illustrated in the diagram, for custom logs, users can now set the column names and types and decide whether to ingest the data into a custom table or into a standard table. For standard logs, customers can now define their own transformations on top of the pre-configured workflows.

By using the new features, Microsoft Sentinel customers can enjoy the following benefits:

  • Cost reduction - using ingestion-time transformations, customers can now filter out data that is irrelevant to security analysis.
  • Improved analytics - by explicitly defining the output schema, removing irrelevant data, and enriching the data with additional information, customers can now standardize the data according to the SOC analysts' needs.
  • Better performance - performing the transformations in the pipeline reduces the need for the query-time adjustments that were previously required to standardize the ingested data.
  • Ease of use - the new features reduce the need for third-party tools to perform filtering, masking, and other types of data transformations.

Examples of new scenarios enabled by ingestion-time transformations

 

Filtering

Filtering incoming logs is essential to avoid noise and to optimize your ingestion costs. Filtering can be done by removing unnecessary fields from a record or by discarding the whole record if it has no value for the SOC team. For example, your team might not be interested in ingesting a field that contains redundant information.

 

Tagging/Enrichment

Users can enrich or tag the data with additional columns. These columns may include data parsed from other columns or data taken from static tables added to the configured KQL transformation. For example, some companies want to add a field that indicates which department owns the record being ingested. To do this, you can define your own mapping within the transformation KQL so that each event is tagged accordingly.
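
As a rough illustration of the department tagging described above, a transformation could map a source column to a department with an inline `case()` expression. This is only a sketch: the column name `Computer`, the hostname prefixes, and the output column `Department_CF` are assumptions made for the example and do not come from a real schema. The output column would most likely also have to be defined on the destination table.

```kql
source
// Tag each event with an owning department based on a hostname prefix.
// 'Computer', the prefixes and 'Department_CF' are hypothetical names.
| extend Department_CF = case(
    Computer startswith 'FIN-', 'Finance',
    Computer startswith 'HR-', 'Human Resources',
    Computer startswith 'ENG-', 'Engineering',
    'Unassigned')  // fallback when no prefix matches
```

Keeping the mapping inline means the sketch needs no external lookup, at the cost of editing the transformation whenever the mapping changes.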

 

The demo below shows how to create and deploy a Data Collection Rule (DCR) that includes an example of the use cases above, and then how the DCR affects the ingested log.

 

 

[Animated demo: creating and deploying a Data Collection Rule (DCR) and its effect on the ingested log]

 

 

Filtering / Enrichment Example:

source
| where Action contains 'REJECT'  // keep only events whose Action contains 'REJECT'; all other events are dropped
| project-away Version, InterfaceId  // remove unneeded fields from the event
| extend Int_Ext_IP_CF = case(toint(tostring(split(SrcAddr, '.')[0])) > 100, 'Internal IP', 'External IP')  // add a custom field that tags the event based on the first octet of the IP address (demo rule, not a real internal/external classification)

 

Masking

Ingestion-time transformations can be used to mask or remove personal information such as Social Security numbers, credit card information, email addresses, etc.

 

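Masking example (masks the first two sections of the SSN):

// Note: inside a DCR JSON template, each backslash below must be escaped (\\d).
source
| extend parsedSSN = split(SSN, '-')  // split the SSN into its three sections
| extend SSN = iif(SSN matches regex @'^\d{3}-\d{2}-\d{4}$'  // must look like NNN-NN-NNNN
    and not(SSN matches regex @'^(000|666|9\d{2})-\d{2}-\d{4}$')  // area numbers 000, 666 and 900-999 are invalid
    and not(SSN matches regex @'^\d{3}-00-\d{4}$')  // group number 00 is invalid
    and not(SSN matches regex @'^\d{3}-\d{2}-0000$'),  // serial number 0000 is invalid
    strcat('XXX', '-', 'XX', '-', tostring(parsedSSN[2])), 'Invalid SSN')
| project-away parsedSSN  // drop the helper column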
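
The same approach can be applied to the other kinds of personal information mentioned above. The following is a minimal sketch for email addresses that hides the mailbox name and keeps the domain, assuming a hypothetical string column named `UserEmail`; it is not taken from the transformations library.

```kql
source
// Keep the domain for analytics but hide the mailbox name.
// 'UserEmail' is a hypothetical column name.
| extend emailParts = split(UserEmail, '@')
| extend UserEmail = iif(UserEmail contains '@',
    strcat('****@', tostring(emailParts[1])),  // e.g. a value like user@contoso.com keeps only its domain part
    'Invalid email')
| project-away emailParts  // drop the helper column
```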

Microsoft Sentinel Transformations Library

As part of this announcement, we are also releasing a library of transformations to help minimize the effort required to adopt these features. You can find this library here: http://aka.ms/sentinel-transforms 

 

Please feel free to raise new issues if you want to provide feedback or if you’d like to see a specific use case added!

Next Steps

For more in-depth information about the new Log Analytics features, and to better understand how to configure ingestion-time transformations and DCR-based custom logs for your data connectors, refer to the following Microsoft Sentinel documentation:

 

What’s New: What's new in Microsoft Sentinel | Microsoft Docs

Conceptual: Custom data ingestion and transformation in Microsoft Sentinel (preview) | Microsoft Docs

How-to: Transform or customize data at ingestion time in Microsoft Sentinel (preview) | Microsoft Docs

Reference: Find your Microsoft Sentinel data connector | Microsoft Docs (includes details about DCR support for each connector)

 

The Public Preview for these new features currently requires registration. To sign up, use the following link: https://aka.ms/CustomLogsPreview

 

Last update: Mar 07 2022 03:45 AM