Thank you to my colleagues Maria de Sousa-Valadas Castano and Adi Biran, and to the Azure Monitor team, for assisting with this content and the demos.
Looking to better manage where logs go when they are ingested? Enter the multi-destination data collection rule.
Recently, the Azure Monitor team released new data collection rule functionality that allows a data ingestion stream to be split across more than one table. It builds on existing functionality and opens up several use cases.
This is a great option for SOC teams and organizations that are looking to break out security valuable data from general information that may exist in the same ingestion stream. This also provides additional functionality for log management, as data split out from the main table can be placed into a custom table that has been configured for lower cost ingestion with the basic log tier.
Using with Basic Logs
Once data has been split out to a custom table, that table is eligible to be moved to the basic log tier as the data is being ingested; this is enabled via the documented process. As a reminder, the basic log tier allows logs to be ingested at $1 per GB versus the full price of the Analytics log tier. The logs remain available for on-demand querying for 8 days before the data moves to archive (if configured).
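To gauge the potential savings, a quick back-of-the-envelope calculation helps. The sketch below uses the $1/GB basic-tier figure from above; the Analytics-tier price used here is a hypothetical placeholder, since the actual price varies by region and commitment tier.

```python
# Rough cost comparison for moving low-value logs to the basic log tier.
# BASIC_PER_GB comes from the $1/GB figure above; ANALYTICS_PER_GB is a
# hypothetical placeholder -- check current Azure Monitor pricing for your region.
BASIC_PER_GB = 1.00
ANALYTICS_PER_GB = 2.76  # assumption, not an official price

def monthly_savings(gb_per_day: float, pct_moved_to_basic: float) -> float:
    """Estimated monthly savings when a fraction of ingest moves to basic logs."""
    moved_gb = gb_per_day * 30 * pct_moved_to_basic
    return moved_gb * (ANALYTICS_PER_GB - BASIC_PER_GB)

# Example: a 50 GB/day syslog stream where 40% of it is low-value noise
print(f"${monthly_savings(50, 0.40):,.2f} saved per month")
```

Even rough numbers like these make it easier to decide which streams are worth splitting.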
A popular example is modifying a Syslog ingest stream so that high-volume, low-value data goes to a custom table. Following the process highlighted in the document above, the template is modified to appear as follows:
{
"properties": {
"dataSources": {
"syslog": [{
"streams": [
"Microsoft-Syslog"
],
"facilityNames": [
"local4"
],
"logLevels": [
"Warning",
"Error",
"Critical",
"Alert",
"Emergency"
],
"name": {
"sysLogsDataSource-1688419672"
}]
},
"destinations": {
"logAnalytics": [{
"workspaceResourceId": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/my-resource-group/providers/Microsoft.OperationalInsights/workspaces/my-workspace",
"workspaceId": "532cd4d7-b4eb-41ad-80dc-f2f6435094e8",
"name": "myworkspace"
}]
},
"dataFlows": [{
"streams": [
"Microsoft-Syslog"
],
"destinations": [
"myworkspace"
],
"transformKql": "source | where SyslogMessage !has 'Scheduled restart job'",
"outputStream": "Microsoft-Syslog"
},
{
"streams": [
"Microsoft-Microsoft-Syslog"
],
"destinations": [
"myworkspace"
],
"transformKql": "source | where SyslogMessage has 'Scheduled restart job' | extend RawData = SyslogMessage",
"outputStream": "Custom-TransformedSyslog_CL"
}
]
}
}
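Conceptually, the two dataFlows above act as a router over each incoming record. The Python below is only an illustrative simulation of that routing, not anything the service runs; the real filtering happens server-side via transformKql, and KQL's `has` is term-based matching, which the `in` substring check here only approximates.

```python
# Simulates the two transformKql filters from the DCR above: records without
# 'Scheduled restart job' land in the standard Syslog table, while the noisy
# records go to the custom table with the message copied into RawData.
def route(record: dict) -> tuple[str, dict]:
    msg = record.get("SyslogMessage", "")
    if "Scheduled restart job" in msg:
        # second dataFlow: source | where ... has ... | extend RawData = SyslogMessage
        return "Custom-TransformedSyslog_CL", {**record, "RawData": msg}
    # first dataFlow: source | where SyslogMessage !has 'Scheduled restart job'
    return "Microsoft-Syslog", record

table, row = route({"Computer": "web01",
                    "SyslogMessage": "Scheduled restart job queued"})
print(table)  # Custom-TransformedSyslog_CL
```

Each record matches exactly one of the two filters, so nothing is duplicated and nothing is dropped.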
How It Works
The new feature uses two existing components within data collection rules: transformKql and outputStream. Users can configure a data collection rule to use transformKql as the logic for selecting the incoming data and outputStream to send that data to a different table. A simplified example looks like:
"dataFlows": [
{
"streams": [
"Custom-MyTableRawData"
],
"destinations": [
"clv2ws1"
],
"transformKql": "source | project TimeGenerated = Time, Computer, SyslogMessage = AdditionalContext",
"outputStream": "Microsoft-Syslog"
},
{
"streams": [
"Custom-MyTableRawData"
],
"destinations": [
"clv2ws1"
],
"transformKql": "source | extend jsonContext = parse_json(AdditionalContext) | project TimeGenerated = Time, Computer, AdditionalContext = jsonContext, ExtendedColumn=tostring(jsonContext.CounterName)",
"outputStream": "Custom-MyTable_CL"
}
]
In this example, data is ingested into the Syslog table as normal (Microsoft-Syslog). The transformKql looks for specific context within the data to determine which logs to send to the custom table (Custom-MyTable_CL). Using this flow, it is possible to ingest data into a main table while breaking off data into other specified tables. Syslog is just one of many tables that can benefit from this functionality.
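The two flows in the simplified example can be sketched in the same way: both read the raw custom stream, one projects a subset of columns into the standard Syslog table, and the other parses the JSON context into the custom table. This is an illustrative Python approximation of what the service-side KQL does (KQL's parse_json is approximated with json.loads).

```python
import json

# Approximates the two dataFlows above: each incoming raw record is shaped
# twice, once per flow, and each shaped copy is sent to a different table.
def apply_flows(raw: dict) -> dict:
    # Flow 1: project TimeGenerated = Time, Computer, SyslogMessage = AdditionalContext
    syslog_row = {
        "TimeGenerated": raw["Time"],
        "Computer": raw["Computer"],
        "SyslogMessage": raw["AdditionalContext"],
    }
    # Flow 2: parse AdditionalContext as JSON and promote CounterName to a column
    ctx = json.loads(raw["AdditionalContext"])
    custom_row = {
        "TimeGenerated": raw["Time"],
        "Computer": raw["Computer"],
        "AdditionalContext": ctx,
        "ExtendedColumn": str(ctx["CounterName"]),
    }
    return {"Microsoft-Syslog": syslog_row, "Custom-MyTable_CL": custom_row}

record = {"Time": "2023-07-03T12:00:00Z", "Computer": "web01",
          "AdditionalContext": '{"CounterName": "% Free Space", "Value": 42}'}
out = apply_flows(record)
```

Note that, unlike the Syslog filter example, both flows here consume every record, so this pattern duplicates data across two tables rather than splitting it.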
Building It Out
Existing DCR
If modifying an existing DCR:
6. Click the 'Deploy' button.
7. Click 'Edit template'.
8. Within the body of the JSON, make the changes to split the table.
9. Once done, click 'Done'.
10. Make sure the required information is correct.
11. Click 'Review + create'.
12. Once validation passes, click 'Create'.
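Before clicking 'Review + create', it can save a failed deployment to sanity-check the edited template locally. A minimal sketch, assuming the property names shown in the DCR snippets above (the file name in the usage comment is hypothetical):

```python
import json

# Sanity-check an edited DCR template before deploying it in the portal:
# every dataFlow should reference a declared destination and name an outputStream.
def check_dcr(template: dict) -> list[str]:
    problems = []
    props = template.get("properties", {})
    dest_names = {d["name"]
                  for dests in props.get("destinations", {}).values()
                  for d in dests}
    for i, flow in enumerate(props.get("dataFlows", [])):
        for dest in flow.get("destinations", []):
            if dest not in dest_names:
                problems.append(f"dataFlows[{i}]: unknown destination '{dest}'")
        if "outputStream" not in flow:
            problems.append(f"dataFlows[{i}]: missing outputStream")
    return problems

# Usage (hypothetical file name):
#   problems = check_dcr(json.load(open("dcr-template.json")))
#   assert not problems, problems
```

Running json.load on the edited file also catches plain syntax errors (a stray brace or missing comma) before the portal's validation step does.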
New DCR
If creating a new DCR, create it via the wizard in Azure Monitor, then edit its template following the same steps above to add the additional data flow.
Things to Consider
Types of Data Collection Rules
There are three types of data collection rules today: standard DCRs created via the Azure Monitor wizard, workspace transformation (default) DCRs, and custom log DCRs. WorkspaceTransform rules, also referred to as default rules, are tied to tables that ingest data that is not coming from the Azure Monitor Agent (AMA). If ingesting data via methods not tied to AMA, default DCRs should be used; the instructions for them can be found in the Azure Monitor documentation. If ingesting data via AMA, DCRs created via the wizard in Azure Monitor should be used. Custom log rules are created when establishing a new table within the workspace; the documentation covers how to create one.
Custom Tables
If looking to leverage a custom table as one of the output destinations, that table must be created before the split is configured. Attempting to send data to a custom table that does not exist will cause the DCR deployment to fail with an error.
Excluded Tables
Most streams can be used as input or output, but a small set of tables is excluded from this functionality; consult the Azure Monitor documentation for the current list.
And that's it. This scenario is just another example of the expanding use-case library for AMA and DCRs in combination with Microsoft Sentinel. Hopefully it helps with breaking down larger tables and improving cost management and query performance.