Questions about ingestion-time data transformation


Hi,


We are building a custom collector that gathers data from several sources such as ETW, Event Logs, and TCP activity (yes, yet another Filebeat :)) and normalizes the output into ASIM format, following the target schemas of the ASIM tables.

But I see that ingesting directly into the ASIM tables is not allowed via the Log Analytics API. In one of the YouTube videos (from 3 years ago), I heard that support would be coming. Is it still not supported?

I am a simple-minded person. My idea was that if I normalize the data the way ASIM suggests, I can ingest it into the ASIM tables, so Sentinel can start doing its magic out of the box. But from the documentation, I see that normalized data should go into a custom table (or maybe a standard table) and from there, via unifying parsers, into the ASIM tables. Is that how it works today? Why add another parser on top of already normalized data?

Thanks in advance.

2 Replies

I can answer my own question after diving deeper into Sentinel's mechanics.


As of September 2024, Sentinel does not support direct ingestion into ASIM tables. The only way to achieve this is by using a DCR (Data Collection Rule), which pushes data to custom tables, and through the DCR, the data is pushed to ASIM tables. Therefore, a DCR must always be in place to accept data into ASIM.

I hope this changes in the future because if the custom table and ASIM table follow the exact same model, it doesn't make sense to use a "source" transformer.

Sentinel might eventually require a form of Materialized View to ingest data into ASIM tables, but this materialized view could be hardcoded by Sentinel. In this case, if data can fit into that queue table, it can be moved to the main distributed table.

Hey @yusufozturk 

What you are trying to do is supported and it's a great way of getting a subset of events from Big Data systems into Sentinel.  I have been using Azure Data Explorer with a Function App to do this (it's such a great product)!  If anyone was starting out, looking for a way to get rid of Splunk and ELK Stack completely while still having big data capability that can send alerts to Splunk, ADX with DCRs is an exceptional solution.  I honestly don't think there is another columnar database that can compete with ADX on performance / cost / compression.

Anyway - this is what you were hoping to see:

[Image: Laurie_Rhodes_0-1728286844276.png]

Think of the Data Collection Rule as a pipeline rather than something that "pushes" data anywhere. All data comes into it as JSON (strings and integers), and the pipeline has to transform those data types to match what the ASIM schema in Log Analytics expects. This stream of data has to be labelled as a "custom" stream, but this isn't a custom log ("_CL") table! Within the DCR, Azure Monitor lets you output that data into the standard ASIM tables today.

I'll give you an example of one of the ASIM DCR rules (with my subscription data removed of course). 🙂


  • Make sure you have granted the Monitoring Metrics Publisher role to the service account you are using to submit data on the DCR.
  • Take note of the name "Custom-ASimFileEventLogs" I've used in the stream declaration as you need to use that name when you submit your events to the DCE.
  • Be patient - it can take up to 3 minutes for the event to arrive in Log Analytics.
  • At the end of this reply I'll post a PowerShell script to test sending events to the DCR
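
For the first bullet, a minimal sketch of the role assignment might look like the following. It assumes the Az.Resources module and a signed-in session; the placeholder values and the `$principalId` variable are illustrative and not taken from the rule in this reply. The scope string follows the same shape as the "id" property of the DCR. Note that `ObjectId` expects the object ID of the service principal, not the application (client) ID.

```powershell
# Placeholder values: replace with your own. All names here are illustrative.
$subscriptionId = '<DCRSubscription>'
$resourceGroup  = '<DCRRG>'
$dcrName        = 'write-to-ASimFileEventLogs'
$principalId    = '<service-principal-object-id>'   # hypothetical placeholder

# Resource ID of the DCR, in the same shape as the "id" property of the rule.
$dcrScope = "/subscriptions/$subscriptionId/resourceGroups/$resourceGroup" +
            "/providers/Microsoft.Insights/dataCollectionRules/$dcrName"

# Parameters for the role assignment, scoped to the DCR only.
$roleParams = @{
    ObjectId           = $principalId
    RoleDefinitionName = 'Monitoring Metrics Publisher'
    Scope              = $dcrScope
}

# Requires the Az.Resources module and Connect-AzAccount, so the call is
# left commented out in this sketch.
# New-AzRoleAssignment @roleParams

$roleParams.Scope
```

Scoping the assignment to the DCR itself, rather than the resource group or subscription, keeps the identity limited to submitting data through this one rule.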

#### JSON Representation of DCR

{
"properties": {
"immutableId": "dcr-XXXXXXXXXXXXXXXXXXXXXXX",
"dataCollectionEndpointId": "/subscriptions/<DCE-SUBSCRIPTION>/resourceGroups/<DCE-RG>/providers/Microsoft.Insights/dataCollectionEndpoints/<DCE-ENDPOINT-ID>",
"streamDeclarations": {
"Custom-ASimFileEventLogs": {
"columns": [
{
"name": "TenantId",
"type": "string"
},
{
"name": "TimeGenerated",
"type": "datetime"
},
{
"name": "EventType",
"type": "string"
},
{
"name": "SourceSystem",
"type": "string"
},
{
"name": "EventMessage",
"type": "string"
},
{
"name": "RuleName",
"type": "string"
},
{
"name": "EventCount",
"type": "int"
},
{
"name": "EventStartTime",
"type": "datetime"
},
{
"name": "EventEndTime",
"type": "datetime"
},
{
"name": "TargetFilePath",
"type": "string"
},
{
"name": "EventSubType",
"type": "string"
},
{
"name": "TargetFilePathType",
"type": "string"
},
{
"name": "EventResult",
"type": "string"
},
{
"name": "TargetFileName",
"type": "string"
},
{
"name": "EventResultDetails",
"type": "string"
},
{
"name": "HashType",
"type": "string"
},
{
"name": "EventOriginalUid",
"type": "string"
},
{
"name": "SrcFileName",
"type": "string"
},
{
"name": "EventOriginalType",
"type": "string"
},
{
"name": "SrcFilePath",
"type": "string"
},
{
"name": "SrcFilePathType",
"type": "string"
},
{
"name": "EventOriginalSubType",
"type": "string"
},
{
"name": "TargetFileCreationTime",
"type": "datetime"
},
{
"name": "EventOriginalResultDetails",
"type": "string"
},
{
"name": "TargetFileDirectory",
"type": "string"
},
{
"name": "EventSeverity",
"type": "string"
},
{
"name": "TargetFileExtension",
"type": "string"
},
{
"name": "EventOriginalSeverity",
"type": "string"
},
{
"name": "TargetFileMimeType",
"type": "string"
},
{
"name": "TargetFileMD5",
"type": "string"
},
{
"name": "EventProduct",
"type": "string"
},
{
"name": "EventProductVersion",
"type": "string"
},
{
"name": "TargetFileSHA1",
"type": "string"
},
{
"name": "TargetFileSHA256",
"type": "string"
},
{
"name": "EventVendor",
"type": "string"
},
{
"name": "TargetFileSHA512",
"type": "string"
},
{
"name": "EventSchemaVersion",
"type": "string"
},
{
"name": "TargetFileSize",
"type": "long"
},
{
"name": "EventOwner",
"type": "string"
},
{
"name": "SrcFileCreationTime",
"type": "datetime"
},
{
"name": "EventReportUrl",
"type": "string"
},
{
"name": "SrcFileDirectory",
"type": "string"
},
{
"name": "RuleNumber",
"type": "int"
},
{
"name": "SrcFileExtension",
"type": "string"
},
{
"name": "ThreatId",
"type": "string"
},
{
"name": "SrcFileMimeType",
"type": "string"
},
{
"name": "ThreatName",
"type": "string"
},
{
"name": "SrcFileMD5",
"type": "string"
},
{
"name": "SrcFileSHA1",
"type": "string"
},
{
"name": "ThreatCategory",
"type": "string"
},
{
"name": "SrcFileSHA256",
"type": "string"
},
{
"name": "ThreatRiskLevel",
"type": "int"
},
{
"name": "SrcFileSHA512",
"type": "string"
},
{
"name": "ThreatOriginalRiskLevel",
"type": "string"
},
{
"name": "SrcFileSize",
"type": "long"
},
{
"name": "ThreatConfidence",
"type": "int"
},
{
"name": "ActingProcessCommandLine",
"type": "string"
},
{
"name": "ThreatOriginalConfidence",
"type": "string"
},
{
"name": "ActingProcessName",
"type": "string"
},
{
"name": "ThreatIsActive",
"type": "string"
},
{
"name": "ActingProcessId",
"type": "string"
},
{
"name": "ThreatFirstReportedTime",
"type": "datetime"
},
{
"name": "ActingProcessGuid",
"type": "string"
},
{
"name": "ThreatLastReportedTime",
"type": "datetime"
},
{
"name": "NetworkApplicationProtocol",
"type": "string"
},
{
"name": "ThreatField",
"type": "string"
},
{
"name": "ThreatFilePath",
"type": "string"
},
{
"name": "DvcIpAddr",
"type": "string"
},
{
"name": "DvcHostname",
"type": "string"
},
{
"name": "DvcDomain",
"type": "string"
},
{
"name": "DvcDomainType",
"type": "string"
},
{
"name": "DvcFQDN",
"type": "string"
},
{
"name": "DvcDescription",
"type": "string"
},
{
"name": "DvcId",
"type": "string"
},
{
"name": "DvcIdType",
"type": "string"
},
{
"name": "DvcMacAddr",
"type": "string"
},
{
"name": "DvcZone",
"type": "string"
},
{
"name": "DvcOs",
"type": "string"
},
{
"name": "DvcOsVersion",
"type": "string"
},
{
"name": "DvcAction",
"type": "string"
},
{
"name": "DvcOriginalAction",
"type": "string"
},
{
"name": "DvcInterface",
"type": "string"
},
{
"name": "DvcScopeId",
"type": "string"
},
{
"name": "DvcScope",
"type": "string"
},
{
"name": "ActorUserId",
"type": "string"
},
{
"name": "ActorUserAadId",
"type": "string"
},
{
"name": "HttpUserAgent",
"type": "string"
},
{
"name": "AdditionalFields",
"type": "string"
},
{
"name": "ActorUserSid",
"type": "string"
},
{
"name": "ActorUserIdType",
"type": "string"
},
{
"name": "ActorScopeId",
"type": "string"
},
{
"name": "ActorScope",
"type": "string"
},
{
"name": "ActorUsername",
"type": "string"
},
{
"name": "ActorUsernameType",
"type": "string"
},
{
"name": "ActorUserType",
"type": "string"
},
{
"name": "ActorOriginalUserType",
"type": "string"
},
{
"name": "ActorSessionId",
"type": "string"
},
{
"name": "TargetAppId",
"type": "string"
},
{
"name": "TargetAppName",
"type": "string"
},
{
"name": "TargetAppType",
"type": "string"
},
{
"name": "TargetOriginalAppType",
"type": "string"
},
{
"name": "TargetUrl",
"type": "string"
},
{
"name": "SrcIpAddr",
"type": "string"
},
{
"name": "SrcPortNumber",
"type": "int"
},
{
"name": "SrcHostname",
"type": "string"
},
{
"name": "SrcDomain",
"type": "string"
},
{
"name": "SrcDomainType",
"type": "string"
},
{
"name": "SrcFQDN",
"type": "string"
},
{
"name": "SrcDescription",
"type": "string"
},
{
"name": "SrcDvcId",
"type": "string"
},
{
"name": "SrcDvcIdType",
"type": "string"
},
{
"name": "SrcDvcScopeId",
"type": "string"
},
{
"name": "SrcDvcScope",
"type": "string"
},
{
"name": "SrcDeviceType",
"type": "string"
},
{
"name": "SrcGeoCountry",
"type": "string"
},
{
"name": "SrcGeoLatitude",
"type": "string"
},
{
"name": "SrcGeoLongitude",
"type": "string"
},
{
"name": "SrcGeoRegion",
"type": "string"
},
{
"name": "SrcGeoCity",
"type": "string"
},
{
"name": "SrcRiskLevel",
"type": "int"
},
{
"name": "SrcOriginalRiskLevel",
"type": "string"
},
{
"name": "SrcMacAddr",
"type": "string"
},
{
"name": "EventSchema",
"type": "string"
}
]
}
},
"destinations": {
"logAnalytics": [
{
"workspaceResourceId": "/subscriptions/<SentinelSubscription>/resourcegroups/<SentinelRG>/providers/microsoft.operationalinsights/workspaces/<SentinelWorkspaceId>",
"workspaceId": "<SentinelWorkspaceId>",
"name": "Sentinel-ASimFileEventLogs"
}
]
},
"dataFlows": [
{
"streams": [
"Custom-ASimFileEventLogs"
],
"destinations": [
"Sentinel-ASimFileEventLogs"
],
"transformKql": "source | project TenantId= toguid(TenantId), TimeGenerated= todatetime(TimeGenerated), EventType, SourceSystem, EventMessage, RuleName, EventCount= toint(EventCount), EventStartTime= todatetime(EventStartTime), EventEndTime= todatetime(EventEndTime), TargetFilePath, EventSubType, TargetFilePathType, EventResult, TargetFileName, EventResultDetails, HashType, EventOriginalUid, SrcFileName, EventOriginalType, SrcFilePath, SrcFilePathType, EventOriginalSubType, TargetFileCreationTime= todatetime(TargetFileCreationTime), EventOriginalResultDetails, TargetFileDirectory, EventSeverity, TargetFileExtension, EventOriginalSeverity, TargetFileMimeType, TargetFileMD5, EventProduct, EventProductVersion, TargetFileSHA1, TargetFileSHA256, EventVendor, TargetFileSHA512, EventSchemaVersion, TargetFileSize= tolong(TargetFileSize), EventOwner, SrcFileCreationTime= todatetime(SrcFileCreationTime), EventReportUrl, SrcFileDirectory, RuleNumber= toint(RuleNumber), SrcFileExtension, ThreatId, SrcFileMimeType, ThreatName, SrcFileMD5, SrcFileSHA1, ThreatCategory, SrcFileSHA256, ThreatRiskLevel= toint(ThreatRiskLevel), SrcFileSHA512, ThreatOriginalRiskLevel, SrcFileSize= tolong(SrcFileSize), ThreatConfidence= toint(ThreatConfidence), ActingProcessCommandLine, ThreatOriginalConfidence, ActingProcessName, ThreatIsActive= tobool(ThreatIsActive), ActingProcessId, ThreatFirstReportedTime= todatetime(ThreatFirstReportedTime), ActingProcessGuid, ThreatLastReportedTime= todatetime(ThreatLastReportedTime), NetworkApplicationProtocol, ThreatField, ThreatFilePath, DvcIpAddr, DvcHostname, DvcDomain, DvcDomainType, DvcFQDN, DvcDescription, DvcId, DvcIdType, DvcMacAddr, DvcZone, DvcOs, DvcOsVersion, DvcAction, DvcOriginalAction, DvcInterface, DvcScopeId, DvcScope, ActorUserId, ActorUserAadId, HttpUserAgent, AdditionalFields= todynamic(AdditionalFields), ActorUserSid, ActorUserIdType, ActorScopeId, ActorScope, ActorUsername, ActorUsernameType, ActorUserType, ActorOriginalUserType, ActorSessionId, TargetAppId, TargetAppName, TargetAppType, TargetOriginalAppType, TargetUrl, SrcIpAddr, SrcPortNumber= toint(SrcPortNumber), SrcHostname, SrcDomain, SrcDomainType, SrcFQDN, SrcDescription, SrcDvcId, SrcDvcIdType, SrcDvcScopeId, SrcDvcScope, SrcDeviceType, SrcGeoCountry, SrcGeoLatitude= todouble(SrcGeoLatitude), SrcGeoLongitude= todouble(SrcGeoLongitude), SrcGeoRegion, SrcGeoCity, SrcRiskLevel= toint(SrcRiskLevel), SrcOriginalRiskLevel, SrcMacAddr, EventSchema",
"outputStream": "Microsoft-ASimFileEventLogs"
}
]
},
"location": "AustraliaEast",
"id": "/subscriptions/<DCRSubscription>/resourceGroups/<DCRRG>/providers/Microsoft.Insights/dataCollectionRules/write-to-ASimFileEventLogs",
"name": "write-to-ASimFileEventLogs",
"type": "Microsoft.Insights/dataCollectionRules"
}
}
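
The transformKql in the rule above is mostly a long list of casts, so it can be generated rather than typed by hand. The sketch below is one way to do that, not how the rule above was produced. `New-TransformKql` is a hypothetical helper, and the type names in the map (`guid`, `real`, `bool`, `dynamic`) are my own labels for the target table types. The conversion functions are the ones already used in the rule above.

```powershell
# Hypothetical helper: build a DCR transformKql string from an ordered map of
# column name -> target table type. Not part of any Azure module.
function New-TransformKql {
    param(
        [Parameter(Mandatory)]
        [System.Collections.Specialized.OrderedDictionary] $Columns
    )

    # KQL conversion function for each non-string target type.
    $casts = @{
        guid     = 'toguid'
        datetime = 'todatetime'
        int      = 'toint'
        long     = 'tolong'
        real     = 'todouble'
        bool     = 'tobool'
        dynamic  = 'todynamic'
    }

    $parts = foreach ($name in $Columns.Keys) {
        $type = $Columns[$name]
        if ($casts.ContainsKey($type)) {
            "$name= $($casts[$type])($name)"   # cast to the target type
        } else {
            $name                              # strings pass through unchanged
        }
    }

    'source | project ' + ($parts -join ', ')
}

# A small subset of the ASimFileEventLogs columns, for illustration only.
$columns = [ordered]@{
    TenantId       = 'guid'
    TimeGenerated  = 'datetime'
    EventType      = 'string'
    EventCount     = 'int'
    TargetFileSize = 'long'
    ThreatIsActive = 'bool'
}

$kql = New-TransformKql -Columns $columns
$kql
```

Generating the string this way also keeps the whole transform on a single line, which matters because a JSON string value cannot contain a raw line break.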


####### Testing the DCR

### Testing ASimFileEventLogs

$tenantId = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
$appId = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
$appSecret = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'

$dcrImmutableId = "dcr-XXXXXXXXXXXXXXXXXXXXXX" #the immutableId property of the DCR object
$dceEndpoint = "https://<rhubarb>.australiasoutheast-1.ingest.monitor.azure.com"

$DCRStream = 'Custom-ASimFileEventLogs' # Must match your DCR


#Authenticate for a token

Add-Type -AssemblyName System.Web

$scope= [System.Web.HttpUtility]::UrlEncode("https://monitor.azure.com//.default")


$body = "client_id=$appId&scope=$scope&client_secret=$appSecret&grant_type=client_credentials";
$headers = @{"Content-Type"="application/x-www-form-urlencoded"};
$uri = "https://login.microsoftonline.com/$tenantId/oauth2/v2.0/token"

$bearerToken = (Invoke-RestMethod -Uri $uri -Method "Post" -Body $body -Headers $headers).access_token

# Invent some test data with ChatGPT!

$StartTime = Get-Date ([datetime]::UtcNow) -Format O


$EventData = @"
[ {
"TenantId": "00000000-0000-0000-0000-000000000000",
"TimeGenerated": "$($StartTime)",
"EventStartTime": "$($StartTime)",
"EventEndTime": "$($StartTime)",
"EventType": "FileAccess",
"SourceSystem": "Windows",
"EventMessage": "User accessed the file example.txt",
"RuleName": "File Access Rule",
"EventCount": 1,
"TargetFilePath": "C:\\Users\\User\\Documents\\example.txt",
"EventSubType": "Read",
"TargetFilePathType": "LocalDisk",
"EventResult": "Success",
"TargetFileName": "example.txt",
"EventResultDetails": "File accessed successfully",
"HashType": "MD5",
"EventOriginalUid": "evt-1234567890",
"SrcFileName": "example.txt",
"EventOriginalType": "FileAccess",
"SrcFilePath": "C:\\Users\\User\\Documents",
"SrcFilePathType": "LocalDisk",
"EventOriginalSubType": "Read",
"TargetFileCreationTime": "2024-10-01T09:00:00Z",
"EventOriginalResultDetails": "Success",
"TargetFileDirectory": "C:\\Users\\User\\Documents",
"EventSeverity": "Low",
"TargetFileExtension": "txt",
"EventOriginalSeverity": "Low",
"TargetFileMimeType": "text/plain",
"TargetFileMD5": "d41d8cd98f00b204e9800998ecf8427e",
"EventProduct": "Windows Defender",
"EventProductVersion": "1.2.3.4",
"TargetFileSHA1": "da39a3ee5e6b4b0d3255bfef95601890afd80709",
"TargetFileSHA256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
"EventVendor": "Microsoft",
"TargetFileSHA512": "cf83e1357eefb8bdf1542850d66d8007d620e4050b5711bb1fa635bb8a50c0cf1...",
"EventSchemaVersion": "2.0",
"TargetFileSize": 1024,
"EventOwner": "User",
"SrcFileCreationTime": "2024-10-01T09:00:00Z",
"EventReportUrl": "https://example.com/report",
"SrcFileDirectory": "C:\\Users\\User\\Documents",
"RuleNumber": 101,
"SrcFileExtension": "txt",
"ThreatId": "THREAT-12345",
"SrcFileMimeType": "text/plain",
"ThreatName": "FileAccessThreat",
"SrcFileMD5": "d41d8cd98f00b204e9800998ecf8427e",
"SrcFileSHA1": "da39a3ee5e6b4b0d3255bfef95601890afd80709",
"ThreatCategory": "FileAccess",
"SrcFileSHA256": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
"ThreatRiskLevel": 2,
"SrcFileSHA512": "cf83e1357eefb8bdf1542850d66d8007d620e4050b5711bb1fa635bb8a50c0cf1...",
"ThreatOriginalRiskLevel": "Medium",
"SrcFileSize": 1024,
"ThreatConfidence": 80,
"ActingProcessCommandLine": "C:\\Windows\\System32\\cmd.exe /c type example.txt",
"ThreatOriginalConfidence": "High",
"ActingProcessName": "cmd.exe",
"ThreatIsActive": "True",
"ActingProcessId": "1234",
"ThreatFirstReportedTime": "2024-10-01T10:00:00Z",
"ActingProcessGuid": "process-1234-5678",
"ThreatLastReportedTime": "2024-10-07T12:30:00Z",
"NetworkApplicationProtocol": "TCP",
"ThreatField": "File Access",
"ThreatFilePath": "C:\\Users\\User\\Documents\\example.txt",
"DvcIpAddr": "192.168.1.10",
"DvcHostname": "User-PC",
"DvcDomain": "WORKGROUP",
"DvcDomainType": "Local",
"DvcFQDN": "User-PC.local",
"DvcDescription": "User's Personal Computer",
"DvcId": "PC-12345",
"DvcIdType": "UUID",
"DvcMacAddr": "00:1A:2B:3C:4D:5E",
"DvcZone": "Internal",
"DvcOs": "Windows 10",
"DvcOsVersion": "10.0.19042",
"DvcAction": "File Read",
"DvcOriginalAction": "Read",
"DvcInterface": "Ethernet",
"DvcScopeId": "Scope-001",
"DvcScope": "Internal Network",
"ActorUserId": "user123",
"ActorUserAadId": "aad-user-12345",
"HttpUserAgent": "Mozilla/5.0",
"AdditionalFields": "{\"CustomField1\": \"Value1\", \"CustomField2\": \"Value2\"}",
"ActorUserSid": "S-1-5-21-1234567890-123456789-1234567890-1001",
"ActorUserIdType": "LocalUser",
"ActorScopeId": "Scope-001",
"ActorScope": "Internal",
"ActorUsername": "user123",
"ActorUsernameType": "Local",
"ActorUserType": "StandardUser",
"ActorOriginalUserType": "StandardUser",
"ActorSessionId": "Session-1234",
"TargetAppId": "App-001",
"TargetAppName": "Text Editor",
"TargetAppType": "Desktop",
"TargetOriginalAppType": "Desktop",
"TargetUrl": "file://C:/Users/User/Documents/example.txt",
"SrcIpAddr": "192.168.1.10",
"SrcPortNumber": 445,
"SrcHostname": "User-PC",
"SrcDomain": "WORKGROUP",
"SrcDomainType": "Local",
"SrcFQDN": "User-PC.local",
"SrcDescription": "User's Personal Computer",
"SrcDvcId": "PC-12345",
"SrcDvcIdType": "UUID",
"SrcDvcScopeId": "Scope-001",
"SrcDvcScope": "Internal Network",
"SrcDeviceType": "Laptop",
"SrcGeoCountry": "Australia",
"SrcGeoLatitude": "-33.8688",
"SrcGeoLongitude": "151.2093",
"SrcGeoRegion": "NSW",
"SrcGeoCity": "Sydney",
"SrcRiskLevel": 1,
"SrcOriginalRiskLevel": "Low",
"SrcMacAddr": "00:1A:2B:3C:4D:5E",
"EventSchema": "ASimFileEventLogs"
}
]
"@

# Send the data to Log Analytics

$headers = @{"Authorization"="Bearer $($bearerToken)";"Content-Type"="application/json"}
$uri = "$($dceEndpoint)/dataCollectionRules/$($dcrImmutableId)/streams/$($DCRStream)?api-version=2021-11-01-preview"

$uploadResponse = Invoke-RestMethod -Uri $uri -Method "Post" -Body $EventData -Headers $headers
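
The script above sends a single event, but a real collector will submit many. The Logs Ingestion API limits the size of each call (documented as 1 MB at the time of writing; please check the current service limits), so events need to be grouped into batches. The following is a minimal sketch of that grouping. `Split-RecordBatch` is a hypothetical helper, and the size estimate is approximate because it serializes each record on its own.

```powershell
# Hypothetical helper: group records into batches whose serialized JSON stays
# under a byte budget. The size is estimated per record, so it is approximate.
function Split-RecordBatch {
    param(
        [Parameter(Mandatory)] [object[]] $Records,
        [int] $MaxBytes = 1MB
    )

    $batches      = [System.Collections.Generic.List[object]]::new()
    $current      = [System.Collections.Generic.List[object]]::new()
    $currentBytes = 2   # the enclosing "[" and "]"

    foreach ($record in $Records) {
        $json  = $record | ConvertTo-Json -Compress -Depth 5
        $bytes = [System.Text.Encoding]::UTF8.GetByteCount($json) + 1   # +1 for the comma

        # Start a new batch when adding this record would exceed the budget.
        if ($current.Count -gt 0 -and ($currentBytes + $bytes) -gt $MaxBytes) {
            $batches.Add($current.ToArray())
            $current      = [System.Collections.Generic.List[object]]::new()
            $currentBytes = 2
        }

        $current.Add($record)
        $currentBytes += $bytes
    }

    if ($current.Count -gt 0) { $batches.Add($current.ToArray()) }

    # The leading comma stops PowerShell from unrolling the outer array.
    return ,$batches.ToArray()
}

# Five small records and a deliberately tiny budget, to show the split.
$records = 1..5 | ForEach-Object { @{ Msg = ('x' * 40) } }
$batches = Split-RecordBatch -Records $records -MaxBytes 120
$batches | ForEach-Object { $_.Count }
```

Each batch could then be serialized with `ConvertTo-Json -InputObject $batch -Depth 5` and posted with the same `Invoke-RestMethod` call as above. `Invoke-RestMethod` throws on a non-success status code, so wrapping the call in `try`/`catch` is a simple way to detect rejected batches.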