SOLVED

Parsing XML in Azure Sentinel


@CliveWatson I wonder if you can give me some pointers for how to parse XML syslog information in Azure Sentinel?

 

Here is a sample of the redacted syslog message, with the XML formatted for readability:

 

05:19.0Z Some-Server-Name Events - EventFwd [agentInfo@3401 tenantId="0" bpsId="0" tenantGUID="{00000000-0000-0000-0000-000000000000}" tenantNodePath="1\2"] <?xml version="1.0" encoding="utf-8"?>
<UpdateEvents>
    <MachineInfo>
        <AgentGUID>{00000000-0000-0000-0000-000000000000}</AgentGUID>
        <MachineName>Some-Machine</MachineName>
        <RawMACAddress>112233445566</RawMACAddress>
        <IPAddress>1.1.2.3</IPAddress>
        <AgentVersion>1.2.3.123</AgentVersion>
        <OSName>Windows 41</OSName>
        <TimeZoneBias>-10</TimeZoneBias>
        <UserName>myName</UserName>
    </MachineInfo>
    <BrandCommonUpdater ProductName="Brand Agent" ProductVersion="1.0.0" ProductFamily="AVP">
        <UpdateEvent>
            <EventID>1234</EventID>
            <Severity>0</Severity>
            <GMTTime>2020-00-00T06:41:02</GMTTime>
            <ProductID>SomeName1999</ProductID>
            <Locale>0001</Locale>
            <Error>0</Error>
            <Type>SomeCore</Type>
            <Version>1234.0</Version>
            <InitiatorID>SOMEAGENT3000</InitiatorID>
            <InitiatorType>OnDemand</InitiatorType>
            <SiteName>Some-Server-Name</SiteName>
            <Description>N/A</Description>
        </UpdateEvent>
    </BrandCommonUpdater>
</UpdateEvents> 

Many thanks
8 Replies
The raw string looks like this:
 
05:19.0Z Some-Server-Name Events - EventFwd [agentInfo@3401 tenantId="0" bpsId="0" tenantGUID="{00000000-0000-0000-0000-000000000000}" tenantNodePath="1\2"] <?xml version="1.0" encoding="utf-8"?><UpdateEvents><MachineInfo><AgentGUID>{00000000-0000-0000-0000-000000000000}</AgentGUID><MachineName>Some-Machine</MachineName><RawMACAddress>112233445566</RawMACAddress><IPAddress>1.1.2.3</IPAddress><AgentVersion>1.2.3.123</AgentVersion><OSName>Windows 41</OSName><TimeZoneBias>-10</TimeZoneBias><UserName>myName</UserName></MachineInfo><BrandCommonUpdater ProductName="Brand Agent" ProductVersion="1.0.0" ProductFamily="AVP"><UpdateEvent><EventID>1234</EventID><Severity>0</Severity><GMTTime>2020-00-00T06:41:02</GMTTime><ProductID>SomeName1999</ProductID><Locale>0001</Locale><Error>0</Error><Type>SomeCore</Type><Version>1234.0</Version><InitiatorID>SOMEAGENT3000</InitiatorID><InitiatorType>OnDemand</InitiatorType><SiteName>Some-Server-Name</SiteName><Description>N/A</Description></UpdateEvent></BrandCommonUpdater></UpdateEvents>
 
So far I have this KQL, which at least queries the computer and returns a table of just the Syslog message:
 
Syslog
| where Computer contains "Some-Server-Name"
| project SyslogMessage
| extend NewField=parse_xml(SyslogMessage)
 

@TS-noodlemctwoodle Take a look at the parse_xml() function. Sorry, I don't have an example to give you.

 

https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/parse-xmlfunction

@TS-noodlemctwoodle 

 

SecurityEvent
| project EventData
| extend NewField=parse_xml(EventData)
| extend value=NewField.UserData
| where isnotempty(value)
| project value.RuleAndFileData.FilePath

 

I don't have a Syslog example, but this works  
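
For what it's worth, the same pattern can be tried against a literal string before pointing it at a table. The sketch below is only illustrative: the XML is a cut-down fragment of the sample message, and the column names xmltext and doc are made up for the example. As far as I know, parse_xml() returns null when its input is not well-formed XML, which would explain an empty result when the syslog header is still in front of the XML.

```kusto
// Cut-down fragment of the sample message (illustrative only, no syslog header)
print xmltext = '<UpdateEvents><MachineInfo><MachineName>Some-Machine</MachineName><IPAddress>1.1.2.3</IPAddress></MachineInfo></UpdateEvents>'
// parse_xml() turns the text into a dynamic object keyed by element name
| extend doc = parse_xml(xmltext)
// Walk the element path with dot notation and cast the leaves to string
| project
    MachineName = tostring(doc.UpdateEvents.MachineInfo.MachineName),
    IPAddress   = tostring(doc.UpdateEvents.MachineInfo.IPAddress)
```

The path after doc mirrors the element nesting, just as value.RuleAndFileData.FilePath does in the SecurityEvent query.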

@CliveWatson 

 

Could you show me how to adapt your SecurityEvent example to Syslog, using the sample message?

 

@Gary Bushey 

I looked at this documentation, although I don't fully understand the examples provided :|

 

I also looked at this post https://www.systemcenterautomation.com/2020/01/extracting-nested-fields-kusto/ but I haven't been able to replicate the output with the data I have.

@TS-noodlemctwoodle 

 

If you only need a few fields, one option is the parse operator, e.g.

 

print syslogmsg = '05:19.0Z Some-Server-Name Events - EventFwd [agentInfo@3401 tenantId="0" bpsId="0" tenantGUID="{00000000-0000-0000-0000-000000000000}" tenantNodePath="1\2"] <?xml version="1.0" encoding="utf-8"?><UpdateEvents><MachineInfo><AgentGUID>{00000000-0000-0000-0000-000000000000}</AgentGUID><MachineName>Some-Machine</MachineName><RawMACAddress>112233445566</RawMACAddress><IPAddress>1.1.2.3</IPAddress><AgentVersion>1.2.3.123</AgentVersion><OSName>Windows 41</OSName><TimeZoneBias>-10</TimeZoneBias><UserName>myName</UserName></MachineInfo><BrandCommonUpdater ProductName="Brand Agent" ProductVersion="1.0.0" ProductFamily="AVP"><UpdateEvent><EventID>1234</EventID><Severity>0</Severity><GMTTime>2020-00-00T06:41:02</GMTTime><ProductID>SomeName1999</ProductID><Locale>0001</Locale><Error>0</Error><Type>SomeCore</Type><Version>1234.0</Version><InitiatorID>SOMEAGENT3000</InitiatorID><InitiatorType>OnDemand</InitiatorType><SiteName>Some-Server-Name</SiteName><Description>N/A</Description></UpdateEvent></BrandCommonUpdater></UpdateEvents>'
| parse syslogmsg with *" EventFwd [" str " tenantId="*
| project str

 


str
agentInfo@3401
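
The same operator can pull more than one value in a single pass by using the XML tags themselves as delimiters. This is a rough sketch, assuming the tags always appear in this order in the message. The literal is shortened from the sample, and it uses a verbatim string (@'...') so the backslash in tenantNodePath is taken literally.

```kusto
// Shortened copy of the sample message (illustrative only)
print syslogmsg = @'05:19.0Z Some-Server-Name Events - EventFwd [agentInfo@3401 tenantId="0" bpsId="0" tenantGUID="{00000000-0000-0000-0000-000000000000}" tenantNodePath="1\2"] <?xml version="1.0" encoding="utf-8"?><UpdateEvents><MachineInfo><MachineName>Some-Machine</MachineName><RawMACAddress>112233445566</RawMACAddress><IPAddress>1.1.2.3</IPAddress></MachineInfo></UpdateEvents>'
// Each quoted tag is a delimiter; the name between two delimiters becomes a string column
| parse syslogmsg with * "<MachineName>" MachineName "</MachineName>" * "<IPAddress>" IPAddress "</IPAddress>" *
| project MachineName, IPAddress
```

This avoids parse_xml() altogether, but it depends on the element order staying fixed, so it is probably best kept for a handful of fields.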

 

Is the whole string in SyslogMessage, as in the print statement above?

@CliveWatson Yes, the whole string is in SyslogMessage, as in the print statement.

 

Would it be possible for you to show me how to extract the data values that come after this header?

05:19.0Z Some-Server-Name Events - EventFwd [agentInfo@3401 tenantId="0" bpsId="0" tenantGUID="{00000000-0000-0000-0000-000000000000}" tenantNodePath="1\2"] <?xml version="1.0" encoding="utf-8"?>

 

I'm guessing I would need a regex to strip the header above and get to the data values below, although I am not sure how to proceed with that.

 

<MachineName>Some-Machine</MachineName>
<RawMACAddress>112233445566</RawMACAddress>
<IPAddress>1.1.2.3</IPAddress>
<AgentVersion>1.2.3.123</AgentVersion>
<OSName>Windows 41</OSName>
<TimeZoneBias>-10</TimeZoneBias>
<UserName>myName</UserName>
<EventID>1234</EventID>
<Severity>0</Severity>
<GMTTime>2020-00-00T06:41:02</GMTTime>
<ProductID>SomeName1999</ProductID>
<Locale>0001</Locale>
<Error>0</Error>
<Type>SomeCore</Type>
<Version>1234.0</Version>
<InitiatorID>SOMEAGENT3000</InitiatorID>
<InitiatorType>OnDemand</InitiatorType>
<SiteName>Some-Server-Name</SiteName>
<Description>N/A</Description>

 

 

Many Thanks for your help so far :)
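
On the regex idea: one possible approach, sketched here under the assumption that the payload always starts at <UpdateEvents> and ends at </UpdateEvents>, is to capture that span with extract() and pass it to parse_xml(). The literal is a shortened copy of the sample, and the column names xmltext and doc are made up for the example.

```kusto
// Shortened copy of the sample message (illustrative only)
print syslogmsg = @'05:19.0Z Some-Server-Name Events - EventFwd [agentInfo@3401 tenantId="0" bpsId="0" tenantGUID="{00000000-0000-0000-0000-000000000000}" tenantNodePath="1\2"] <?xml version="1.0" encoding="utf-8"?><UpdateEvents><MachineInfo><MachineName>Some-Machine</MachineName><IPAddress>1.1.2.3</IPAddress></MachineInfo></UpdateEvents>'
// Capture group 1 is the XML body; (?s) lets '.' match newlines in case the message is multi-line
| extend xmltext = extract(@"(?s)(<UpdateEvents>.*</UpdateEvents>)", 1, syslogmsg)
| extend doc = parse_xml(xmltext)
| project
    MachineName = tostring(doc.UpdateEvents.MachineInfo.MachineName),
    IPAddress   = tostring(doc.UpdateEvents.MachineInfo.IPAddress)
```

This drops both the syslog header and the XML declaration, so only the UpdateEvents element reaches parse_xml().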

 

best response confirmed by TS-noodlemctwoodle (Brass Contributor)
Solution

@CliveWatson Thank you very much for your help on this, you're a legend.

 

Here is the working solution based upon your suggestion:

 

 

 

 

 

print syslogmsg = '05:19.0Z Some-Server-Name Events - EventFwd [agentInfo@3401 tenantId="0" bpsId="0" tenantGUID="{00000000-0000-0000-0000-000000000000}" tenantNodePath="1\2"] <?xml version="1.0" encoding="utf-8"?><UpdateEvents><MachineInfo><AgentGUID>{00000000-0000-0000-0000-000000000000}</AgentGUID><MachineName>Some-Machine</MachineName><RawMACAddress>112233445566</RawMACAddress><IPAddress>1.1.2.3</IPAddress><AgentVersion>1.2.3.123</AgentVersion><OSName>Windows 41</OSName><TimeZoneBias>-10</TimeZoneBias><UserName>myName</UserName></MachineInfo><BrandCommonUpdater ProductName="Brand Agent" ProductVersion="1.0.0" ProductFamily="AVP"><UpdateEvent><EventID>1234</EventID><Severity>0</Severity><GMTTime>2020-00-00T06:41:02</GMTTime><ProductID>SomeName1999</ProductID><Locale>0001</Locale><Error>0</Error><Type>SomeCore</Type><Version>1234.0</Version><InitiatorID>SOMEAGENT3000</InitiatorID><InitiatorType>OnDemand</InitiatorType><SiteName>Some-Server-Name</SiteName><Description>N/A</Description></UpdateEvent></BrandCommonUpdater></UpdateEvents>'
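// Skip the syslog header: everything up to the first space after tenantNodePath
| parse syslogmsg with * " tenantNodePath" * " " xml
// Convert the remaining XML text into a dynamic object
| extend xml = parse_xml(xml)
| extend MachineName = xml.UpdateEvents.MachineInfo.MachineName
| extend IPAddress = xml.UpdateEvents.MachineInfo.IPAddress
| where isnotempty(MachineName)
| project MachineName, IPAddress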

 

 

Edit: To clean up the query, I have adjusted the solution as suggested by @CliveWatson and Ofer.
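
For anyone applying this to the live table rather than a print statement, something along these lines should be close. It is a sketch, not a tested query: it assumes every matching SyslogMessage has the same header layout as the sample and that each message carries a single UpdateEvent element (with several, that node would likely come back as an array). The names info and evt are just helper columns for the example.

```kusto
Syslog
| where Computer contains "Some-Server-Name"
// Same header-skipping pattern as the print example, applied to the real column
| parse SyslogMessage with * " tenantNodePath" * " " xml
| extend xml = parse_xml(xml)
// Helper columns (hypothetical names) to shorten the paths below
| extend info = xml.UpdateEvents.MachineInfo,
         evt  = xml.UpdateEvents.BrandCommonUpdater.UpdateEvent
| where isnotempty(info)
| project
    TimeGenerated,
    MachineName = tostring(info.MachineName),
    IPAddress   = tostring(info.IPAddress),
    UserName    = tostring(info.UserName),
    EventID     = toint(evt.EventID),
    Severity    = toint(evt.Severity),
    Description = tostring(evt.Description)
```

Casting with tostring() and toint() gives typed columns instead of dynamic values, which tends to make later filtering and joins simpler.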

 

@TS-noodlemctwoodle 

 

Glad to help, and thanks also to Ofer for the cool use of parse in the example.
