Microsoft Tech Community

Parsing syslog


1. I am ingesting firewall logs as syslog and trying to parse out the fields using the split command. My problem is that the beginning of each log is not pipe-delimited, so I have had to do the split in two passes.

As you can see in the attached picture, the FWD|UDP|p4| fields are not parsed out.

This is the _raw syslog message:

Security F180 Block: FWD|UDP|p4|192.168.x,x|67|00:15:5d:0f:c4:01|255.255.255.255|68|bootpc||LAN-2-INTERNET|4017|0.0.0.0|0.0.0.0|0|1|

2. Can you show me the same thing using normal regex? I can't see in the Microsoft docs how to do it the old way :)

3. Should I do the parsing at search time, inside the query? Doesn't that increase the search time?

 

5 Replies

@omrip  

Options like:

print SyslogMessage = "Security F180 Block: FWD|UDP|p4|192.168.x,x|67|00:15:5d:0f:c4:01|255.255.255.255|68|bootpc||LAN-2-INTERNET|4017|0.0.0.0|0.0.0.0|0|1|"
| project SyslogMessage 
// Device name: the token between "Security" and the action keyword
| extend device       = extract(@"Security (\S+) \w+:", 1, SyslogMessage) 
// Action: the word immediately before the first colon
| extend deviceaction = extract(@"Security \S+ (\w+):", 1, SyslogMessage) 

 

or

print SyslogMessage = "Security F180 Block: FWD|UDP|p4|192.168.x,x|67|00:15:5d:0f:c4:01|255.255.255.255|68|bootpc||LAN-2-INTERNET|4017|0.0.0.0|0.0.0.0|0|1|"
| extend p = split(SyslogMessage, "|") 
// p[0] is the unpiped header "Security F180 Block: FWD", so split it again on spaces
| extend pos1 = split(p[0], " ")
| extend FWactivity   = trim(@"[^\w]+", tostring(pos1[0]))
| extend Device       = trim(@"[^\w]+", tostring(pos1[1]))
| extend DeviceAction = trim(@"[^\w]+", tostring(pos1[2]))  // trim drops the trailing ":"
| extend Direction    = trim(@"[^\w]+", tostring(pos1[3]))  // "FWD"; the column name is a guess
| extend Protocol     = trim(@"[^\w]+", tostring(p[1]))
| extend srcMAC       = trim(@"[^\w]+", tostring(p[5]))
| extend DestPort     = trim(@"[^\w]+", tostring(p[7]))     // p[4] (67) appears to be the source port
// etc...
| project-away SyslogMessage, p, pos1

 

KQL is good at doing parsing like this at execution time.
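
For the unpiped prefix, a single split may be enough if the header is stripped first. The following is a minimal sketch, assuming every message has the shape "<header>: <pipe-delimited fields>"; the column names Fields, Direction, Protocol and Interface are made up for illustration and are not taken from any vendor documentation.

```kql
print SyslogMessage = "Security F180 Block: FWD|UDP|p4|192.168.x,x|67|00:15:5d:0f:c4:01|255.255.255.255|68|bootpc||LAN-2-INTERNET|4017|0.0.0.0|0.0.0.0|0|1|"
// Keep only the text after the first "colon + whitespace", then split it on pipes.
// The colons inside the MAC address are not followed by whitespace, so they do not match.
| extend Fields = split(extract(@":\s(.*)$", 1, SyslogMessage), "|")
| extend Direction = tostring(Fields[0]),   // hypothetical name for "FWD"
         Protocol  = tostring(Fields[1]),   // "UDP"
         Interface = tostring(Fields[2])    // hypothetical name for "p4"
| project-away Fields
```

With the header removed, the first three piped values land at indexes 0 to 2 instead of being glued to the header text.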

@CliveWatson that is very helpful, thanks.

When I ingest the logs through the Syslog connector instead of the CEF connector, I am very limited by the small number of fields in the Syslog table compared with the CEF table.

How can I overcome this?

 

@omrip 

 

CommonSecurityLog
| getschema 
| summarize ColumnCount = count() 

Syslog
| getschema 
| summarize ColumnCount = count() 

 

I make the difference 152 vs 15 columns of data. Are there certain columns you are missing in Syslog? Or is the data you require present in the Syslog message but in need of extracting / parsing, which I believe is one of the things CEF does for you? BTW, I'm no expert on CEF or Syslog, but I'm keen to understand your use case.

 

Thanks 
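
If the data is in the message and only needs parsing, one possible workaround is to wrap the parsing in a reusable function so the extra columns appear at query time. This is only a sketch: ParseFirewall and its output column names are hypothetical, and the datatable stands in for the real Syslog table.

```kql
// Hypothetical helper: adds parsed columns to any table that has a SyslogMessage column.
let ParseFirewall = (T:(SyslogMessage:string)) {
    T
    | extend Fields = split(extract(@":\s(.*)$", 1, SyslogMessage), "|")
    | extend Direction = tostring(Fields[0]),
             Protocol  = tostring(Fields[1]),
             SrcIp     = tostring(Fields[3])
    | project-away Fields
};
// Stand-in for the Syslog table, using the sample message from this thread.
datatable(SyslogMessage:string)
[
    "Security F180 Block: FWD|UDP|p4|192.168.x,x|67|00:15:5d:0f:c4:01|255.255.255.255|68|bootpc||LAN-2-INTERNET|4017|0.0.0.0|0.0.0.0|0|1|"
]
| invoke ParseFirewall()
```

Against real data the last part would presumably become Syslog | invoke ParseFirewall().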

 

 

See the parse operator examples; with parse you can extend the table and create new columns.

 

https://docs.microsoft.com/en-us/azure/kusto/query/parseoperator
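
Applied to the message from this thread, the parse operator might look like the sketch below. The column names and the meaning assigned to each position are assumptions inferred from the sample line, not from the firewall vendor's documentation.

```kql
print SyslogMessage = "Security F180 Block: FWD|UDP|p4|192.168.x,x|67|00:15:5d:0f:c4:01|255.255.255.255|68|bootpc||LAN-2-INTERNET|4017|0.0.0.0|0.0.0.0|0|1|"
// Each quoted string is a literal delimiter; each bare name becomes a new column.
// The trailing * discards whatever remains of the message.
| parse SyslogMessage with "Security " Device " " DeviceAction ": " Direction "|" Protocol "|" SrcInterface "|" SrcIp "|" SrcPort:int "|" SrcMac "|" DstIp "|" DstPort:int "|" Service "|" *
| project-away SyslogMessage
```

This handles the unpiped header and the piped fields in one statement, and the :int suffix is intended to cast the port columns while parsing.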

 

 

 

@omrip 

You could use the extract method. Here is an example:

// Split each Meraki message into epoch, device name, log type and the remaining text
let LogHeader = meraki_CL
| extend Parser = extract_all(@"(\d+\.\d+)\s([\w\-\_]+)\s([\w\-\_]+)\s([\S\s]+)$", dynamic([1,2,3,4]), Message)
| mv-expand Parser
| extend Epoch      = tostring(Parser[0]),
         DeviceName = tostring(Parser[1]),
         LogType    = tostring(Parser[2]),
         Substring  = tostring(Parser[3])
// Keep the whole-second part of the epoch and convert it to a datetime
| extend EpochTimestamp = split(Epoch, ".")
| extend EventTimestamp = unixtime_seconds_todatetime(tolong(EpochTimestamp[0]))
| project-away EpochTimestamp, Parser, Message;
// Pull the URL-specific fields out of the remaining text
let UrlEvents = LogHeader
| where LogType == "urls"
| extend SrcIpAddr         = extract(@"src=([0-9\.]+)\:", 1, Substring),
         SrcPortNumber     = toint(extract(@"src=([0-9\.]+)\:(\d+)\s", 2, Substring)),
         DstIpAddr         = extract(@"dst=([0-9\.]+)\:", 1, Substring),
         DstPortNumber     = toint(extract(@"dst=([0-9\.]+)\:(\d+)\s", 2, Substring)),
         HttpRequestMethod = extract(@"request: (\w+)\s", 1, Substring),
         Url               = extract(@"request: (\w+)\s(\S+)", 2, Substring)
| project-away Substring;
// A query needs a final tabular expression after the let statements
UrlEvents