Syslog parsing issue - extra comma

I've been trying to correctly parse a log using Syslog, but I am stuck. One of the fields sometimes contains a comma as part of a message, and this breaks my split() because the comma is the delimiter.

 

My query:

Syslog
| where SyslogMessage contains ",system,"
| extend msgArr=split(SyslogMessage, ",")
| project TimeGenerated,
Description=msgArr[13], // --this field sometimes has a comma in it, which breaks the array (shifts columns)
Action_Flags=msgArr[15],
msgArr

Here's an example of the SyslogMessage column before the split:

 

09:23,0009C103068,SYSTEM,url-filtering,0,2018/08/08 15:09:20,,url-cloud-connection-failure,,0,0,general,medium,"Cloud is not ready, There was no update from the cloud in the last 210470 minutes.",3955511,0x8000000000000000,0,0,0,0,,PA-5050

You can see that one column is wrapped in quotes: this is the "Description" column from the device. The string contains a comma (after "ready"), and split() treats that comma as a delimiter.


Here's an example of the "msgArr" column AFTER the split:

 

["09:23","0009C103068","SYSTEM","url-filtering","0","2018/08/08 15:09:20","","url-cloud-connection-failure","","0","0","general","medium","\"Cloud is not ready"," There was no update from the cloud in the last 210470 minutes.\"","3955511","0x8000000000000000","0","0","0","0","","PA-5050"]

You'll notice that the split() output shows a '\' before the string that has the extra comma in it, as well as one after:

 

\"Cloud is not ready"," There was no update from the cloud in the last 210470 minutes.\"

So the question is, how in the heck do I parse out a field where there is an extra comma? Is split() even the right method? Anybody?

 

See original post here: https://techcommunity.microsoft.com/t5/Azure-Log-Analytics/Parsing-comma-separated-values/m-p/218426...

1 Reply

Hi,

I think the best you can do is to remove the comma inside the quoted description before using split().

Syslog
| where SyslogMessage contains ",system,"
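// Replace only the first comma inside a quoted field; a plain replace of ',' would remove every delimiter.
// Assumes one quoted field per message.
| extend SyslogMessage = replace(@'"([^",]*),([^"]*)"', @'"\1 \2"', SyslogMessage)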
| extend msgArr=split(SyslogMessage, ",")
| project TimeGenerated, Description=msgArr[13], Action_Flags=msgArr[15], msgArr
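
If you would rather keep the comma, parse_csv() may be a better fit than split(), since it is meant to treat a quoted value as a single field. Here is a minimal sketch, assuming parse_csv() is available in your workspace. The datatable only stands in for the Syslog table so the query can run on its own, and the TimeGenerated value is made up for the example.

```kusto
// Sample row copied from the question; swap the datatable for the real Syslog table.
// The TimeGenerated value below is an assumption for illustration only.
datatable(TimeGenerated: datetime, SyslogMessage: string)
[
    datetime(2018-08-08 15:09:20), '09:23,0009C103068,SYSTEM,url-filtering,0,2018/08/08 15:09:20,,url-cloud-connection-failure,,0,0,general,medium,"Cloud is not ready, There was no update from the cloud in the last 210470 minutes.",3955511,0x8000000000000000,0,0,0,0,,PA-5050'
]
| where SyslogMessage contains ",system,"
// Quote-aware split: the quoted description should stay in one array element.
| extend msgArr = parse_csv(SyslogMessage)
| project TimeGenerated,
          Description = tostring(msgArr[13]),
          Action_Flags = tostring(msgArr[15]),
          msgArr
```

With this approach the indexes 13 and 15 from the original query should keep pointing at the same columns whether or not the description contains a comma.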

Let me know if this helps!

 

You can also replace the comma with a placeholder character that never appears in the data, run split(), and then use replace() again to restore the comma if it matters for the end result.
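
The placeholder idea could look roughly like the sketch below. It assumes there is only one quoted field per message with a single comma inside it, and that the token <COMMA> (an arbitrary choice for this example) never occurs in the data. As above, the datatable and its TimeGenerated value are only there to make the query self-contained.

```kusto
// Sample row copied from the question; swap the datatable for the real Syslog table.
// "<COMMA>" is a hypothetical placeholder token, assumed never to occur in the data.
datatable(TimeGenerated: datetime, SyslogMessage: string)
[
    datetime(2018-08-08 15:09:20), '09:23,0009C103068,SYSTEM,url-filtering,0,2018/08/08 15:09:20,,url-cloud-connection-failure,,0,0,general,medium,"Cloud is not ready, There was no update from the cloud in the last 210470 minutes.",3955511,0x8000000000000000,0,0,0,0,,PA-5050'
]
| where SyslogMessage contains ",system,"
// Mask the first comma inside the quoted field (assumes one quoted field per message).
| extend Masked = replace(@'"([^",]*),([^"]*)"', @'"\1<COMMA>\2"', SyslogMessage)
| extend msgArr = split(Masked, ",")
// Put the comma back in the one field that needs it.
| extend Description = replace(@'<COMMA>', @',', tostring(msgArr[13]))
| project TimeGenerated, Description, Action_Flags = tostring(msgArr[15])
```

Note that split() keeps the surrounding double quotes in Description, so you may want to trim them afterwards. A description with more than one comma would need a different pattern.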