Table count from custom log

Copper Contributor

Hello,

 

I would like to get the count of table name reference in my custom log which is pushed to the Log analytics. The custom log structure is given below.

 

020-08-31 16:15:38 ProxyEngine [INFO] api invoked by user abcd with db user as default api user
2020-08-31 16:15:38 JdbcTemplateService [DEBUG] getting basic api user userapi
2020-08-31 16:15:38  SqlTrigger [INFO] sql fired is select * from table_1
2020-08-31 16:15:39  SqlTrigger [INFO] total records found 301
2020-08-31 16:15:39  SqlTrigger [INFO] done fetching of data

2020-08-31 16:15:50 ProxyEngine [INFO] api invoked by user xyz with db user as default api user
2020-08-31 16:15:50 JdbcTemplateService [DEBUG] getting basic api user userapi
2020-08-31 16:15:50  SqlTrigger [INFO] sql fired is select * from table_2
2020-08-31 16:15:51  SqlTrigger [INFO] total records found 305
2020-08-31 16:15:51  SqlTrigger [INFO] done fetching of data

2020-08-31 16:16:02 ProxyEngine [INFO] api invoked by user abcd with db user as default api user
2020-08-31 16:16:02 JdbcTemplateService [DEBUG] getting basic api user userapi
2020-08-31 16:16:02  SqlTrigger [INFO] sql fired is select * from table_1
2020-08-31 16:16:34  SqlTrigger [INFO] total records found 301
2020-08-31 16:16:34  SqlTrigger [INFO] done fetching of data

 

The expected output 

 

Table namecount
table_12
table_21

 

Can someone provide the KQL query to get the above output? thanks in advance.

 

 

23 Replies

Hi @alchem_rj,

You can use "parse-where" like I did here. The short syntax is 

...| parse-where LogText with * "from " table_name
| summarize count() by table_name​

where LogText is the name of the field you analyze.

The command is documented here.

Thanks a lot, @Noa Kuperberg for the suggestion

Hi @Noa Kuperberg ,

 

One more help is it possible to filter count based on the month using 

 
the bin(TimeGenerated, 30day) will help?

 

TablenameJuly countAugust Count
table_132
table_2210

 

 

@alchem_rj yes, but it will be displayed a bit differently. Easiest is to just add a month column - Run this to play with it (I'm assuming the datetime information is ingested to the TimeGenerated field, which is typically what happens). The query text is:

...
| parse-where LogText with * "from " table_name
| extend month=getmonth(TimeGenerated)
| summarize count() by month, table_name

and in ithis case the results would be:

month table_name count_
8 table_1 2
8 table_2 1

To create a column for each month, append "| evaluate pivot(month)" to the query and you'll get:

table_name count_ 8 7
table_1 2 1 ...
table_2 1 1 ...

 Of course with more data you'll have additional columns for the other months.

Thanks @Noa Kuperberg ,

 

Yes,  the datetime information is ingested to the TimeGenerated field,

I tried with "evaluate pivot(month)" function but am getting some random value like "1" and "0" as output.

 

 

| parse RawData with * "from (" from "from " Tablename "where" clause
| extend month=getmonth(TimeGenerated)
| summarize Count=count()by month, Tablename 
| evaluate pivot(month)
| order by Count desc

 

 

Also, the link you provided to play with am getting below error.

 

Query could not be parsed at ',' on line [1,21] Token: , Line: 1 Position: 21

 

Thanks

Hi @alchem_rj ,

 

Try to run it without summarizing before pivot:

 

someCustomLogs 
| parse-where LogText with * "from " TableName
| extend month=getmonth(TimeGenerated)
| project TableName, month
| evaluate pivot(month)

 

The reason that you've got the error in the link is that the 'let' statement which defines the data table somehow wasn't complete, try it again: Go to Log Analytics and run query

 

Rafi

 

 

Hi @Rafi_Rabo 

 

Thanks a lot for your help.

 

The query you have provided is working fine. But totalcount and Ordering is not working. Is it possible to include Totalcount and Descending order in the query?

 

Thanks

@alchem_rj 

 

Here is a straight-forward way of doing it:

1. Sort columns with project-reorder operator (had to manipulate the months to be from type string in order to enable sorting properly).

project-reorder documentation: https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/projectreorderoperator

2. Use join to combine the results from total aggregation, and the part where you split by month (I believe that there is a simpler whey to implement this)

join documentation: https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/joinoperator?pivots=azuredataexplor...

 

let TotalCount = someCustomLogs 
| parse-where LogText with * "from " table_name
| summarize total = count() by table_name;
someCustomLogs 
| parse-where LogText with * "from " table_name
| extend month=(getmonth(TimeGenerated))
| project table_name, month_str = iif(month<10, strcat('0',tostring(month)), tostring(month)) // Manipulating month into string to sort it with project-reorder
| evaluate pivot(month_str)
| join (TotalCount) on table_name // Adding total count column - combine tables according to table_name
| project-away table_name1
| project-reorder table_name, total, * asc // Sort columns
| order by table_name asc // Sort rows

 

Rafi 

Hi @Rafi_Rabo ,

 

I tried the given query for 1 month (8th month) and it worked. But when I added a log snippet for 7th month the output is wrong.

 

 

Screen Shot 2020-09-08 at 8.53.04 AM.jpg

I added logs for the 6th month as well but now showing the below error.

 

Log analytics 

 

Thanks

Hi @alchem_rj,

 

Try it here (I added some months): Go to Log Analytics and run query

 
 

 

 

I'm not sure why the results when you added log for the 7th month are wrong, I need more context here (maybe add a link to the query and explain why the result is wrong?).

 

Please attach the link to the error again (the current link does not show any error), or just write it here.

 

Rafi

 

Thanks @Rafi_Rabo!

The query itself is valid.

Is it possible that the inegested data has a different format that what you (customer) used on your original question? TimeGenerated should be in UTC. A different formatting may cause "getmonth" to fail or return a wrong value. 

Hi @Rafi_Rabo ,

 

Thanks a lot, it worked fine.

 

We are trying to extend this query to include a rawdata that contains the string "from" two times?

An example SQL log snippet is given below.

 

2020-08-31 16:15:38  SqlTrigger [INFO] sql fired is select * from value(sum 5s) as from table_1
2020-08-31 16:15:38  SqlTrigger [INFO] sql fired is select * from value(sum 5s) as from table_1
2020-08-31 16:15:38  SqlTrigger [INFO] sql fired is select * from value(sum 5s) as from table_1
2020-08-31 16:15:50  SqlTrigger [INFO] sql fired is select * from table_2

 

Expected output

table_nametotal_count08
table_133
table_211

 

 

Thank

Rinshad

@alchem_rj 

 

You can use the extract() function to identify the desired pattern ("table_" + <number> in this case), and just filter out the rows without match.

Link to documentation: https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/extractfunction

 

let TotalCount = someCustomLogs
| extend table_name = (extract ("table_([0-9]+)", 0, LogText))
| where isnotempty(table_name)
| summarize total = count() by table_name;
someCustomLogs 
| extend table_name = (extract ("table_([0-9]+)", 0, LogText)), month=(getmonth(TimeGenerated))
| where isnotempty(table_name)
| project table_name, month_str = iif(month<10, strcat('0',tostring(month)), tostring(month)) // Manipulating month into string to sort it with project-reorder
| evaluate pivot(month_str)
| join (TotalCount) on table_name // Adding total count column - combine tables according to table_name
| project-away table_name1
| project-reorder table_name, total, * asc // Sort columns
| order by table_name asc // Sort rows

 

Rafi

Hi @Rafi_Rabo,

 

How we can do a string regression. For example, if the table name is as follows then how we can include it in the query.

 

1. abc_3val_xyz_dd

2. txt_1data_abcd

 

Thanks

Hi @alchem_rj ,

 

I can think of several options:

1. Look for more patterns in the the extract() function (You may use more complex RegExp to match more tables). Note that this is not scalable solution since you'll always need to add more patterns/ table names to the expression.

| extend table_name = (extract ("(table_[0-9]+)|(abc_3val_xyz_dd)|(txt_1data_abcd)", 0, LogText)), month=(getmonth(TimeGenerated))

2. Add a 'TableName' column to your custom logs, with the table name in the relevant rows. Easily you'll be able to filter out logs which are not related to tables, and summarize according to 'TableName'.

3. Write the message in your logs with a unique pattern for table, for example

"....... from .... is select * from table table_2"
In such case you could parse according to the unique pattern "from table ".
 
Hope that it answers your question,
Rafi
 

Hi @Rafi_Rabo,

 

 

Do we have RegExp similar to below?

 

1. abc_[0-9]val_* (all string or word )after this?

Expected table names

abc_1val_dd

abc_3val_xx

abc_2val_extra_cc

 

Thanks

 

Hi @alchem_rj ,

 

The following should work:  abc_[0-9]+val_.+

 

Here is a link to the regular expression syntax supported by Kusto:   https://github.com/google/re2/wiki/Syntax 

 

Rafi

Hi @Rafi_Rabo ,

 

When I use the Regex you have provided I am getting all the value in the RawData along with the table name. I am not able to filter the exact table name from the RawData.

Hi @alchem_rj ,

 

Try running the following: Go to Log Analytics and run query

 

I tweaked the regexp to: 

abc_[0-9]+val_[0-9A-Za-z_]+

 

If you still have issues, please share the data and query you are trying to run.

 

Rafi