Mar 18 2019 10:17 AM - edited Mar 18 2019 10:47 AM
Hi all,
I have a query in Kusto to return Details from Table which returns multiple rows of sentence text:
Table
| project Details
Output:
Starting cycle 20349
Starting scheduling for cycle 20350
But I want to split the sentences by spaces and remove the numbers (so I can do aggregation on keywords)
The split example in the help is on string literals so I can do this:
Table | take 10 | project split(Details, ' ')
but I then get an array of values in each row as output:
Row 1
[
"Starting",
"cycle",
"20349"
]
Row n...
[
"Starting",
"scheduling",
"for",
"cycle",
"20350"
]
How can I split multiple lines and get a row for each word in Kusto syntax?
Thanks!
Mar 18 2019 10:24 AM
here's an example for using "split()" (depending on your actual use case, operators/functions linked-to below may be relevant as well):
split() function: https://docs.microsoft.com/en-us/azure/kusto/query/splitfunction
parse operator: https://docs.microsoft.com/en-us/azure/kusto/query/parseoperator
reduce operator: https://docs.microsoft.com/en-us/azure/kusto/query/reduceoperator
replace() function: https://docs.microsoft.com/en-us/azure/kusto/query/replacefunction (for removing numeric values you mentioned was part of your goal)
MyTable // which has a column named "Details" | extend Details = split(MyColumn, " ") // This assumes the "words" are separated by a single space
Mar 18 2019 10:51 AM
Hi @Yoni
Thanks for your reply, I was adding more text to the post when I got your reply.
I used a similar format but that gives rows containing an array of the split words. I need a row for each word or the syntax to unpack the arrays so that each member is a row.
Thanks
Mar 18 2019 10:52 AM
you should look into using the mv-expand operator: https://docs.microsoft.com/en-us/azure/kusto/query/mvexpandoperator
Mar 19 2019 11:53 AM
Cool, mvexpand does the trick.
How can I perform further operations in the query on the expanded Keywords:
Table // which has a column named "Details"
| project KeyWords = split(Details, " ")
| mv-expand KeyWords
.... some way of referencing the Keywords output here
I tried some permutations but now the output from mv-expand is dynamic which fails to accept additional operations.
Mar 19 2019 01:17 PM
i'm not sure i understand the question - so i usually find an example to be helpful - feel free to "manipulate" this to the form which demonstrates your intention:
let Table = datatable(Details:string) [ "123 456 789 abc def", "The quick brown fox jumps over the lazy dog" ] ; Table | project KeyWords = split(Details, " ") | mv-expand KeyWords to typeof(string) | where KeyWords matches regex @"^\d+$"
-> this will return:
KeyWords
--------
123
456
789
Mar 20 2019 07:26 AM
Solution
Hey Yoni,
Thanks for the example. I did get what I needed by building off your example: This query takes the log lines, splits into words, screens out numbers. and summarizes the frequency of the word occurrence.
Table // which has a column named "Details"| project KeyWords = split(Details, " ") | mv-expand KeyWords | where KeyWords matches regex @"^[a-zA-Z]+$" | summarize count() by tostring(KeyWords)