split and regex in Kusco

Occasional Contributor

Hi all,


I have a query in Kusto to return Details from Table which returns multiple rows of sentence text:

| project Details


Starting cycle 20349
Starting scheduling for cycle 20350


But I want to split the sentences by spaces and remove the numbers (so I can do aggregation on keywords)

The split example in the help is on string literals so I can do this:


| take 10
| project split(Details, ' ')

but I then get an array of values in each row as output:


Row 1

Row n...

How can I split multiple lines and get a row for each word in Kusto syntax?



6 Replies

here's an example for using "split()" (depending on your actual use case, operators/functions linked-to below may be relevant as well):


split() function: 

parse operator:

reduce operator:

replace() function: (for removing numeric values you mentioned was part of your goal)


MyTable // which has a column named "Details"
| extend Details = split(MyColumn, " ") // This assumes the "words" are separated by a single space

Hi @Yoni 


Thanks for your reply, I was adding more text to the post when I got your reply.


I used a similar format but that gives rows containing an array of the split words. I need a row for each word or the syntax to unpack the arrays so that each member is a row.



you should look into using the mv-expand operator:



Cool, mvexpand does the trick. 


How can I perform further operations in the query on the expanded Keywords:


Table // which has a column named "Details"
| project KeyWords = split(Details, " ")
| mv-expand KeyWords


.... some way of referencing the Keywords output here


I tried some permutations but now the output from mv-expand is dynamic which fails to accept additional operations.


i'm not sure i understand the question - so i usually find an example to be helpful - feel free to "manipulate" this to the form which demonstrates your intention:


let Table = datatable(Details:string)
    "123 456 789 abc def",
    "The quick brown fox jumps over the lazy dog"
| project KeyWords = split(Details, " ")
| mv-expand KeyWords to typeof(string)
| where KeyWords matches regex @"^\d+$"

-> this will return:

best response confirmed by marked-data (Occasional Contributor)



Hey Yoni,


Thanks for the example. I did get what I needed by building off your example: This query takes the log lines, splits into words, screens out numbers. and summarizes the frequency of the word occurrence.


Table // which has a column named "Details"| project KeyWords = split(Details, " ")
| mv-expand KeyWords 
| where KeyWords matches regex @"^[a-zA-Z]+$" 
| summarize count() by tostring(KeyWords)