Forum Discussion
Remove duplicates from query
- Dec 01, 2017So does arg_max() does the job? From my understanding is the thing that will work for you.
Hi There is some basics that you need to understand about Log Analytics. Data in Log Analytics is stored with different time stamp (TimeGenerated column). So basically when a solution sends new data it does not delete the old one it just writes a new record with new date. That way you can view historically the data. With that said for example if you want to get the latest record for something there is function for that arg_max(). You can read about it here:
https://docs.loganalytics.io/docs/Language-Reference/Aggregation-functions/arg_max()
I am not sure what you want to achieve but may be it is something like this:
AzureActivity | where OperationName == 'Delete website' and ActivityStatus == 'Succeeded' and ResourceProvider == 'Azure Web Sites' | summarize arg_max(TimeGenerated, *) by CorrelationId
You might need to aggregated on another column or more than one column depending on your scenario.
Thanks Stan, sorry for the bad description.
So my original query to get information about deleted App Services returns multiple events for the deletion of the same App. Though these have the same CorrelationId. Therefore I was trying with my limited knowledge of the query language to only return 1 record that has the same CorrelationId. Though still get all the columns the event contains.
I'm trying to use this to drive a webhook and trigger a runbook to do some house cleaning on both creation and deletion of App Services.
My thinking is that the best way to drive the runbook was to use the query language to filter out all the data that the runbook does not need to make its logic work.
Hopefully this makes more sense?
- Morten LerudjordetDec 01, 2017Copper Contributor
If one just ignores time at the moment this almost gets me there but it will only retain CorrelationId on the result output, but I need to get all the columns (or use project to choose the ones I need)
AzureActivity | where OperationName == 'Delete website' and ActivityStatus == 'Succeeded' and ResourceProvider == 'Azure Web Sites' | summarize by CorrelationId
- Morten LerudjordetDec 01, 2017Copper Contributor
Think I maybe figured it out?
Something like this seems to work
AzureActivity | where OperationName == 'Delete website' and ActivityStatus == 'Succeeded' and ResourceProvider == 'Azure Web Sites' | project ResourceId, CorrelationId, TimeGenerated | summarize arg_max(TimeGenerated, *) by CorrelationId
- Dec 01, 2017Yes. Forgot the '*' in arg_max() sorry. It is best practice to project at the end after summarize
- Dec 01, 2017
Hmm the query you've posted last is different from the one I've posted. It does not contain arg_max(). arg_max() retains all columns. If you do not want to use arg_max() when you get two results what is the difference between them? Which one of the two you want to be displayed only?
- Dec 01, 2017So does arg_max() does the job? From my understanding is the thing that will work for you.