SOLVED

Loop through the KQL query result


Hi,

I need to trigger an alert when a Windows service is stopped on my nodes.

There are two nodes, and the service runs either on both nodes or on only one of them.

The alert should be triggered only if the service is not running on either node.

I'm using the query below, but it isn't right: because it fetches the latest record per node, the alert is triggered as soon as the service is stopped on just one of the nodes.

 

let status =
Event
| where TimeGenerated > ago (1d)
| where EventLog == 'System' and EventID == 7036 and Source == 'Service Control Manager' and RenderedDescription has "Apache tomcat"
| parse kind=relaxed EventData with * '<Data Name="param1">' Windows_Service_Name '</Data><Data Name="param2">' Windows_Service_State '</Data>' *
| summarize (TimeGenerated, winstatus) = arg_max(TimeGenerated, Windows_Service_State) by Windows_Service_Name, Computer;
status
| where winstatus != 'running'
| project winstatus, Windows_Service_Name, Computer, TimeGenerated

 

The above query works well if there's only one VM, but it won't work for multiple VMs.

I tried counting the results where the service is stopped and triggering the alert when the count is 2 (stopped on both VMs). But sometimes the Event log contains only one result, because a node logs nothing if its service state did not change within the time frame used in the query, so this method will not work either.

Sample result for:

Event
| where TimeGenerated > ago (1d)
| where EventLog == 'System' and EventID == 7036 and Source == 'Service Control Manager' and RenderedDescription has "Apache tomcat"
| parse kind=relaxed EventData with * '<Data Name="param1">' Windows_Service_Name '</Data><Data Name="param2">' Windows_Service_State '</Data>' *
| summarize (TimeGenerated, winstatus) = arg_max(TimeGenerated, Windows_Service_State) by Windows_Service_Name, Computer;

 

TimeGenerated | Windows_Service_Name | Computer | winstatus
6/28/2021, 2:01:55.930 AM | Apache Tomcat 8.5.58 | apacheNode1 | running
6/28/2021, 1:02:54.257 AM | Apache Tomcat 8.5.58 | apacheNode2 | running

 

How can I loop over the result, or otherwise check whether every returned row has winstatus != 'running'?

 

Regards,

Racheal

 

 

6 Replies
Maybe add this as the last line:

| summarize anyif(winstatus !="stopped", true)

@CliveWatson Thanks

 

This command is not clear to me, because when I used

| summarize anyif(winstatus != "stopped", true)

it returned false. As I understand the query, it should return true if the status is not "stopped" on any of the VMs, and false otherwise. It returns false because the service is stopped on one of the VMs.

I also checked

| summarize anyif(winstatus != "running", true)

which returned true. As I understand the query, it should return true if the status is not "running" on any of the VMs, and false otherwise. It returns true even though the service is running on one of the VMs.

Here's the VM service status:

6/28/2021, 10:00:08.173 AM stopped apacheNode1
6/28/2021, 10:07:53.470 AM running apacheNode2

Modified query

let status =
Event
| where TimeGenerated > ago (1d)
| where EventLog == 'System' and EventID == 7036 and Source == 'Service Control Manager' and RenderedDescription has "Apache"
| parse kind=relaxed EventData with * '<Data Name="param1">' Windows_Service_Name '</Data><Data Name="param2">' Windows_Service_State '</Data>' *
| summarize (TimeGenerated, winstatus) = arg_max(TimeGenerated, Windows_Service_State) by Windows_Service_Name, Computer
| summarize status= anyif(winstatus != "stopped", true);
status
| where status == 'false'
| project status

Hi,

I noticed that the same query sometimes returns true and sometimes returns false.
I think it returns the status from the last record in the result set.
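
A note on the behavior described above: `anyif` is the older name for `take_anyif(expr, predicate)`, which evaluates `expr` on one arbitrarily chosen record that satisfies `predicate`. With `true` as the predicate every record qualifies, so the result depends on whichever record happens to be picked, which would explain the changing answer. A minimal sketch of a deterministic alternative, assuming the latest state per computer has already been computed (the `latest` table and its rows are hypothetical sample data, not taken from a real workspace):

```kql
// Hypothetical sample data standing in for the latest state per computer.
let latest = datatable(Computer:string, winstatus:string)
[
    "apacheNode1", "stopped",
    "apacheNode2", "running"
];
latest
| summarize
    Running = countif(winstatus == "running"),    // nodes whose latest state is running
    NotRunning = countif(winstatus != "running")  // nodes whose latest state is anything else
| extend AllStopped = Running == 0                // true only when no node is running
```

Because `countif` looks at every row rather than one arbitrary row, the result does not change between runs over the same data.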
best response confirmed by Racheal2k (Copper Contributor)
Solution

@Racheal2k I think you tried this before? 

let status =
Event
| where TimeGenerated > ago (1d)
| where EventLog == 'System' and EventID == 7036 and Source == 'Service Control Manager'  and RenderedDescription has 'WMI Performance Adapter' //"Apache tomcat"
| parse kind=relaxed EventData with * '<Data Name="param1">' Windows_Service_Name '</Data><Data Name="param2">' Windows_Service_State '</Data>' *
| summarize count(), (TimeGenerated, winstatus) = arg_max(TimeGenerated, Windows_Service_State) by Windows_Service_Name, Computer;
status
| extend winstatus = iif(winstatus == 'running',1,0)
| summarize sumif(winstatus, winstatus > 0), ComputersOK = make_set_if(Computer, winstatus > 0), ComputerNotOk = make_set_if(Computer, winstatus == 0)
| extend ServiceStatus = iif(sumif_winstatus > 0, "The service is running", "The service is not running")
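
To try the aggregation step without waiting for real events, sample rows can stand in for the `status` table. This is only a sketch: the rows are hypothetical, and the column names `isRunning` and `RunningNodes` are introduced here for readability and do not appear in the original query.

```kql
// Hypothetical rows replacing the real `status` table (latest state per computer).
let status = datatable(Windows_Service_Name:string, Computer:string, winstatus:string)
[
    "Apache Tomcat 8.5.58", "apacheNode1", "stopped",
    "Apache Tomcat 8.5.58", "apacheNode2", "running"
];
status
| extend isRunning = iif(winstatus == "running", 1, 0)   // 1 = running, 0 = anything else
| summarize RunningNodes = sum(isRunning),               // number of nodes currently running
            ComputersOK = make_set_if(Computer, isRunning == 1),
            ComputerNotOk = make_set_if(Computer, isRunning == 0)
| extend ServiceStatus = iif(RunningNodes > 0, "The service is running", "The service is not running")
```

Changing the second row to "stopped" should flip `ServiceStatus`, which is the only case in which an alert is wanted.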

 

 

@CliveWatson, thanks, that worked.

I had got as far as

status
| extend winstatus = iif(winstatus == 'running',1,0)

but hadn't tried the sumif command :)

Great work! Thanks again.

 

Regards,

Racheal

@CliveWatson,

I'm using the query below to trigger the alert.

 

let status =
Event
| where TimeGenerated > ago(30d)
| where EventLog == 'System' and EventID == 7036 and Source == 'Service Control Manager' and RenderedDescription has "PowerCurve - Job Server"
| parse kind=relaxed EventData with * '<Data Name="param1">' Windows_Service_Name '</Data><Data Name="param2">' Windows_Service_State '</Data>' *
| summarize (TimeGenerated, winstatus) = arg_max(TimeGenerated, Windows_Service_State) by Windows_Service_Name, Computer;
status
| extend winstatus = iif(winstatus == 'running', 1, 0)
| summarize sumif(winstatus, winstatus > 0), ComputersOK = make_set_if(Computer, winstatus > 0), ComputerNotOk = make_set_if(Computer, winstatus == 0)
| extend ServiceStatus = iif(sumif_winstatus > 0, "The service is running"," The Service is not running")
| where sumif_winstatus == 0
| project sumif_winstatus, ComputerNotOk, ComputersOK

 

An alert is triggered if the number of results is greater than 0.

I'm facing a weird issue here. If the service is running on one of the VMs, this query returns no results in the Log Analytics Logs window, which is correct.

But I also receive an alert that the service is stopped. When I click "View 1 results" in the alert mail I received

[Image: Racheal2k_1-1625067042260.png]

it returns a status of 0, which means the service is stopped.

[Image: Racheal2k_2-1625067112798.png]

But if I execute the query again by selecting it, it returns no results.

[Image: Racheal2k_3-1625067377449.png]

I don't understand this behavior from Azure. The same query gives one result when it is run by the alert and a different result when it is run from the Log Analytics Logs page.

Could you help explain this?

Regards,

Racheal
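
One possible explanation, offered as an assumption because the thread does not show the alert rule's settings: a log alert rule evaluates the query over the rule's own time period, which can be much shorter than the `ago(30d)` written in the query. If the most recent "running" event of the healthy node falls outside that shorter period, the alert run only sees the "stopped" event and returns a row, while a manual run over 30 days sees both events and returns nothing. The sketch below imitates this with hypothetical events; `evaluationTime` and `lookback` are made-up names for the moment the rule runs and the period it uses.

```kql
// Hypothetical events: Node1 stopped 2 hours before evaluation, Node2 started 3 days before.
let events = datatable(TimeGenerated:datetime, Computer:string, Windows_Service_State:string)
[
    datetime(2021-06-30 10:00:00), "Node1", "stopped",
    datetime(2021-06-27 09:00:00), "Node2", "running"
];
let evaluationTime = datetime(2021-06-30 12:00:00); // assumed moment the alert rule runs
let lookback = 1d;                                  // assumed alert period; change to 30d to compare
events
| where TimeGenerated between ((evaluationTime - lookback) .. evaluationTime)
| summarize arg_max(TimeGenerated, Windows_Service_State) by Computer   // latest state per node
| summarize RunningNodes = countif(Windows_Service_State == "running"),
            ComputerNotOk = make_set_if(Computer, Windows_Service_State != "running")
| where RunningNodes == 0                            // a returned row means "alert"
```

With `lookback = 1d` only the Node1 event is in range, so a row comes back; with `lookback = 30d` the Node2 "running" event is included and the final filter removes the row. If this is the cause, comparing the alert rule's period with the time range used in the Logs window would be the first thing to check.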

 
