Forum Discussion
Machine not sending pings
You said you had reverted to not using the in and !in operators, so I didn't reply again. Is the original query not working?
CliveWatson I think the query isn't working properly because
Heartbeat hour to monitor
| where TimeGenerated > ago(24h)
| where Computer != "NH-CMVMAAZ.networkhg.org.uk" and Computer != "UAT-WVD-REL86-0.networkhg.org.uk"
| where Computer == "NET-CCWALLBOARD.networkhg.org.uk" and Computer == "NET-FS3.networkhg.org.uk" and Computer == "NET-GISAPP1.networkhg.org.uk" and Computer == "NET-GISSQL1.networkhg.org.uk" and Computer == "NET-OVUAT2.networkhg.org.uk" and Computer == "NET-P2PTESTAPP1.networkhg.org.uk"
| extend hour = datetime_part("hour", TimeGenerated)
| where hour between (07 .. 22)
| summarize LastCall = max(TimeGenerated) by Computer, ComputerEnvironment
I ask because it has been two days and I haven't received a single alert for a machine not sending pings.
I ran another query to see if we had any machines that were not pinging, and there was one at 8:00 am that I didn't get an alert about.
Can you please have a look at my query again?
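One likely cause, offered as a sketch rather than a confirmed diagnosis: a filter of the form Computer == "A" and Computer == "B" can never be true, because a single Computer value cannot equal two different names at once, so that where line drops every row and nothing is left to alert on. The example below uses a small datatable with made-up timestamps in place of the real Heartbeat table (SampleHeartbeat and the NH-EXAMPLE name are hypothetical) and matches the listed machines with the in operator instead.

```kql
// Stand-in for the Heartbeat table; rows and timestamps are made up for illustration.
// NH-EXAMPLE.networkhg.org.uk is a hypothetical machine that is not in the list.
let SampleHeartbeat = datatable(TimeGenerated: datetime, Computer: string)
[
    datetime(2020-05-07 08:00:00), "NET-FS3.networkhg.org.uk",
    datetime(2020-05-07 08:01:00), "NET-GISAPP1.networkhg.org.uk",
    datetime(2020-05-07 08:02:00), "NH-EXAMPLE.networkhg.org.uk"
];
let shutdownComputers = dynamic(["NET-FS3.networkhg.org.uk", "NET-GISAPP1.networkhg.org.uk"]);
SampleHeartbeat
// Chaining == with "and" would be false for every row, so match against the list instead
| where Computer in (shutdownComputers)
| summarize LastCall = max(TimeGenerated) by Computer
```

With in, the two listed machines are kept and the third is filtered out; using !in with the same list does the opposite.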
- Arslan11 May 07, 2020 Brass Contributor
CliveWatson Thanks for all your help and for bearing with me; my query is finally working.
It is doing the right thing and excluding those machines. If I don't get an alert tonight, that means it is also skipping the ones that shut down at 10:00 pm.
As you described, startHour is 7, when the machines are started, and endHour is 22 (10:00 pm), when the machines are stopped:
let startHour = 07; // 7am
let endHour = 22; // 10pm
I have also removed the last line, as it was only used for testing the query:
| where LastCall < ago(10m)
Thanks, I am finally getting the logic.
- CliveWatson May 07, 2020 Former Employee
Like this maybe?
// please add a list of your servers here, these ones are the ones that are *shutdown* overnight
let shutdownComputers = dynamic(["rancher-node-1","rancher-node-2","rancher-node-3"]);
// always exclude these computers
let excludeComputers = dynamic(["demo1","demo2","demo3","node-4"]);
// config the hours to exclude
let startHour = 07; // 7am
let endHour = 22; // 10pm
Heartbeat
// Get just the excluded Servers
| where TimeGenerated > startofday(ago(1d))
| where Computer in (shutdownComputers)
| summarize LastCall = arg_max( TimeGenerated, datetime_part("hour", TimeGenerated) between( startHour .. endHour) ) by Computer, sComputer = strcat("Computer in OFFLINE list from ", startHour," to ", endHour," :",Computer), ComputerEnvironment
| where isnotempty(LastCall)
| project Computer , LastCall, sComputer
// Now join those excluded servers with the others...
| join kind= fullouter (
    Heartbeat
    | where TimeGenerated > startofday(ago(1d))
    | where Computer !in (shutdownComputers) and Computer !in(excludeComputers)
    | summarize LastCall = arg_max(TimeGenerated,*) by Computer
) on Computer
// This bit can probably be improved if I get time
| extend Computer = iif(isempty(Computer),Computer1,Computer), LastCall = iif(isempty(LastCall),LastCall1,LastCall)
| summarize by LastCall, Computer, sComputer
| where LastCall < ago(10m)

// please add a list of your servers here, these ones are the ones that are *shutdown* overnight
let shutdownComputers = dynamic(["rancher-node-1","rancher-node-2","rancher-node-3"]);
// always exclude these computers
let excludeComputers = dynamic(["demo1","demo2","demo3","node-4"]);
...
...
Heartbeat
| where TimeGenerated > startofday(ago(1d))
| where Computer !in (shutdownComputers) and Computer !in(excludeComputers)
| summarize LastCall = arg_max(TimeGenerated,*) by Computer
- Arslan11 May 07, 2020 Brass Contributor
CliveWatson Perfect, the KQL is working as expected. One final thing to be done, then it's all done.
All the machines shown in the screenshot are stopped permanently. How can I stop them from being reported by my existing query?
// config the hours to exclude
let startHour = 06;
let endHour = 22;
Heartbeat
// Get just the excluded Servers
| where TimeGenerated > startofday(ago(24h))
| where Computer in (shutdownComputers)
| summarize LastCall = arg_max( TimeGenerated, datetime_part("hour", TimeGenerated) between( startHour .. endHour) ) by Computer, sComputer = strcat("Computer in OFFLINE list from ", startHour," to ", endHour," :",Computer), ComputerEnvironment
| where isnotempty(LastCall)
| project Computer , LastCall, sComputer
// Now join those excluded servers with the others...
| join kind= fullouter (
    Heartbeat
    | where TimeGenerated > startofday(ago(24h))
    | summarize LastCall = arg_max(TimeGenerated,*) by Computer
) on Computer
// This bit can probably be improved if I get time
| extend Computer = iif(isempty(Computer),Computer1,Computer), LastCall = iif(isempty(LastCall),LastCall1,LastCall)
| summarize by LastCall, Computer, sComputer
| where LastCall < ago(10m)
Should I add another join kind= fullouter and then add this?
Heartbeat
| where TimeGenerated > ago(24h)
| where Computer != "computer to be excluded"
// or Computer != "aaaa"
| summarize LastCall = max(TimeGenerated) by Computer, ComputerEnvironment
| where LastCall < ago(10m)
Or is there another way to do it? This is the final thing to be done.
- CliveWatson May 07, 2020 Former Employee
So the requirements are:
1. I would like to know if any machine is not sending pings: all computers
2. except the machines that shut down at 10:00 pm and start at 6:00 am (see list)
3. it should still report if not sending pings between 7:00 am and 9:00 pm
So for #3, is that all machines, including those excluded by #2?
The Query returns all servers, and the last record received (unless they are excluded within certain hours).
Have you added this back as the last line?
| where LastCall < ago(10m)
- Arslan11 May 06, 2020 Brass Contributor
CliveWatson I was unable to send a private message, so I am posting it here.
Sorry for confusing you. Here is exactly what I want my query to do when it is set up as an alert.
I would like to know if any machine is not sending pings, except machines that shut down at 10:00 pm and start at 6:00 am, but it should still report if they are not sending pings between 7:00 am and 9:00 pm.
Machines that shut down:
https://portal.azure.com/#@networkhomes.org.uk/resource/subscriptions/206bebf0-39bd-4a14-a394-f426cf0f34c8/resourceGroups/rg-vm_ccwallboard-prod-1/providers/Microsoft.Compute/virtualMachines/NET-CCWALLBOARD1
https://portal.azure.com/#@networkhomes.org.uk/resource/subscriptions/206bebf0-39bd-4a14-a394-f426cf0f34c8/resourceGroups/RG-VM_FS3-PROD-1/providers/Microsoft.Compute/virtualMachines/Net-fs3
https://portal.azure.com/#@networkhomes.org.uk/resource/subscriptions/206bebf0-39bd-4a14-a394-f426cf0f34c8/resourceGroups/RG-VM_GISAPP-PROD-1/providers/Microsoft.Compute/virtualMachines/NET-GISAPP1
https://portal.azure.com/#@networkhomes.org.uk/resource/subscriptions/206bebf0-39bd-4a14-a394-f426cf0f34c8/resourceGroups/RG-VM_GISSQL-PROD-1/providers/Microsoft.Compute/virtualMachines/NET-GISSQL1
https://portal.azure.com/#@networkhomes.org.uk/resource/subscriptions/206bebf0-39bd-4a14-a394-f426cf0f34c8/resourceGroups/rg-vm_ovuat-prod-1/providers/Microsoft.Compute/virtualMachines/NET-OVUAT2
https://portal.azure.com/#@networkhomes.org.uk/resource/subscriptions/206bebf0-39bd-4a14-a394-f426cf0f34c8/resourceGroups/RG-VM_P2PTESTAPP-PROD-1/providers/Microsoft.Compute/virtualMachines/NET-P2PTESTAPP1
https://portal.azure.com/#@networkhomes.org.uk/resource/subscriptions/206bebf0-39bd-4a14-a394-f426cf0f34c8/resourceGroups/RG-VM_AAHW-PROD-1/providers/Microsoft.Compute/virtualMachines/NH-AAHW2
https://portal.azure.com/#@networkhomes.org.uk/resource/subscriptions/206bebf0-39bd-4a14-a394-f426cf0f34c8/resourceGroups/rg-vm_adappp-prod-1/providers/Microsoft.Compute/virtualMachines/NH-ADAPPP-02
https://portal.azure.com/#@networkhomes.org.uk/resource/subscriptions/206bebf0-39bd-4a14-a394-f426cf0f34c8/resourceGroups/rg-vm_cmvmaaz-prod-1/providers/Microsoft.Compute/virtualMachines/NH-CMVMAAZ
But the query is really confusing: it is displaying several machines that should not appear, as those machines are turned on and sending pings.
Query
let shutdownComputers = dynamic(["NET-CCWALLBOARD.networkhg.org.uk","NET-FS3.networkhg.org.uk","NET-GISAPP1.networkhg.org.uk","NET-GISSQL1.networkhg.org.uk","NET-OVUAT2.networkhg.org.uk","NET-P2PTESTAPP1.networkhg.org.uk"]);
// config the hours to exclude
let startHour = 06;
let endHour = 22;
Heartbeat
// Get just the excluded Servers
| where TimeGenerated > startofday(ago(1h))
| where Computer in (shutdownComputers)
| summarize LastCall = arg_max( TimeGenerated, datetime_part("hour", TimeGenerated) between( startHour .. endHour) ) by Computer, sComputer = strcat("Computer in OFFLINE list from ", startHour," to ", endHour," :",Computer), ComputerEnvironment
| where isnotempty(LastCall)
| project Computer , LastCall, sComputer
// Now join those excluded servers with the others...
| join kind= fullouter (
    Heartbeat
    | where TimeGenerated > startofday(ago(1h))
    | summarize LastCall = arg_max(TimeGenerated,*) by Computer
) on Computer
// This bit can probably be improved if I get time
| extend Computer = iif(isempty(Computer),Computer1,Computer), LastCall = iif(isempty(LastCall),LastCall1,LastCall)
| summarize by LastCall, Computer, sComputer
Results
- CliveWatson May 06, 2020 Former Employee
Hello, you only needed to change line 1, not the second-to-last line as well. I cannot tell what is not working without the results or the error. This thread is probably getting too long. Maybe private message me the results, a screenshot, or a CSV file?
- Arslan11 May 06, 2020 Brass Contributor
CliveWatson I adjusted the query according to my needs.
It is still not working. Please let me know where I went wrong.
// please add a list of your servers here, these ones are the ones that are *shutdown* overnight
let shutdownComputers = dynamic(["NET-CCWALLBOARD.networkhg.org.uk","NET-FS3.networkhg.org.uk","NET-GISAPP1.networkhg.org.uk","NET-GISSQL1.networkhg.org.uk","NET-OVUAT2.networkhg.org.uk","NET-P2PTESTAPP1.networkhg.org.uk"]);
// config the hours to exclude
let startHour = 22;
let endHour = 06;
Heartbeat
// Get just the excluded Servers
| where TimeGenerated > startofday(ago(1d))
| where Computer in (shutdownComputers)
| summarize LastCall = arg_max( TimeGenerated, datetime_part("hour", TimeGenerated) between( startHour .. endHour) ) by Computer, sComputer = strcat("Computer in OFFLINE list from ", startHour," to ", endHour," :",Computer), ComputerEnvironment
| where isnotempty(LastCall)
| project Computer , LastCall, sComputer
// Now join those excluded servers with the others...
| join kind= fullouter (
    Heartbeat
    | where TimeGenerated > startofday(ago(1d))
    | where Computer !in (shutdownComputers)
    | summarize LastCall = arg_max(TimeGenerated,*) by Computer
) on Computer
// This bit can probably be improved if I get time
| extend Computer = iif(isempty("NH-CMVMAAZ.networkhg.org),),LastCall = iif(isempty(LastCall),LastCall1,LastCall)
| summarize by LastCall, Computer, sComputer
- CliveWatson May 06, 2020 Former Employee
I think I'm understanding your requirements a bit more now. This now does the work in two phases. The first part deals with the shutdown servers in the time window you specified. I then join those with all the other servers, to show the LastCall for both (but none of the ones in the shutdown window). Is that right? Please test and adjust the KQL yourself to suit your expected outcome.
// please add a list of your servers here, these ones are the ones that are *shutdown* overnight
let shutdownComputers = dynamic(["rancher-node-1","rancher-node-2","rancher-node-3"]);
// config the hours to exclude
let startHour = 07; // 7am
let endHour = 22; // 10pm
Heartbeat
// Get just the excluded Servers
| where TimeGenerated > startofday(ago(1d))
| where Computer in (shutdownComputers)
| summarize LastCall = arg_max( TimeGenerated, datetime_part("hour", TimeGenerated) between( startHour .. endHour) ) by Computer, sComputer = strcat("Computer in OFFLINE list from ", startHour," to ", endHour," :",Computer), ComputerEnvironment
| where isnotempty(LastCall)
| project Computer , LastCall, sComputer
// Now join those excluded servers with the others...
| join kind= fullouter (
    Heartbeat
    | where TimeGenerated > startofday(ago(1d))
    | where Computer !in (shutdownComputers)
    | summarize LastCall = arg_max(TimeGenerated,*) by Computer
) on Computer
// This bit can probably be improved if I get time
| extend Computer = iif(isempty(Computer),Computer1,Computer), LastCall = iif(isempty(LastCall),LastCall1,LastCall)
| summarize by LastCall, Computer, sComputer
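To make the hour window used in these queries easier to check, here is a minimal sketch that evaluates it for every hour of the day. It relies on between being inclusive at both ends, so hour 22 (22:00 to 22:59) counts as inside a 7 .. 22 window. It also shows why a reversed range such as 22 .. 6 never matches, and one possible way to express a window that wraps past midnight. The column names InWindow, ReversedRange and Overnight are made up for illustration.

```kql
let startHour = 7;   // 7am
let endHour = 22;    // 10pm
range hour from 0 to 23 step 1
// Inclusive at both ends: true for hours 7 through 22
| extend InWindow = hour between (startHour .. endHour)
// Lower bound above upper bound: no hour satisfies hour >= 22 and hour <= 6
| extend ReversedRange = hour between (22 .. 6)
// One way to express an overnight window that wraps past midnight
| extend Overnight = hour >= 22 or hour < 6
```

Running this returns one row per hour, which makes it straightforward to confirm which hours a given startHour and endHour actually cover before wiring the window into an alert.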
- Arslan11 May 06, 2020 Brass Contributor
CliveWatson Not an alert, just a query that I ran to see if there were any machines that weren't sending pings, and one machine came up at this time.
Can you please have a look at this query again? I still want to be alerted about other machines that are not sending pings, except the ones that get turned off at 10:00 pm and turned back on at 6:00 am, as shown in the query below, which you helped with.
Heartbeat existing query
Heartbeat
| where TimeGenerated > ago(24h)
| where Computer != "NH-CMVMAAZ.networkhg.org.uk" and Computer != "UAT-WVD-REL86-0.networkhg.org.uk"
| where Computer == "NET-CCWALLBOARD.networkhg.org.uk" and Computer == "NET-FS3.networkhg.org.uk" and Computer == "NET-GISAPP1.networkhg.org.uk" and Computer == "NET-GISSQL1.networkhg.org.uk" and Computer == "NET-OVUAT2.networkhg.org.uk" and Computer == "NET-P2PTESTAPP1.networkhg.org.uk" and Computer == "NH-AAHW2.networkhg.org.uk" and Computer == "NH-ADAPPP-02.networkhg.org.uk" and Computer == "VM-WVD-REL86-0.networkhg.org.uk" and Computer == "VM-WVD-REL86-1.networkhg.org.uk" and Computer == "VM-WVD-REL86-2.networkhg.org.uk" and Computer == "VM-WVD-REL86-3.networkhg.org.uk" and Computer == "VM-WVD-REL86-4.networkhg.org.uk"
| extend hour = datetime_part("hour", TimeGenerated)
| where hour between (06 .. 22)
| summarize LastCall = max(TimeGenerated) by Computer, ComputerEnvironment
- CliveWatson May 06, 2020 Former Employee
The screenshot shows two servers; one is at 8:56. Is that the one you say is at 8am? If the query is working, it may be the alert that isn't set up right. Is this an Azure Monitor alert?