Machine not sedning pings

Contributor

Kusto query 

Heartbeat
| where TimeGenerated > ago(24h)
| where Computer != "NH-CMVMAAZ.networkhg.org.uk" and Computer != "UAT-WVD-REL86-0.networkhg.org.uk"
| summarize LastCall = max(TimeGenerated) by Computer, ComputerEnvironment
| where LastCall < ago(10m

 

I need assistance with this query, I don't want to be reported for the following servers in not sending pings, those severs get shutdown at 10:00pm UK time and starts at 6:00am uk time.

 

I don't want those servers to be reported from 10:00pm to 6:00am, how can I amend my existing query and make this possible

25 Replies

@Arslan11 

 

Look out for a Blog post on KQL and Time from me on the Sentinel blog, hopefully later this week.  Here we get just the "hours" from the TimeGenerated and use that to say, I only want this period of Hours between 07am and 22pm.  Please remove the "hour" column when you are happy this works as expected. 

 

Heartbeat
| where TimeGenerated > ago(1d)
| extend hour = datetime_part("hour", TimeGenerated)
| where hour between (07 .. 22)
| summarize LastCall = max(TimeGenerated) by Computer, ComputerEnvironment, hour
| where LastCall < ago(10m)
| order by hour asc 

   

@CliveWatsonThanks, you mentioned to remove the hour column, if I will do that, then the hour between will not work, or you want me to still remove it

 

 

Heartbeat
| where TimeGenerated > ago(1d)
| where Computer != "NH-CMVMAAZ.networkhg.org.uk" and Computer != "UAT-WVD-REL86-0.networkhg.org.uk"
//| where Computer == "demo2"
| extend hour = datetime_part("hour", TimeGenerated)
| where hour between (07 .. 22)
| summarize LastCall = max(TimeGenerated) by Computer, ComputerEnvironment, hour
| where LastCall < ago(10m)
| order by hour asc

 

@Arslan11 

 

Sorry I meant from the Summarize line (you do need it until then), summarize becomes this 

 

| summarize LastCall = max(TimeGenerated) by Computer, ComputerEnvironment

 

I just removed the ", hour" from the end of the line.

@CliveWatsonThanks, Heartbeat
| where TimeGenerated > ago(1d)
| where Computer != "NH-CMVMAAZ.networkhg.org.uk" and Computer != "UAT-WVD-REL86-0.networkhg.org.uk"
| where Computer == "NET-CCWALLBOARD.networkhg.org.uk" and Computer == "NET-FS3.networkhg.org.uk" and Computer == "NET-GISAPP1.networkhg.org.uk" and Computer == "NET-GISSQL1.networkhg.org.uk" and Computer == "NET-OVUAT2.networkhg.org.uk" and Computer == "NET-P2PTESTAPP1.networkhg.org.uk"
| extend hour = datetime_part("hour", TimeGenerated)
| where hour between (07 .. 22)
| summarize LastCall = max(TimeGenerated) by Computer, ComputerEnvironment,
| where LastCall < ago(10m)
| order by hour asc

 

I am getting, after I removed the hour, do I need to put the hour back ?

Query could not be parsed at '|' on line [8,0]

Token: |
Line: 8
Position: 0

 

@CliveWatsonI have amended by query

 

Heartbeat
| where TimeGenerated > ago(24h)
| where Computer != "NH-CMVMAAZ.networkhg.org.uk" and Computer != "UAT-WVD-REL86-0.networkhg.org.uk"
| where Computer == "NET-CCWALLBOARD.networkhg.org.uk" and Computer == "NET-FS3.networkhg.org.uk" and Computer == "NET-GISAPP1.networkhg.org.uk" and Computer == "NET-GISSQL1.networkhg.org.uk" and Computer == "NET-OVUAT2.networkhg.org.uk" and Computer == "NET-P2PTESTAPP1.networkhg.org.uk"
| extend hour = datetime_part("hour", TimeGenerated)
| where hour between (07 .. 22)
| summarize LastCall = max(TimeGenerated) by Computer, ComputerEnvironment

 

removed the hour from the last line, is that what you were asking for

@Arslan11

 

That's correct syntax, it totally up to you to remove the Hour column (it's probably useful when building/testing the query but not after that); your choice.... 

 

If you think it may be useful in the future, you could also comment it out rather than remove it?

 

e.g. 

| summarize LastCall = max(TimeGenerated) by Computer, ComputerEnvironment //, hour

@CliveWatsonjust to have better understanding on my logic.

 

why is it important to have the hour column, is it for testing purposes, when you want to see , which machines are not pinging in that hour and it will show the machines that are switched off, when testing the query

 

@Arslan11 

 

Just for testing, we create it here (line 1 below), in line 2 we use it to further filter the rows returned by the query - in this case those hours that start between 7am and 22pm. 

After that it isn't really needed in the display (optional).  I only added it to the summarise line, so I could check I'd done the query correctly.   You may like to keep it, to check I'm right? 

 

| extend hour = datetime_part("hour", TimeGenerated)
| where hour between (07 .. 22)

@CliveWatsonThanks for the clarification, the query is working as expected 

 

Those machines turned off at 10:00pm and I didn't get machine not sending pings alerts.

 

I will keep the summarise line, for my members of team, if they will run the query , they will be able to see other machines apart from the machines that we do not want to be monitored between 6:00 am and 10:00pm

 

Final query

Heartbeat
| where TimeGenerated > ago(24h)
| where Computer != "NH-CMVMAAZ.networkhg.org.uk" and Computer != "UAT-WVD-REL86-0.networkhg.org.uk"
| where Computer == "NET-CCWALLBOARD.networkhg.org.uk" and Computer == "NET-FS3.networkhg.org.uk" and Computer == "NET-GISAPP1.networkhg.org.uk" and Computer == "NET-GISSQL1.networkhg.org.uk" and Computer == "NET-OVUAT2.networkhg.org.uk" and Computer == "NET-P2PTESTAPP1.networkhg.org.uk" and Computer == "NH-AAHW2.networkhg.org.uk" and Computer == "NH-ADAPPP-02.networkhg.org.uk"
| extend hour = datetime_part("hour", TimeGenerated)
| where hour between (06 .. 22)
| summarize LastCall = max(TimeGenerated) by Computer, ComputerEnvironment, hour

 

@Arslan11 

 

You can also write the query like this (removing lots of the 'and Computer ==')

 

This uses IN and !IN  (in, and 'not in') https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/inoperator

 

Heartbeat
| where TimeGenerated > ago(24h)
| where Computer !in ("NH-CMVMAAZ.networkhg.org.uk","UAT-WVD-REL86-0.networkhg.org.uk")
| where Computer in ("NET-CCWALLBOARD.networkhg.org.uk","NET-FS3.networkhg.org.uk","NET-GISAPP1.networkhg.org.uk","NET-GISSQL1.networkhg.org.uk","NET-OVUAT2.networkhg.org.uk","NET-P2PTESTAPP1.networkhg.org.uk")
| extend hour = datetime_part("hour", TimeGenerated)
| where hour between (06 .. 22)
| summarize LastCall = max(TimeGenerated) by Computer, ComputerEnvironment  //, hour

 

Other great Resources to read are:

 

Best practise: https://docs.microsoft.com/en-us/azure/data-explorer/kusto/query/best-practices

Prefer using case-sensitive operators when applicable, as they are more performant. For example, prefer using == over =~in over in~, and contains_cs over contains (but if you can avoid contains/contains_cs altogether and use has/has_cs, that's even better).

 

https://docs.microsoft.com/en-us/azure/azure-monitor/log-query/query-optimization

@CliveWatson I tired this using the in and !in, I am afraid, it didn't work. You can see the results, is displaying the machines that are turned on.

 

I revert back to the old query, not using the in and !in

 

Arslan11_0-1588692074767.png

 

@CliveWatson 

 I tired this using the in and !in, I am afraid, it didn't work. You can see the results, is displaying the machines that are turned on.

 

I revert back to the old query, not using the in and !in

 

 

 

 

@CliveWatson  I tired this using the in and !in, I am afraid, it didn't work. You can see the results, is displaying the machines that are turned on.

I revert back to the old query, not using the in and !in

@Arslan11 

 

You said you had reverted, to not suing the IN and !in so I didn't reply again. Is the original query not working?

@CliveWatson I think the query isn't working properly because

 

Heartbeat  hour to monitor

| where TimeGenerated > ago(24h)

| where Computer != "NH-CMVMAAZ.networkhg.org.uk" and Computer != "UAT-WVD-REL86-0.networkhg.org.uk"

| where Computer == "NET-CCWALLBOARD.networkhg.org.uk" and Computer == "NET-FS3.networkhg.org.uk" and Computer == "NET-GISAPP1.networkhg.org.uk" and Computer == "NET-GISSQL1.networkhg.org.uk" and Computer == "NET-OVUAT2.networkhg.org.uk" and Computer == "NET-P2PTESTAPP1.networkhg.org.uk"

| extend hour = datetime_part("hour", TimeGenerated)

| where hour between (07 .. 22)

| summarize LastCall = max(TimeGenerated) by Computer, ComputerEnvironment

 

 

Because I was wondering it has been two days and I haven't recived a single alert for machine not sending pings.

 

I run another query to see, if we had any machines that were not pinging and there is one at 8:00am, which I didn't got alert about

 

Arslan11_0-1588775638318.png

Can you please have a look at my query again

 

 

The screen shot shows two servers, one is at 8:56 is that the one, you say is 8am? If the query is working, it may be the Alert that isn't setup right? Is this an Azure Monitor alert?

@CliveWatson Not an alert, just a query that  I run to see if there were any machines that weren't sending the  pings , and one machine came up at this time.

 

Arslan11_0-1588780305176.png

Can you please have a look at this query again, I still want to be alerted about other machines which is not sending the pings,  expect the one's which get's turn off at 10:00 pm and turn back on at 6:00 am as shown in the query below, which you helped

 

Heartbeat existing query

Heartbeat
| where TimeGenerated > ago(24h)
| where Computer != "NH-CMVMAAZ.networkhg.org.uk" and Computer != "UAT-WVD-REL86-0.networkhg.org.uk"
| where Computer == "NET-CCWALLBOARD.networkhg.org.uk" and Computer == "NET-FS3.networkhg.org.uk" and Computer == "NET-GISAPP1.networkhg.org.uk" and Computer == "NET-GISSQL1.networkhg.org.uk" and Computer == "NET-OVUAT2.networkhg.org.uk" and Computer == "NET-P2PTESTAPP1.networkhg.org.uk" and Computer == "NH-AAHW2.networkhg.org.uk" and Computer == "NH-ADAPPP-02.networkhg.org.uk" and Computer == "VM-WVD-REL86-0.networkhg.org.uk" and Computer == "VM-WVD-REL86-1.networkhg.org.uk" and Computer == "VM-WVD-REL86-2.networkhg.org.uk" and Computer == "VM-WVD-REL86-3.networkhg.org.uk" and Computer == "VM-WVD-REL86-4.networkhg.org.uk"
| extend hour = datetime_part("hour", TimeGenerated)
| where hour between (06 .. 22)
| summarize LastCall = max(TimeGenerated) by Computer, ComputerEnvironment 

@Arslan11 

 

I think I'm understanding your requirements a bit more now.  This now does the work in two phases, the first part deals with the shutdown servers in the time windows you specified.  I then join those with all the other servers, to show the lastCall for both (but none of the ones in the shutdown window).   I that right?  Please test and adjust the KQL yourself to suit your expected outcome.   

 

// please add a list of your servers here, these ones are the ones that are *shutdown* overnight
let shutdownComputers = dynamic(["rancher-node-1","rancher-node-2","rancher-node-3"]);
// config the hours to exclude
let startHour = 07;   // 7am
let endHour   = 22;   // 10pm
Heartbeat
// Get just the excluded Servers
| where TimeGenerated > startofday(ago(1d)) 
| where Computer in (shutdownComputers)
| summarize LastCall = arg_max( TimeGenerated, datetime_part("hour", TimeGenerated) between( startHour .. endHour) )
            by Computer, sComputer = strcat("Computer in OFFLINE list from ", startHour," to ", endHour," :",Computer), ComputerEnvironment
| where isnotempty(LastCall)
| project Computer , LastCall, sComputer
// Now join those excluded servers with the others...   
| join kind= fullouter  
 (
    Heartbeat
    | where TimeGenerated > startofday(ago(1d)) 
    | where Computer !in (shutdownComputers)
    | summarize LastCall = arg_max(TimeGenerated,*) by Computer
 ) on Computer
// This bit can probably be improved if I get time  
| extend Computer = iif(isempty(Computer),Computer1,Computer),
         LastCall = iif(isempty(LastCall),LastCall1,LastCall)
| summarize by LastCall, Computer, sComputer

 

Go to Log Analytics and run query

 

@CliveWatson   I did query accroding to my need.

 

Still not working, please let me know, where I went wrong.

 

// please add a list of your servers here, these ones are the ones that are *shutdown* overnight

let shutdownComputers = dynamic(["NET-CCWALLBOARD.networkhg.org.uk","NET-FS3.networkhg.org.uk","NET-GISAPP1.networkhg.org.uk","NET-GISSQL1.networkhg.org.uk","NET-OVUAT2.networkhg.org.uk","NET-P2PTESTAPP1.networkhg.org.uk"]);
// config the hours to exclude
let startHour = 22;
let endHour = 06;
Heartbeat
// Get just the excluded Servers
| where TimeGenerated > startofday(ago(1d))
| where Computer in (shutdownComputers)
| summarize LastCall = arg_max( TimeGenerated, datetime_part("hour", TimeGenerated) between( startHour .. endHour) )
by Computer, sComputer = strcat("Computer in OFFLINE list from ", startHour," to ", endHour," :",Computer), ComputerEnvironment
| where isnotempty(LastCall)
| project Computer , LastCall, sComputer
// Now join those excluded servers with the others...
| join kind= fullouter
(
Heartbeat
| where TimeGenerated > startofday(ago(1d))
| where Computer !in (shutdownComputers)
| summarize LastCall = arg_max(TimeGenerated,*) by Computer
) on Computer
// This bit can probably be improved if I get time
| extend Computer = iif(isempty("NH-CMVMAAZ.networkhg.org),),
LastCall = iif(isempty(LastCall),LastCall1,LastCall)
| summarize by LastCall, Computer, sComputer