Forum Discussion
Availability on OMS
- Feb 13, 2018
Sure. I tweaked it a bit to match what you ask for:
let start_time=startofday(datetime("2017-01-01"));
let end_time=endofday(datetime("2017-01-31"));
Heartbeat
| where TimeGenerated > start_time and TimeGenerated < end_time
| summarize heartbeat_per_hour=count() by bin_at(TimeGenerated, 1h, start_time), Computer
| extend available_per_hour=iff(heartbeat_per_hour>0, true, false)
| summarize total_available_hours=countif(available_per_hour==true) by Computer
| extend total_number_of_buckets=round((end_time-start_time)/1h)
| extend availability_rate=total_available_hours*100/total_number_of_buckets
The first 2 lines define variables, set to the start and end time you mentioned.
Next, we use these variables to limit the query to that time range:
| where TimeGenerated > start_time and TimeGenerated < end_time
Then we count the heartbeats reported from each computer, in buckets (bins) of 1 hour, starting at the start time you define:
| summarize heartbeat_per_hour=count() by bin_at(TimeGenerated, 1h, start_time), Computer
Now we can see how many heartbeats were reported by each computer each hour. If the number is 0 we understand the computer was probably offline at that time.
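A small sketch of how bin_at differs from bin, in case it helps: bin rounds down to a multiple of the bin size (so hourly bins start on the hour), while bin_at aligns the bins to a fixed point you supply. The sample timestamp and the 07:30 anchor below are made-up values for illustration, not data from this thread. When the fixed point is a midnight, as start_time is here, the two functions should return the same hourly buckets.

```kusto
// Hypothetical sample timestamp, chosen only to show the difference.
let sample = datetime(2018-07-01 07:45:00);
print
    // Bins aligned to the clock hour: 2018-07-01 07:00
    with_bin = bin(sample, 1h),
    // Bins aligned to a 07:30 anchor: 2018-07-01 07:30
    with_bin_at = bin_at(sample, 1h, datetime(2018-07-01 07:30:00)),
    // Anchoring at midnight matches plain bin(): true
    same_when_anchored_at_midnight = bin_at(sample, 1h, startofday(sample)) == bin(sample, 1h)
```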
We use a new column to mark if a computer was available or not each hour:
| extend available_per_hour=iff(heartbeat_per_hour>0, true, false)
and then count the number of hours each computer was indeed "alive":
| summarize total_available_hours=countif(available_per_hour==true) by Computer
Note that this way we give a little leeway for missing heartbeat reports each hour. Instead of expecting a report every 5 or 10 minutes, we only mark a computer as "unavailable" if we didn't get any report from it during a full hour.
At this point we get a number for each computer, something like this:
So we know each computer was alive for 11 hours in the selected time range. But what does that mean? How many hours were there altogether? Is this 11 out of 11 hours (100% availability) or 11 out of 110 hours (only 10% availability)?
Here's how we can calculate the total number of hours in the selected time range:
| extend total_number_of_buckets=round((end_time-start_time)/1h)
I admit it might not be the best calculation of buckets; there is probably a better way, but I can't think of it now.
Finally, we calculate the ratio between available hours and total hours:
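As a quick sanity check of the hour count for the January range, the expression can be evaluated on its own with print. This is only a sketch using the same start_time and end_time as the query; the column names exact_hours and bucket_count are made up for the example.

```kusto
let start_time = startofday(datetime("2017-01-01"));
let end_time = endofday(datetime("2017-01-31"));
print
    // Just under 744, because endofday() returns the last tick of Jan 31
    exact_hours = (end_time - start_time) / 1h,
    // Rounds to 744, i.e. 31 days x 24 hours
    bucket_count = round((end_time - start_time) / 1h)
```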
| extend availability_rate=total_available_hours*100/total_number_of_buckets
and get this:
HTH,
Noa
Stanislav is right, it's possible :)
Here's an example that calculates the availability rate of each computer, starting at midnight.
let midnight=startofday(now());
Heartbeat
| where TimeGenerated>midnight
| summarize heartbeat_per_hour=count() by bin_at(TimeGenerated, 1h, midnight), Computer
| extend available_per_hour=iff(heartbeat_per_hour>0, true, false)
| summarize total_available_hours=countif(available_per_hour==true) by Computer
| extend number_of_buckets=hourofday(now())+1
| extend availability_rate=total_available_hours*100/number_of_buckets
Run it on our playground and tweak it as makes sense to you.
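One thing to watch when tweaking it: countif() and hourofday() both return whole numbers, so the last division is an integer division and the rate is truncated. If you want decimals, multiplying by 100.0 instead of 100 should do it. The numbers below are hypothetical (7 available hours out of 9 elapsed buckets), just to show the difference.

```kusto
// Hypothetical values, not taken from a real workspace.
print total_available_hours = 7, number_of_buckets = 9
// Integer division truncates: 77
| extend truncated_rate = total_available_hours * 100 / number_of_buckets
// Real division keeps the fraction: 77.78
| extend fractional_rate = round(total_available_hours * 100.0 / number_of_buckets, 2)
```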
- ScottAllison (Iron Contributor), Oct 17, 2018
Love this query. I'm having trouble modifying it to meet my needs.
In addition to what this query provides, I'd also like to show the last TimeGenerated for each Computer. I can't seem to get the logic to work correctly. Any help is appreciated.
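One possible way to do this, as a sketch rather than a tested answer: carry the latest TimeGenerated of each hourly bucket through the first summarize, then take the maximum of those per computer in the second. The column names last_in_hour and last_heartbeat are made up for the example; the date range is the January one used elsewhere in this thread.

```kusto
let start_time = startofday(datetime("2017-01-01"));
let end_time = endofday(datetime("2017-01-31"));
Heartbeat
| where TimeGenerated > start_time and TimeGenerated < end_time
// Per computer and hour: heartbeat count plus the latest heartbeat in that hour
| summarize heartbeat_per_hour = count(), last_in_hour = max(TimeGenerated)
    by bin_at(TimeGenerated, 1h, start_time), Computer
// Per computer: hours with at least one heartbeat, and the latest heartbeat overall
| summarize total_available_hours = countif(heartbeat_per_hour > 0),
            last_heartbeat = max(last_in_hour)
    by Computer
| extend total_number_of_buckets = round((end_time - start_time) / 1h)
| extend availability_rate = total_available_hours * 100 / total_number_of_buckets
```

A computer that sent no heartbeats at all in the range will not appear in the result, since there are no rows to summarize for it.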
- Prashant Sharma (Brass Contributor), Apr 26, 2018
Is it a PowerShell script?
- Noa Kuperberg (Microsoft), Apr 26, 2018
- Micah Castorina (Copper Contributor), Aug 07, 2018
The script below works for me, but I am struggling to generate the report for Monday to Friday only and in my time zone; I just get errors. Thanks.
let start_time=startofday(datetime("2018-07-1 07:30:00"));
let end_time=endofday(datetime("2018-07-31 18:00:00"));
Heartbeat
| where TimeGenerated > start_time and TimeGenerated < end_time
| summarize heartbeat_per_hour=count() by bin_at(TimeGenerated, 1h, start_time), Computer
| extend available_per_hour=iff(heartbeat_per_hour>0, true, false)
| summarize total_available_hours=countif(available_per_hour==true) by Computer
| extend total_number_of_buckets=round((end_time-start_time)/1h)
| extend availability_rate=total_available_hours*100/total_number_of_buckets
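A rough sketch of one way to approach the weekday and time zone part, under several assumptions. TimeGenerated is stored in UTC, so the sketch shifts it by a fixed offset (utc_offset is a hypothetical value; replace it with your own, and note that a fixed offset ignores daylight saving changes). dayofweek() returns a timespan counted from Sunday, so Monday to Friday is 1d through 5d. The denominator is rebuilt with range so that only weekday hours are counted. Also note that startofday() and endofday() discard the time of day, so the 07:30 and 18:00 in the query above do not narrow the window.

```kusto
let utc_offset = 2h;                     // hypothetical offset from UTC, adjust to your zone
let start_time = datetime(2018-07-01);   // local wall-clock start (inclusive)
let end_time = datetime(2018-08-01);     // local wall-clock end (exclusive)
let last_hour = end_time - 1h;
// Number of hourly buckets that fall on Monday to Friday in the range
let weekday_hours = toscalar(
    range hour_start from start_time to last_hour step 1h
    | where dayofweek(hour_start) >= 1d and dayofweek(hour_start) <= 5d
    | count);
Heartbeat
| extend local_time = TimeGenerated + utc_offset
| where local_time >= start_time and local_time < end_time
// Keep Monday (1d) through Friday (5d) only
| where dayofweek(local_time) >= 1d and dayofweek(local_time) <= 5d
| summarize heartbeat_per_hour = count() by bin_at(local_time, 1h, start_time), Computer
| summarize total_available_hours = countif(heartbeat_per_hour > 0) by Computer
| extend availability_rate = total_available_hours * 100.0 / weekday_hours
```

Restricting to working hours as well (for example 07:30 to 18:00) would need a further filter on the time of day and a matching change to the denominator; that is left out here to keep the sketch short.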
- Dante Nahuel Ciai (Brass Contributor), Feb 12, 2018
Noa, your script is amazing; however, I'm struggling to understand it and tweak it to my needs (30 fixed days, for example from the 1st to the 31st of January).
Could you give me a hand to understand it?
let midnight=startofday(now());
// First part. I need to change this to between((2018-01-01) .. (2017-01-31)); am I correct?
Heartbeat
| where TimeGenerated>midnight
| summarize heartbeat_per_hour=count() by bin_at(TimeGenerated, 1h, midnight), Computer
// I'm not sure I understand why you use bin_at instead of just bin
| extend available_per_hour=iff(heartbeat_per_hour>0, true, false)
| summarize total_available_hours=countif(available_per_hour==true) by Computer
| extend number_of_buckets=hourofday(now())+1
| extend availability_rate=total_available_hours*100/number_of_buckets
- Noa Kuperberg (Microsoft), Feb 13, 2018
- jeffrey fowels (Copper Contributor), Oct 31, 2018
Awesome script, thanks.
- Dante Nahuel Ciai (Brass Contributor), Jan 16, 2018
Thank you for your help, I'm going to investigate that query a bit.
However, I'm not sure about that approach, because the heartbeat happens to stop working a lot even when the VM is perfectly fine.
But I understand the approach...
- Jan 16, 2018
Heartbeat should be running without issue; however, there are scenarios when you might not get data:
  - The Log Analytics service is down
  - Your machine has lost its Internet connection
  - The MMA agent service is stopped
  - The MMA agent is not functioning properly
- Dante Nahuel Ciai (Brass Contributor), Jan 16, 2018
The last situation, where the MMA agent is not working properly or has stopped, is exactly what worries me about building the availability report on heartbeats.