Forum Discussion
Availability on OMS
- Feb 13, 2018
Sure. I tweaked it a bit to match what you asked for:
let start_time=startofday(datetime("2017-01-01"));
let end_time=endofday(datetime("2017-01-31"));
Heartbeat
| where TimeGenerated > start_time and TimeGenerated < end_time
| summarize heartbeat_per_hour=count() by bin_at(TimeGenerated, 1h, start_time), Computer
| extend available_per_hour=iff(heartbeat_per_hour>0, true, false)
| summarize total_available_hours=countif(available_per_hour==true) by Computer
| extend total_number_of_buckets=round((end_time-start_time)/1h)
| extend availability_rate=total_available_hours*100/total_number_of_buckets
The first 2 lines define variables, set to the start and end time you mentioned.
Next, we use these variables to limit the query to that time range:
| where TimeGenerated > start_time and TimeGenerated < end_time
Then we count the heartbeats reported from each computer, in buckets (bins) of 1 hour, starting at the start time you define:
| summarize heartbeat_per_hour=count() by bin_at(TimeGenerated, 1h, start_time), Computer
Now we can see how many heartbeats were reported by each computer each hour. If the number is 0 we understand the computer was probably offline at that time.
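On the question of `bin_at` versus plain `bin`: as far as I understand it, `bin` aligns buckets to fixed boundaries (whole hours for `1h`), while `bin_at` aligns them so that the fixed point you pass in is a bucket edge. A minimal sketch with made-up timestamps and a deliberately off-the-hour fixed point (both are illustrative assumptions, not data from this thread):

```kusto
// Made-up timestamps; the fixed point 00:30 is chosen off the hour on purpose.
datatable(TimeGenerated: datetime)
[
    datetime(2017-01-01 00:10:00),
    datetime(2017-01-01 00:50:00),
    datetime(2017-01-01 01:20:00)
]
| extend by_bin    = bin(TimeGenerated, 1h),
         by_bin_at = bin_at(TimeGenerated, 1h, datetime(2017-01-01 00:30:00))
// 00:10 -> by_bin 2017-01-01 00:00, by_bin_at 2016-12-31 23:30
// 00:50 -> by_bin 2017-01-01 00:00, by_bin_at 2017-01-01 00:30
// 01:20 -> by_bin 2017-01-01 01:00, by_bin_at 2017-01-01 00:30
```

When the fixed point is itself on the hour, as with `startofday(...)`, both functions should produce the same buckets. The difference only shows up with a fixed point such as `ago(3d)`.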
We use a new column to mark if a computer was available or not each hour:
| extend available_per_hour=iff(heartbeat_per_hour>0, true, false)
and then count the number of hours each computer was indeed "alive":
| summarize total_available_hours=countif(available_per_hour==true) by Computer
Note that this way we give a little leeway for missing heartbeat reports each hour. Instead of expecting a report every 5 or 10 minutes, we only mark a computer as "unavailable" if we didn't get any report from it during a full hour.
At this point we get a number for each computer, something like this:
So we know each computer was alive for 11 hours in the selected time range. But what does that mean? How many hours were there altogether? Is this 11 out of 11 hours (100% availability) or 11 out of 110 hours (only 10% availability)?
Here's how we can calculate the total number of hours in the selected time range:
| extend total_number_of_buckets=round((end_time-start_time)/1h)
I admit it might not be the best calculation of buckets. There is probably a better way, but I can't think of it now.
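One possible way to avoid the rounding, sketched under the assumption that the end of the range is written as an exclusive boundary (midnight of February 1st) instead of `endofday` of January 31st. The name `end_exclusive` is made up for this example:

```kusto
let start_time = datetime(2017-01-01);
let end_exclusive = datetime(2017-02-01);   // assumption: exclusive end instead of endofday(2017-01-31)
// Dividing one timespan by another gives the number of whole hours in the range.
print total_number_of_buckets = toint((end_exclusive - start_time) / 1h)   // 31 days * 24 hours = 744
```

The matching filter would then be `TimeGenerated >= start_time and TimeGenerated < end_exclusive`, so every heartbeat falls into exactly one of the counted buckets.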
Finally, we calculate the ratio between available hours and total hours:
| extend availability_rate=total_available_hours*100/total_number_of_buckets
and get this:
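To try the whole pipeline without a workspace, here is a small self-contained sketch. The `datatable` rows, the computer names `srv-a` and `srv-b`, and the 4-hour range are all made up for illustration:

```kusto
let start_time = datetime(2017-01-01 00:00:00);
let end_time = datetime(2017-01-01 04:00:00);   // made-up range: 4 hourly buckets
datatable(TimeGenerated: datetime, Computer: string)
[
    datetime(2017-01-01 00:05:00), "srv-a",
    datetime(2017-01-01 00:35:00), "srv-a",
    datetime(2017-01-01 01:10:00), "srv-a",
    datetime(2017-01-01 02:15:00), "srv-a",
    datetime(2017-01-01 03:40:00), "srv-a",
    datetime(2017-01-01 00:20:00), "srv-b",
    datetime(2017-01-01 03:55:00), "srv-b"
]
| where TimeGenerated > start_time and TimeGenerated < end_time
| summarize heartbeat_per_hour=count() by bin_at(TimeGenerated, 1h, start_time), Computer
| extend available_per_hour=iff(heartbeat_per_hour>0, true, false)
| summarize total_available_hours=countif(available_per_hour==true) by Computer
| extend total_number_of_buckets=round((end_time-start_time)/1h)
| extend availability_rate=total_available_hours*100/total_number_of_buckets
// srv-a: 4 available hours out of 4 -> 100
// srv-b: 2 available hours out of 4 -> 50
```

Note that an hour with no heartbeats produces no row at all in the first `summarize`, so `srv-b` ends up with two rows rather than four rows with two zero counts. The final numbers are the same either way.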
HTH,
Noa
Noa, your script is amazing; however, I'm struggling to understand it and tweak it to my needs (a fixed month, for example from the 1st to the 31st of January).
Could you give me a hand to understand it?
let midnight=startofday(now()); # First part. I need to change this to between((2018-01-01) .. (2017-01-31)); am I correct?
Heartbeat
| where TimeGenerated>midnight
| summarize heartbeat_per_hour=count() by bin_at(TimeGenerated, 1h, midnight), Computer # I'm not sure I understand why you use bin_at instead of just bin
| extend available_per_hour=iff(heartbeat_per_hour>0, true, false)
| summarize total_available_hours=countif(available_per_hour==true) by Computer
| extend number_of_buckets=hourofday(now())+1
| extend availability_rate=total_available_hours*100/number_of_buckets
- jeffrey fowels, Oct 31, 2018, Copper Contributor
Awesome script, thanks.
- Dilip Vyas, Sep 12, 2018, Copper Contributor
Can we get availability for the past 10 days instead of adding a start date and an end date?
- Dilip Vyas, Sep 12, 2018, Copper Contributor
Thanks, but I got the answer:
let month = startofday(ago(3d));
Heartbeat
| where TimeGenerated>ago(3d)
| summarize heartbeat_per_hour=count() by bin_at(TimeGenerated, 1h, (ago(3d))), Computer
| extend available_per_hour=iff(heartbeat_per_hour>0, true, false)
| summarize total_available_hours=countif(available_per_hour==true) by Computer
| extend total_number_of_buckets= round((now()-month)/1h)-2
| extend availability_rate=total_available_hours*100/total_number_of_buckets
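A variant of the same idea with the look-back period in one place, so the filter, the bucket alignment, and the bucket count all use the same start time. This is only a sketch: the names `lookback`, `start_time`, and `end_time` are made up, and it assumes the same `Heartbeat` columns used elsewhere in this thread:

```kusto
let lookback = 10d;                          // hypothetical parameter: how far back to look
let start_time = startofday(ago(lookback));  // aligned to midnight, so buckets start on the hour
let end_time = now();
Heartbeat
| where TimeGenerated >= start_time and TimeGenerated < end_time
| summarize heartbeat_per_hour=count() by bin_at(TimeGenerated, 1h, start_time), Computer
| summarize total_available_hours=countif(heartbeat_per_hour>0) by Computer
// ceiling() counts the current, partially elapsed hour as one bucket
| extend total_number_of_buckets=ceiling((end_time-start_time)/1h)
| extend availability_rate=total_available_hours*100.0/total_number_of_buckets
```

Because the start is a whole day boundary, the range covers slightly more than `lookback` (up to one extra day), which may or may not be what you want.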
- Prashant Sharma, Aug 07, 2018, Brass Contributor
Hey Noa,
Can we get one year of details with this script?
- GouravIN, Sep 12, 2018, Brass Contributor
Hi Prashant,
You can get that data by changing the dates, but make sure your data retention policy keeps data for the last year.
You can check here:-
- Prashant Sharma, Sep 12, 2018, Brass Contributor
Hi Gaurav,
I got it,
Thank You
- Erik Westergren, Feb 25, 2018, Copper Contributor
Hi,
Thanks for an excellent code sample.
I would like to extend the query to also support specified time intervals and smaller uptime checks (heartbeat).
# Service levels
Ex: Service agreements are based on 3 categories
S1 = 07:00 - 17:00 Weekdays
S2 = 07:00 - 22:00 Weekdays
365/7 = Always (already supported by your query)
= Uptime should be calculated based on service agreement hours/days
Time should also be converted to UTC +1
- will this do the trick? =>
Heartbeat | extend Timegenerated = TimeGenerated + 1h
I checked the samples for endofday/week, but I am unable to get it to work in your sample.
# Intervals
extend available_per_hour=iff(heartbeat_per_hour>0, true, false)
= can this be adjusted to heartbeat per 30 min / 15 min?
Any ideas?
Br
erik
- Noa Kuperberg, Apr 08, 2018, Microsoft
Hi Erik,
To adjust for the service agreement, you can calculate the start time and end time like this:
let raw_date = datetime("2017-01-01");
let start_date = case("SLA" in ("S1", "S2"),
    case(dayofweek(raw_date)==0, startofday(raw_date+1d)+7h,
         dayofweek(raw_date)==6, startofday(raw_date+2d)+7h,
         startofday(raw_date)+7h),
    raw_date);
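For the denominator under a service window, one option might be to generate the expected hourly buckets and keep only those inside the window. A sketch for S1 (07:00 to 17:00 on weekdays) over January 2017, evaluated in UTC. The names `start_time`, `last_bucket`, and `bucket` are made up, and the shift to UTC+1 mentioned in the question is left out:

```kusto
let start_time = datetime(2017-01-01);
let last_bucket = datetime(2017-01-31 23:00:00);   // start of the last hourly bucket in January
range bucket from start_time to last_bucket step 1h
| where dayofweek(bucket) between (1d .. 5d)       // Monday to Friday (Sunday is 0d)
| where hourofday(bucket) between (7 .. 16)        // buckets starting 07:00 through 16:00
| count                                            // 22 weekdays * 10 hours = 220
```

Applying the same two `where` filters to `TimeGenerated` before the hourly `summarize` should give a numerator that matches this denominator.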
On the intervals - it can be adjusted any way you need, just use `bin(fieldname, 30m)` instead of `bin(fieldname, 1h)`.
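If the bucket size changes, the divisor in the bucket count has to change with it, otherwise the rate is still computed against hourly buckets. A sketch with the bucket size pulled into a variable (`bucket_size`, `heartbeats`, and `available_buckets` are names made up for this example):

```kusto
let bucket_size = 30m;   // hypothetical parameter; 15m would work the same way
let start_time = startofday(datetime(2017-01-01));
let end_time = endofday(datetime(2017-01-31));
Heartbeat
| where TimeGenerated > start_time and TimeGenerated < end_time
| summarize heartbeats=count() by bin_at(TimeGenerated, bucket_size, start_time), Computer
| summarize available_buckets=countif(heartbeats>0) by Computer
// Same divisor as the bin size: 744 hours * 2 = 1488 buckets for 30m
| extend total_number_of_buckets=round((end_time-start_time)/bucket_size)
| extend availability_rate=available_buckets*100/total_number_of_buckets
```

Smaller buckets leave less leeway for a missed heartbeat report, so the rate will likely come out lower than with hourly buckets.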
- George Stavar, Feb 23, 2018, Copper Contributor
Excellent query. How can "let midnight=startofday(now())" be altered to use my local time zone? If I run this as is, it seems to be my time +7, and the number of hours doesn't match up.
- Noa Kuperberg, Apr 08, 2018, Microsoft
Thanks George.
To adjust for the local time zone you can do this:
let midnight=startofday(now())-7h
- Dante Nahuel Ciai, Feb 15, 2018, Brass Contributor
Noa, amazing, thank you so much.