Forum Discussion
need to create monitoring queries to track the health status of data connectors
Hello,
What a coincident! Today i updated my own workbook with some tables that have dataconnectors listed as current status and a table next to it with all the events happened the last 30 days (or a what i select in a parameter for the workbook, ie: 1 day, 30 days, 90 days etc). On top of this, i have an analytic rule that creates alert when a connector is reporting unhealthy status. But as others mentioned below (clive) not all connectors talk the right way with the health table.
Analytic rule can be found here, so this is what i started with:
https://learn.microsoft.com/en-us/azure/sentinel/monitor-data-connector-health
So, i worked out my own KQL listed below where i work with the last record it saw and the status changes. Im sure that not all connectors will report health, but i will at least see all events of those who report when last write of data happened, when status changed, and when it changed back - thus also have a column estimating the "disconnected" time (failure duration). So i end up with a list of all events for the month, how long they where out for and when. :) Perfect to have in a workbook!
// Failures started in the last 30 days, with duration (until next Success or ongoing)
// Internetguy441
let endEval = now();
let startEval = ago(30d);
let base = SentinelHealth
| where OperationName == "Data fetch status change"
| where Status in ("Success","Failure");
let failures = base
| where Status == "Failure"
| where TimeGenerated between (startEval .. endEval)
| project Connector = iff(isempty(SentinelResourceName), tostring(SentinelResourceId), SentinelResourceName),
SentinelResourceId,
FailureStart = TimeGenerated;
let recoveries = base
| where Status == "Success"
| project SentinelResourceId, SuccessTime = TimeGenerated;
failures
| join kind=leftouter recoveries on SentinelResourceId
| where isnull(SuccessTime) or SuccessTime > FailureStart
| summarize RecoveryTime = minif(SuccessTime, SuccessTime > FailureStart)
by Connector, SentinelResourceId, FailureStart
| extend State = iff(isnull(RecoveryTime), "Ongoing", "Recovered")
| extend ["Failure duration"] = iff(isnull(RecoveryTime), endEval - FailureStart, RecoveryTime - FailureStart)
| project Connector,
State,
["Failure start [UTC]"] = FailureStart,
["Recovered at [UTC]"] = RecoveryTime,
["Failure duration"]
| order by ["Failure start [UTC]"] descExample output:
Then u can just create a tile or anything else your imagination can muster next to it with another query that look for current status (some might not have reported in awhile so you might need to pair it with "data freshness" (last reported etc).
Just example query you can build on:
// Current status of all connectors that emit SentinelHealth (no lookback limit)
// Internetguy441
SentinelHealth
| where OperationName == "Data fetch status change"
| where Status in ("Success","Failure")
| summarize arg_max(TimeGenerated, *) by SentinelResourceId
| project Connector = iff(isempty(SentinelResourceName), tostring(SentinelResourceId), SentinelResourceName),
["LastChange [UTC]"] = TimeGenerated,
StatusNow = Status,
IsHealthyNow = (Status == "Success")
| order by IsHealthyNow asc, ["LastChange [UTC]"] desc
Hope it helps you (and others) ~~