SOLVED

IoT Hub Device twin returning connectionState = Disconnected, when it's actually connected

I have a monitoring web job that queries my IoT devices for any with a connection state of Disconnected, and then emails the user of that device to let them know it has a connection issue. I'm seeing a problem where a device will look like it is down about once every hour (or 1 hour and 5 minutes), but the device is not down and there is no intermittent connection issue. I currently have 3 devices, all connected through different ISPs. One system started having this issue every so often, then another did, and now the third one is having it too.

 

My web job code is pretty simple and runs every 5 minutes to see if there are any devices that are down. Here's my code:

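    using System;
    using System.Threading.Tasks;
    using Microsoft.Azure;                 // CloudConfigurationManager
    using Microsoft.Azure.Devices;         // RegistryManager, IQuery
    using Microsoft.Azure.Devices.Shared;  // Twin
    using Microsoft.Azure.WebJobs;         // TimerTrigger, TimerInfo

    // Runs every 5 minutes (and once on startup) and sends a notification
    // for each device that the registry query reports as disconnected.
    public static async Task CheckDeviceAvailability([TimerTrigger("00:05:00", RunOnStartup = true)] TimerInfo timer)
    {
        string connectionString = CloudConfigurationManager.GetSetting("IotHubConnectionString");
        try
        {
            // Dispose the registry client at the end of each run.
            using (RegistryManager manager = RegistryManager.CreateFromConnectionString(connectionString))
            {
                // Fetch matching twins in pages of 10.
                IQuery query = manager.CreateQuery("SELECT * FROM devices where ConnectionState = 'Disconnected'", 10);
                while (query.HasMoreResults)
                {
                    var page = await query.GetNextAsTwinAsync();
                    foreach (Twin twin in page)
                    {
                        await SendNotification(twin.DeviceId);
                    }
                }
            }
        }
        catch (Exception exc)
        {
            // Log and swallow so the next timer run still happens.
            Console.WriteLine(exc.Message);
        }
    }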

 

I've just added a check (if (twin.ConnectionState == DeviceConnectionState.Disconnected)) above the await SendNotification line in the code above to see if it helps, but it seems like the query shouldn't be returning any connected devices in the first place.

19 Replies
best response confirmed by OlivierBloch (Microsoft)
Solution

Hi!

You should have a look here: Device heartbeat

 

"

The IoT Hub identity registry contains a field called connectionState. Only use the connectionState field during development and debugging. IoT solutions should not query the field at run time. For example, do not query the connectionState field to check if a device is connected before you send a cloud-to-device message or an SMS. We recommend subscribing to the device disconnected event on Event Grid to get alerts and monitor the device connection state. Use this tutorial to learn how to integrate Device Connected and Device Disconnected events from IoT Hub in your IoT solution.

"

 

Hope that helps!

Thanks

Thanks, I will take a look and make changes accordingly.

The tutorial seems overly complicated. I don't want to have a Cosmos DB to store connection state info just to send an email.

Seems like I could create a subscription on the IoT hub and have it put a disconnection/connection message into a queue, and have my webjob listen on this queue for connection state changes and then send out emails. I can't find much info on this, and only the connection message is listed. I tried this, but my webjob (running locally) never got triggered.

 

Any help on this would be greatly appreciated!

 

Thanks

I was able to create events on my IoT hub for device connected and device disconnected, and it sends them to a queue. I created a webjob that gets triggered by the queue and reads the id and the state from the message, but I only seem to get connected messages, not disconnected messages. I.e., if I unplug my IoT device, nothing happens, even if I wait 5+ minutes. When I plug it back in, I get a device connected message after a minute or so.

 

Any ideas what I might be doing wrong?  Is it something on my device client code?

Hi,

Can you share a snippet of the code? I am not sure how you would manage to send a "disconnected" message from your device to IoT Hub if the device is disconnected.

 

Thanks.

I don't send any messages from my device to the IoT hub. I want to monitor my devices from the cloud so I can send an alert email when one of them is no longer connected, hence the original webjob that queried the IoT hub for disconnected devices.

Since that is supposedly not the proper way, I created event subscriptions in IoT Hub for the device disconnected and device connected events that are available, and send those messages to a queue. I created a new webjob that listens on that queue and, in turn, sends an email to the appropriate address based on the device and the connection state in the event. The problem is that I almost never get disconnected events, only connected events. I ran this overnight and got many connected events and only one disconnected event. I would expect a one-to-one ratio, and really I should not have gotten any connect/disconnect events at all.

The goal is to monitor for disconnected devices and notify the user of that device that it is offline (or back online). I cleared out the queue this morning and already have 15 device connected events for 3 different IoT devices. This is problematic, as it will send false notifications for devices that have not disconnected.

 

I just want to know when a device goes offline, and when it comes back up so that I can notify users of the device in a timely manner and at the same time, not spam them with messages that don't offer any benefit.

 

Does it make sense what I'm trying to accomplish? The device twin query that ran every 5 minutes seemed to work better than this, except that a device would report offline once every 1 hour and 5 minutes for whatever reason, raising false alarms. The connected events are showing up far more frequently, and I don't even seem to get a disconnected event when I pull the network cable from a test device for 5 minutes.

Hi,

 

Like I mentioned above, you can't rely on the connectionState field (which looks like what you are using, right?).

If you don't want to follow the "Order device connection events from Azure IoT Hub using Azure Cosmos DB" tutorial, the other supported alternative is to implement the device heartbeat pattern:

 

"

If your IoT solution needs to know if a device is connected, you can implement the heartbeat pattern. In the heartbeat pattern, the device sends device-to-cloud messages at least once every fixed amount of time (for example, at least once every hour). Therefore, even if a device does not have any data to send, it still sends an empty device-to-cloud message (usually with a property that identifies it as a heartbeat). On the service side, the solution maintains a map with the last heartbeat received for each device. If the solution does not receive a heartbeat message within the expected time from the device, it assumes that there is a problem with the device.

 

A more complex implementation could include the information from Azure Monitor and Azure Resource Health to identify devices that are trying to connect or communicate but failing, check Monitor with diagnostics guide. When you implement the heartbeat pattern, make sure to check IoT Hub Quotas and Throttles.

"

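The service-side map described in the quoted heartbeat pattern could be sketched roughly as below. This is only an illustration under stated assumptions: `HeartbeatTracker`, `RecordHeartbeat`, and `FindOverdue` are hypothetical names, not part of any Azure SDK, and receiving the heartbeat messages from IoT Hub is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not an Azure SDK type): remembers the last heartbeat
// time per device and reports devices whose heartbeat is overdue.
public sealed class HeartbeatTracker
{
    private readonly Dictionary<string, DateTimeOffset> _lastSeen =
        new Dictionary<string, DateTimeOffset>();
    private readonly TimeSpan _maxSilence;

    public HeartbeatTracker(TimeSpan maxSilence)
    {
        _maxSilence = maxSilence;
    }

    // Call whenever a heartbeat message arrives from a device.
    public void RecordHeartbeat(string deviceId, DateTimeOffset receivedAt)
    {
        _lastSeen[deviceId] = receivedAt;
    }

    // Returns the devices whose most recent heartbeat is older than the
    // allowed silence, sorted by device id.
    public IReadOnlyList<string> FindOverdue(DateTimeOffset now)
    {
        return _lastSeen
            .Where(entry => now - entry.Value > _maxSilence)
            .Select(entry => entry.Key)
            .OrderBy(id => id, StringComparer.Ordinal)
            .ToList();
    }
}
```

A device that has never sent a heartbeat is unknown to this map, so a real solution would probably seed it from the device registry. The allowed silence also has to be longer than the heartbeat interval to tolerate a late message.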
 

Yes, you are just restating what is in the documentation. Doing a heartbeat is worthless, as it would require way too many messages just for a heartbeat (i.e., every 5 minutes). I don't want to know if a device is connected, I want to know when a device is disconnected. If devices are connected, everything is good. When a device disconnects, there is either a network issue where the device is located, or the code on the device crashed, or something else is keeping it from connecting to the cloud. That is the critical issue, and I want to notify the user of the device when this happens so they can confirm the device is offline and remedy the situation.

 

I tried again disconnecting my test device from the network, and did eventually receive a disconnected event.  I guess I will modify my webjob and event to only look at/get device disconnected events and send out an alert when that happens.  I don't want to be constantly sending out device is online messages when the device never actually goes offline.  

 

Again, if the IoT hub is sending a device connected event, shouldn't a device disconnected event have been sent first (or at least been sent at all)? This doesn't happen; I only see many device connected events come through the queue. I would expect a 1:1 ratio of connected/disconnected events. I also wouldn't expect multiple connected events for devices that are already connected. Seems like an IoT Hub bug to me?

This is not a bug and the two options I shared with you are the supported ones.

 

The connect/disconnect events will sometimes arrive in reverse order (mainly when the connect/disconnect of a device takes less than 3 seconds), and 1:1 ratios are not guaranteed when using the connectionState field. There is not much we can do about this because of the mechanics here: a device may reconnect to a different gateway, in which case the disconnect and connect events may race. One option is to check the timestamp of the connection state messages coming into IoT Hub, but unfortunately even that is not fully deterministic because of potential clock skew between the gateway VMs.

 

Thanks.
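To illustrate the timestamp check mentioned in this reply, here is a minimal sketch that keeps only the newest event per device and ignores anything that arrives late. `LatestConnectionState`, `Apply`, and `IsConnected` are hypothetical names, and the event time is assumed to have been parsed from the incoming message already. As the reply notes, clock skew between gateways means this reduces out-of-order updates rather than eliminating them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: tracks the most recent connection event per device,
// ordered by the timestamp carried in the event.
public sealed class LatestConnectionState
{
    private readonly Dictionary<string, (bool Connected, DateTimeOffset At)> _state =
        new Dictionary<string, (bool Connected, DateTimeOffset At)>();

    // Applies an event unless an event with the same or a newer timestamp
    // was already applied for this device. Returns true if state changed hands.
    public bool Apply(string deviceId, bool connected, DateTimeOffset eventTime)
    {
        if (_state.TryGetValue(deviceId, out var current) && current.At >= eventTime)
        {
            return false; // stale or duplicate event, ignore it
        }
        _state[deviceId] = (connected, eventTime);
        return true;
    }

    // Null means no event has been seen for this device yet.
    public bool? IsConnected(string deviceId)
    {
        return _state.TryGetValue(deviceId, out var current)
            ? current.Connected
            : (bool?)null;
    }
}
```

With this in place, a disconnect event that shows up after the matching reconnect but carries an older timestamp leaves the device marked as connected.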

 

 

I don't care much about disconnects of less than a few minutes. I'm more concerned about disconnects where the device doesn't come back and reconnect (i.e., 5 minutes or longer). Then I know there is some kind of issue. I guess I will just subscribe to the disconnect event and notify on that, and not worry about connection events.

A way you could implement a check on device connection would be to trigger a Function on the Device disconnected event that would invoke a direct method on the device with a timeout of 5 minutes. If the device comes back online in the 5 minutes, then the method will get answered. If not, you'll know the device has been disconnected for 5 minutes and can trigger the alert.

The problem with that is it will send many messages (i.e., running every 5 minutes, 24x7) and would not be very cost effective from a hosting standpoint.

Actually it won't. The function is only triggered once, when IoT Hub gets the Device disconnected event. It tries the method call with a 5-minute timeout, and if no one answers it sends out a notification that the device is disconnected. The Function doesn't loop.
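The flow described here, one probe per disconnected event and an alert only if the probe goes unanswered, might look something like the sketch below. It is an assumption-laden outline, not the actual Function: `DisconnectVerifier` and `HandleDisconnectedAsync` are hypothetical names, and the probe and alert are passed in as delegates so that the direct method call and the email sending stay outside the sketch.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: runs once per "device disconnected" event.
public sealed class DisconnectVerifier
{
    private readonly Func<string, TimeSpan, Task<bool>> _probeDevice;
    private readonly Func<string, Task> _sendAlert;
    private readonly TimeSpan _gracePeriod;

    // probeDevice: returns true if the device answered within the timeout
    //              (assumed to wrap a direct method call in a real solution).
    // sendAlert:   notifies the user that the device is offline.
    public DisconnectVerifier(
        Func<string, TimeSpan, Task<bool>> probeDevice,
        Func<string, Task> sendAlert,
        TimeSpan gracePeriod)
    {
        _probeDevice = probeDevice;
        _sendAlert = sendAlert;
        _gracePeriod = gracePeriod;
    }

    // Probes the device once; alerts only if nothing answers in time.
    // Returns true when an alert was sent. There is no polling loop.
    public async Task<bool> HandleDisconnectedAsync(string deviceId)
    {
        bool answered;
        try
        {
            answered = await _probeDevice(deviceId, _gracePeriod);
        }
        catch (Exception)
        {
            // Assumption: a failed probe counts the same as no answer.
            answered = false;
        }

        if (!answered)
        {
            await _sendAlert(deviceId);
        }
        return !answered;
    }
}
```

Because the handler only runs when a disconnected event arrives, the cost scales with the number of disconnects rather than with the number of devices or the polling interval.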

Ahh, yes. I was thinking I would have to ping all devices every 5 minutes. That makes a lot more sense.

 

Thank you for the suggestion...it seems to be the easiest and cleanest way to do that!

@asergaz Hi, this is off topic, but how can we change the connection state programmatically? And the status of a device?

@Nagnath hi. You can use the Azure IoT SDK and take a look at how to manipulate the device twin here.

 

Open a new post if you have further questions. Thanks :)!

Hi guys, if you'll allow me, let me revive this thread...
@asergaz if there is no guarantee when using connectionState, which field does give me that guarantee? Otherwise, what is the sense of having a Disconnected event if that event is never raised?
I am having the same issue mentioned by @smart_door: a lot of messages with state Connected and none showing the device as Disconnected.

Thanks a lot for the help!

@ederfdias you do receive the notifications, only they come often and aren't very reliable. I.e., for my system, I wanted to notify the user when the device was offline. If I just used that notification, it would send tons of notifications every time the device disconnected for any amount of time (or when the hub thought it did).

 

What I ended up implementing was a function that has a queue trigger on the hubconnectionqueue. When it triggers, it checks whether the message was "deviceconnected". If not (the device is disconnected), it tries to get a status from the device (a command on the device, like a ping). If the device doesn't respond within 5 minutes, it then sends out a device disconnected message to the user. This prevents many false alarms, since the device has to be offline for at least 5 minutes before an email is sent.

If the device is offline, then it sends the notification, and adds it to an offline list.

 

If the message is a deviceConnected message, then I check whether that device is in the disconnected device list (added above), and if it is, I remove it from the list and send out a device back online message. This way, a back online message is only sent for devices whose user was already told they were disconnected.

 

Hope that makes sense. It seems to do the job for me, as the other way was just not reliable at all.
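The offline list bookkeeping described in this post could be sketched along these lines. The names `OfflineDeviceList`, `MarkOffline`, `MarkConnected`, and the `Notification` enum are made up for illustration, and suppressing a second offline notification for a device that is already on the list is an added assumption rather than something stated in the post.

```csharp
using System;
using System.Collections.Generic;

// Which message, if any, should go out to the user.
public enum Notification { None, DeviceOffline, DeviceBackOnline }

// Hypothetical helper: remembers which devices the user has already been
// told are offline, so "back online" is only sent for those devices.
public sealed class OfflineDeviceList
{
    private readonly HashSet<string> _notifiedOffline = new HashSet<string>();

    // Call after a disconnected device failed to answer within the grace period.
    public Notification MarkOffline(string deviceId)
    {
        // Add returns false if the device was already on the list.
        return _notifiedOffline.Add(deviceId)
            ? Notification.DeviceOffline
            : Notification.None;
    }

    // Call for every "device connected" message from the queue.
    public Notification MarkConnected(string deviceId)
    {
        // Remove returns true only if the device was on the list.
        return _notifiedOffline.Remove(deviceId)
            ? Notification.DeviceBackOnline
            : Notification.None;
    }
}
```

Repeated connected messages for a device that never went on the list return `Notification.None`, which is what keeps the user from being spammed. The list lives in memory here, so a hosted function would need to persist it somewhere between invocations.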
