We have been running a hybrid Exchange setup (EX2013/365) for quite some time without problems. The only components remaining on EX2013 are our public folders and the odd few mailboxes we've not needed to migrate to 365.
We have 2 EX2013 servers in a DAG: one has the PF MBX database mounted, whilst the second has the rest of the MBX databases.
Since yesterday, however, we intermittently lose connection to the on-prem boxes via Outlook.
If you try to go into public folders, for example, it will just hang.
In the Outlook connection status pane, if you hit reconnect, the 365 components connect back fine but the public folders stay stuck on "Disconnecting". There are 2 GUIDs for public folders in this pane (one being the root public folder; I can't check what the second GUID is), and in some instances one would be stuck disconnecting while the other would be stuck connecting (after hitting reconnect).
In the IIS logs (inetpub) we noticed a few 401 errors for Autodiscover, but there is no other clear indicator in the event logs as to what is going on.
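In case it helps anyone look at the same thing, this is roughly how the 401s could be tallied per hour to see whether they line up with the hangs. It is only a minimal sketch: it assumes the IIS logs are in the default W3C format with a `#Fields:` header, and the function name `count_401s` and the `/autodiscover` path filter are my own placeholders, not anything Exchange provides. Some 401s are expected as part of the normal authentication challenge, so the trend over time is probably more telling than the raw count.

```python
import sys
from collections import Counter


def count_401s(lines, path_fragment="/autodiscover", status="401"):
    """Count requests with the given status per hour from W3C-format IIS log lines.

    Assumes the log contains a '#Fields:' directive naming the columns
    (date, time, cs-uri-stem, sc-status, ...), as IIS writes by default.
    """
    fields = []
    per_hour = Counter()
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # Column names can change mid-file, so re-read them every time.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        values = line.split()
        if len(values) != len(fields):
            continue  # skip truncated or malformed rows
        row = dict(zip(fields, values))
        if row.get("sc-status") != status:
            continue
        if path_fragment not in row.get("cs-uri-stem", "").lower():
            continue
        # Bucket by date and hour, e.g. "2024-01-01 10:00".
        bucket = f"{row.get('date', '')} {row.get('time', '')[:2]}:00"
        per_hour[bucket] += 1
    return per_hour


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_401s.py <path to an IIS log file>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for bucket, count in sorted(count_401s(handle).items()):
            print(bucket, count)
```

Running it against each day's log file on both servers would show whether the 401 count jumps around the times Outlook drops.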
The more frustrating point is that when you close Outlook and re-open it, the on-prem mailboxes are accessible for a short while before hanging again at some later point.
We recycled the app pools via IIS on the server hosting the PFs, then rebooted both servers.
To add to the pain of not knowing what the hell is going on, we now have 77 health mailboxes. We ran Get-ServerHealth and a few of the services are reporting as unhealthy, including:
ActiveSync, ActiveSync.Protocol, Autodiscover, Autodiscover.Protocol, EWS, EWS.Proxy, OutlookMapiHttp.Proxy, OAB.Proxy.
When the crash occurs, we see audit logs generated on the Exchange servers stating that the users have logged off.
We have not made any networking changes, installed any updates, or had any certificates expire; we have literally changed nothing infrastructure-wise that could have caused this behaviour.
Any suggestions of what else to check or what could be the issue?