Hey All,
We are on Exchange 2019 CU12 with the Aug 2022 SU, and Extended Protection is enabled using script version 22.09.20.1343. We are seeing a large number of Event ID 4002 and 24001 errors. We also had an incident today where Outlook users on one of the servers began experiencing sluggish responses. To get everyone back up ASAP, I activated all databases on the other servers, put the affected server into maintenance mode, rebooted it, and then took it out of maintenance mode. I then repeated the process in the other direction so that the second server was rebooted as well. The situation is OK for now, but we are still logging a large number of 4002 and 24001 events.
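For anyone who wants to put numbers on "a large number" of events, here is a rough Python sketch that tallies the two event IDs per hour from an event log exported to CSV. It assumes the export has `Id` and `TimeCreated` columns and a US-style timestamp format. Both are assumptions about the export, and the sample rows are made up purely for illustration.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Event IDs mentioned in this post.
WATCHED_IDS = {"4002", "24001"}


def tally_events(csv_text, time_format="%m/%d/%Y %I:%M:%S %p"):
    """Count watched event IDs, grouped by (hour, event id).

    Assumes columns named 'Id' and 'TimeCreated'; adjust the names and
    time_format to match your own export.
    """
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Id"] not in WATCHED_IDS:
            continue
        stamp = datetime.strptime(row["TimeCreated"], time_format)
        hour = stamp.replace(minute=0, second=0)  # truncate to the hour
        counts[(hour, row["Id"])] += 1
    return counts


# Made-up sample rows, not real log data.
SAMPLE = """Id,TimeCreated
4002,10/05/2022 09:15:02 AM
4002,10/05/2022 09:47:10 AM
24001,10/05/2022 09:50:00 AM
1003,10/05/2022 09:51:00 AM
4002,10/05/2022 10:01:00 AM
"""

if __name__ == "__main__":
    for (hour, event_id), n in sorted(tally_events(SAMPLE).items()):
        print(hour.strftime("%Y-%m-%d %H:00"), event_id, n)
```

Comparing the hourly counts before and during a sluggish period should show whether the event volume actually rises with the incident or is just constant background noise.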
Upon further troubleshooting with Health Sets and Probes, we are getting fairly consistent reports of the Outlook and Outlook.Protocol health sets being in a degraded state. For the Outlook health set, the probe 'OutlookRpcCtpMonitor' is in a degraded state. For Outlook.Protocol, the well-known probes OutlookRpcDeepTestMonitor and OutlookRpcSelfTestMonitor are in an 'unhealthy' state as well. Other health sets that go in and out of the Healthy state are OutlookMapiHttp and Compliance.
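To keep track of which health sets are flapping on which server, a small Python sketch like the one below can filter a health report that has been exported to CSV. The column names `Server`, `HealthSet` and `AlertValue` are assumed to match the export, and the server names EX01/EX02 and the sample rows are hypothetical.

```python
import csv
import io

# States treated as "needs attention" in this sketch.
BAD_STATES = {"Degraded", "Unhealthy"}


def unhealthy_sets(csv_text):
    """Return {server: [(health_set, state), ...]} for non-healthy sets.

    Assumes columns named 'Server', 'HealthSet' and 'AlertValue'.
    """
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["AlertValue"] in BAD_STATES:
            entry = (row["HealthSet"], row["AlertValue"])
            result.setdefault(row["Server"], []).append(entry)
    return result


# Hypothetical sample rows using the health set names from this post.
SAMPLE = """Server,HealthSet,AlertValue
EX01,Outlook,Degraded
EX01,Outlook.Protocol,Unhealthy
EX01,OutlookMapiHttp,Healthy
EX02,Compliance,Healthy
EX02,Outlook.Protocol,Unhealthy
"""

if __name__ == "__main__":
    for server, sets in unhealthy_sets(SAMPLE).items():
        for health_set, state in sets:
            print(server, health_set, state)
```

Running this against exports taken at intervals gives a simple record of which sets stay degraded and which only dip briefly.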
I know now that if we experience another incident, I will recycle the 'MSExchangeRpcProxyAppPool' and 'MSExchangeRpcProxyFrontEndAppPool' app pools first, as I believe this would have cleared up our sluggishness.
Any direction on where to go from here will be helpful. If we do experience another incident, I will more than likely roll back the Extended Protection config with the script and will report back to all.
Appreciate any help or direction in advance.
Kind Regards,
Alex Levin
October 5th update.
So, roughly one week later, almost to the day, one of our Exchange servers started responding sluggishly again with the same symptoms. This time we were able to gather more data before remediating the server. We noticed two IIS worker processes (w3wp.exe) using 25 GB and 26 GB of memory respectively. We first recycled both RPC app pools; when that did not help, I recycled all app pools. No change. I performed an IISReset: no change. I then restarted the MSExchange RPC service: still no change. I was able to grab a dump of one of the misbehaving w3wp.exe processes. At this point it looks like a memory leak in these processes. We are analyzing the dump file, and I have opened a ticket with Microsoft to get more eyes on this. I will update this post as we move along.
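To catch the memory growth earlier next time, a minimal Python sketch like the following can flag oversized worker processes from the CSV output of `tasklist /FI "IMAGENAME eq w3wp.exe" /FO CSV`. The 10 GB threshold is an arbitrary assumption, the sample rows and PIDs are made up, and the "Mem Usage" column reports working set, which may differ from the figure shown in other tools.

```python
import csv
import io


def parse_mem_kb(text):
    """Convert a tasklist 'Mem Usage' value such as '26,214,400 K' to KB.

    Keeps only the digits, so it tolerates either ',' or '.' as the
    thousands separator.
    """
    return int("".join(ch for ch in text if ch.isdigit()))


def large_workers(tasklist_csv, threshold_gb=10.0):
    """Return [(pid, memory_gb), ...] for processes at or above the threshold."""
    flagged = []
    for row in csv.DictReader(io.StringIO(tasklist_csv)):
        mem_gb = parse_mem_kb(row["Mem Usage"]) / (1024 * 1024)
        if mem_gb >= threshold_gb:
            flagged.append((int(row["PID"]), round(mem_gb, 1)))
    return flagged


# Made-up sample in tasklist CSV layout; PIDs and sizes are illustrative.
SAMPLE = '''"Image Name","PID","Session Name","Session#","Mem Usage"
"w3wp.exe","4312","Services","0","26,214,400 K"
"w3wp.exe","5120","Services","0","27,262,976 K"
"w3wp.exe","6004","Services","0","512,000 K"
'''

if __name__ == "__main__":
    print(large_workers(SAMPLE))  # [(4312, 25.0), (5120, 26.0)]
```

Run on a schedule, something like this would show whether the w3wp.exe processes grow steadily over the week or jump suddenly, which is useful context for the dump analysis.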
Regards,
Alex Levin