Exchange 2016 - Multiple Issues (IIS Sessions, WinRM PowerShell, Slow Performance)


We recently migrated from Exchange 2010 to Exchange 2016 and are experiencing issues with mobile devices, Outlook messages stuck in the Outbox, and overall slowness of the systems. I've followed the best-practice guides and have sized (oversized) the servers accordingly. I have had 3 tickets open with MS support for over 6 months, with little progress being made. They keep playing a shell game, passing me along to other teams. Our account rep has been notified, but here I am... Most of our equipment is older or outdated as we are in the middle of a refresh cycle; I'm well aware of this.


Environment:

  • VMware 5.5
    • Hosts have 2 sockets, 8 cores with 32 logical cores
    • Hosts have 132 GB of RAM
  • 6x Windows Server 2016 with Exchange 2016 CU19
    • OS and Exchange are up to date
    • 2 sockets, 20 cores
    • 32 GB of RAM
    • Paging file set to 32778 MB
    • Dedicated Drives for Exchange Logs and Databases
    • All in same site on same subnet
  • 3x Server 2012R2 Domain controllers in this site
    • 2 sockets, 4 cores, 8 logical processors
    • 8 GB of RAM
  • Roughly 8000 mailboxes but only 4500 warm bodies (people)
  • Single DAG, 6x databases, each with 2 replicas
    • DB1 replicates to 2 & 3, 2 replicates to 3 & 4 etc. 
  • RSA MFA Enabled on OWA and ECP for select accounts
  • Citrix NetScaler Load Balancer
    • Idle sessions disconnect/timeout/age off after 30 minutes
    • All sessions have a maximum lifespan of 59 minutes. 
  • Outlook configured in Cached Exchange Mode via GPO
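
To make the database copy layout above concrete, here is a minimal sketch of how I read "DB1 replicates to 2 & 3, 2 replicates to 3 & 4 etc." It assumes the numbers refer to servers and that the pattern wraps around at the sixth server; the wrap-around and the `Server1`..`Server6` names are assumptions for illustration, not something taken from our actual configuration.

```python
def copy_layout(servers: int = 6, copies_per_db: int = 2) -> dict[str, list[str]]:
    """Map each database to the servers assumed to hold its passive copies."""
    layout = {}
    for n in range(1, servers + 1):
        # Passive copies go to the next `copies_per_db` servers.
        # The modulo wrap-around (DB5 -> 6 and 1) is an assumption.
        targets = [((n - 1 + k) % servers) + 1 for k in range(1, copies_per_db + 1)]
        layout[f"DB{n}"] = [f"Server{t}" for t in targets]
    return layout


if __name__ == "__main__":
    for db, targets in copy_layout().items():
        print(db, "->", ", ".join(targets))
```

Under these assumptions every server hosts one active database and two passive copies, so load should be spread evenly across the six nodes.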

 

Issues/Symptoms:

  1. Overall sluggish operation while connected to the servers over RDP.
    1. Avg CPU utilization  <30%
    2. Avg Memory utilization <=85%
    3. Windows Explorer, EMS, Start menu all take longer than expected to load. 
  2. WinRM error messages randomly appear while using EMS (attachment 2)
    1. This can happen immediately after launching EMS or 2 minutes after executing commands successfully.
    2. If I attempt to reconnect the EMS shell via the 'Connect-ExchangeServer' command, the session connects.
  3. Reports of messages getting stuck in users' Outbox
  4. Reports of iPhones not being able to send mail (cannot connect to server) or blanking out and reloading the entire mailbox. This occurs randomly and then will automatically go back to working.
  5. Reports of iPhones not being able to download messages for short durations. This also automatically fixes itself after some time. 

 

Observations: 

  1. No reported performance issues from a hardware or OS level
    1. Daily performance reports show no signs of disk latency, CPU bottlenecking or memory issues.
    2. MS Support has also stated that these servers are "over spec" and should not be experiencing the reported issues.
  2. No latency issues on storage.
    1. FC IBM XIV SAN
  3. No signs of overallocation on the VMware hosts
  4. Long-running sessions in IIS for ActiveSync devices: older than 5 days, or dating back to when services were last recycled.
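
For anyone who wants to cross-check observation 4 against their own IIS logs, here is a rough sketch of the kind of filter that could be used. It assumes the logs are in the W3C extended format with the `cs-uri-stem` and `time-taken` fields enabled (`time-taken` is recorded in milliseconds), and it uses the 30-minute load balancer idle timeout as the threshold. The function name and the sample log lines are made up for illustration and are not taken from our servers.

```python
from typing import Iterable, Iterator

EAS_PATH = "/microsoft-server-activesync"


def slow_eas_requests(lines: Iterable[str], min_ms: int) -> Iterator[tuple[str, str, int]]:
    """Yield (date, time, time_taken_ms) for ActiveSync requests taking at least min_ms."""
    fields: list[str] = []
    for line in lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns of the data lines that follow.
            fields = line.split()[1:]
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split(" ")))
        if row.get("cs-uri-stem", "").lower() != EAS_PATH:
            continue
        try:
            taken = int(row.get("time-taken", ""))
        except ValueError:
            continue  # field missing or not numeric
        if taken >= min_ms:
            yield row.get("date", ""), row.get("time", ""), taken


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample = [
        "#Fields: date time cs-method cs-uri-stem sc-status time-taken",
        "2021-08-17 09:00:00 POST /Microsoft-Server-ActiveSync 200 1800000",
        "2021-08-17 09:01:00 POST /Microsoft-Server-ActiveSync 200 350",
        "2021-08-17 09:02:00 GET /owa 200 4000000",
    ]
    thirty_minutes_ms = 30 * 60 * 1000
    for date, time, taken in slow_eas_requests(sample, thirty_minutes_ms):
        print(date, time, taken)
```

This only shows how long individual requests took once they completed, so it is a complement to, not a replacement for, looking at the live sessions in IIS.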

 

Attachments: 

  1. HealthCheck--20210817092944.docx
    1. latest MS health check script output - sanitized
  2. Screenshot_1.png
    1. example of the WinRM Error message.  

 

I personally feel that it has something to do with the configuration of our network load balancer (Citrix), since those ActiveSync sessions are not timing out or aging off as they should. IIS is left at its default configuration, under which idle sessions age out after a few minutes. That is not the behavior we observe in our environment. According to our network team, the load balancer is configured correctly; I do not have direct access to these devices.

 

Any and all help would be greatly appreciated as MS has not been helpful thus far.
