Forum Discussion
HCI Stack 23H2 WMI - RPC Server is Unavailable
After updating to Solution 10.2405.2.7, I am unable to manage my cluster through Failover Cluster Manager or run cluster validation; both fail with the error "RPC Server is unavailable". I have restarted the nodes, confirmed that the firewall rules allow WMI and RPC, and verified DNS. Azure and WAC report no issues with the cluster, but I am worried that this WMI issue will cause problems when I apply the next solution update.
Any guidance on resolving this fault would be much appreciated.
- FrankKeunen (Copper Contributor)
We are encountering the same issue with a two-node Azure Stack HCI cluster (Dell MC-760) of one of our customers (same version). We are collaborating with Dell and Microsoft to resolve it.
"
That error is straight out of COM code inspection of firewall behavior. Firewall (or something else?) is preventing the automatically generated DCOM code in our service from opening a TCP port. It's not even reaching our custom code, it's in DCOM machinery.
Let me set the expectation that the investigation may take time, as it becomes a bit complicated. There will be some modification to the RPCSS process, and iDNA tracing may be required along with network stack debugging (depending on how far we need to go). Therefore, it won't be a straightforward solution.
"
Once I have an update, I will provide you with the details.
- DanielF1395 (Copper Contributor)
I was going to update today; thank you for the reminder.
Please find the fix from MS Support below:
This has been a known issue since the latest update; a firewall rule appears to be causing it. The workaround is to run the following on all nodes:
Disable-NetFirewallRule AzsHci-ImdsAttestation-Block-TCP-In
- FrankKeunen (Copper Contributor)
That's correct. We identified the same fix during a troubleshooting session last night. We enabled Drift Control, which allowed us to disable the Windows Firewall on one of the nodes. Once the firewall was disabled, we were able to perform a CAU scan on the affected node.
We also compared the firewall ruleset with that of a working cluster running an older version of Azure Stack HCI and disabled the new rules that were added with the latest release from Microsoft. We concluded that the issue was indeed caused by the firewall rule "AzsHci-ImdsAttestation-Block-TCP-In".
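For anyone who wants to repeat the ruleset comparison and then apply the workaround on every node, here is a rough sketch. It is not what Microsoft or Dell ran. The function names are made up for illustration, and it assumes that PowerShell remoting (WinRM) to the nodes still works and that Get-ClusterNode still responds on the affected cluster. Neither is confirmed in this thread.

```powershell
# Hypothetical helper: names of the enabled firewall rules on one node.
# Assumes PowerShell remoting to the node is available.
function Get-EnabledRuleName {
    param([string]$ComputerName)
    Invoke-Command -ComputerName $ComputerName -ScriptBlock {
        (Get-NetFirewallRule -Enabled True).Name
    }
}

# Hypothetical helper: rule names present in $Candidate but not in $Baseline.
# Pure string comparison, so it can be tried without a cluster.
function Get-AddedFirewallRuleName {
    param(
        [string[]]$Baseline,
        [string[]]$Candidate
    )
    $Candidate | Where-Object { $_ -notin $Baseline } | Sort-Object -Unique
}

# Hypothetical helper: applies the workaround from this thread on every node
# and reports the resulting rule state. Assumes Get-ClusterNode still works.
function Disable-ImdsAttestationRule {
    $nodes = (Get-ClusterNode).Name
    Invoke-Command -ComputerName $nodes -ScriptBlock {
        Disable-NetFirewallRule AzsHci-ImdsAttestation-Block-TCP-In
        Get-NetFirewallRule AzsHci-ImdsAttestation-Block-TCP-In |
            Select-Object Name, Enabled, Direction, Action
    }
}
```

Example usage, with placeholder node names:

```powershell
$old = Get-EnabledRuleName -ComputerName 'working-node'
$new = Get-EnabledRuleName -ComputerName 'affected-node'
Get-AddedFirewallRuleName -Baseline $old -Candidate $new
```

Rules that show up only on the updated node are the candidates to disable one at a time, which is how we narrowed it down to the single rule.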