Deploying Windows Server 2025 Clusters with Edge Networking Solutions Part 2: How Network HUD keeps networking operations healthy and running smoothly
Welcome to the second blog in our Networking Deployment Series for Windows Server 2025. In this deployment series, we take a look at Contoso Medical Center’s journey deploying and harnessing the power of Windows Server 2025 Software Defined Datacenter (SDDC) to build a next-generation environment for your VMs and applications. In the first blog, we deployed Contoso Medical Center’s host networking using the uniform, automated and scalable solution offered by Network ATC. With host networking in place, the next challenge is ensuring everything runs as intended, day in and day out. This is where Network HUD comes in, providing real-time visibility and proactive diagnostics to keep the network healthy, aligned, and optimized.
Note: Network HUD is currently in private preview on Windows Server 2025. When it reaches Public Preview and General Availability (GA), users will need their Windows Server machines to be Arc-enabled. In addition, they will need to either (1) attest to Software Assurance or (2) have a Pay-as-you-go subscription to successfully enable and manage Network HUD.
With your host networking now deployed, Network HUD steps in to catch networking health, diagnostic and operational issues. Network HUD proactively identifies and remediates operational networking issues on your Windows Server 2025 cluster. Running and maintaining a network for your business applications is a hard job. Ensuring a workload is stable and optimized requires coordination across the physical network (switch, cabling, NIC), the host operating system (e.g., virtual switch, virtual NICs, etc.), and of course the application that runs inside the VMs or containers. Each of those has its own configuration and capabilities, and may be managed by a different team. Even if you’ve perfectly implemented your “golden configuration”, your environment may still experience the ripple effect of a bad configuration from another part of your network that degrades your application performance.
Installing Network HUD through an Azure Arc extension is fast, easy, and efficient. With just a few clicks in the Azure Portal, you can enable powerful network health monitoring across your cluster—no manual setup, complex scripts, or extra tools required. It integrates seamlessly into your environment, letting you start monitoring in minutes.
Network HUD is cluster aware. Network HUD understands how you intend to use your adapters and as a result can manage the stability across the cluster. Imagine Node1 in your cluster has an unstable adapter. Without informing the other nodes of the issue, the healthy nodes could overwhelm Node1 and cause a larger issue (e.g., cluster crashes or Storage Spaces Direct rebuilds).
To address this, Network HUD works in tandem with Network ATC. When Network HUD identifies instability on one node, it informs Network ATC, which can then manage the cluster-wide configuration and ensure that the healthy nodes do not overload the degraded node.
Network HUD integrates with the physical network. Network HUD takes advantage of capabilities in the physical switch to ensure that your configuration matches what’s on the physical network. For example, we can determine whether the locally connected switchports have the correct data center bridging configuration required for RDMA storage traffic to function (and as previously mentioned, we know which switchports to look at because the adapters are part of a Network ATC storage intent).
To ensure Network HUD can validate the physical network, make sure the switches connected to your cluster nodes are supported with the necessary capabilities: https://learn.microsoft.com/en-us/azure-stack/hci/concepts/physical-network-requirements?tabs=overview%2C23H2reqs.
Network HUD can handle multiple scenarios, like the ones mentioned in the examples above. Here’s a brief description of each:
- Failed Network ATC Intent
Ensures that all Network ATC intents are successfully provisioned. Flags a health fault if an intent fails, because a failed intent can lead to incomplete setups, inaccurate configurations and unwanted system drift.
- Driver Consistency, Age and Stability
Network HUD checks that all network cards (called NICs, short for Network Interface Cards) in your servers are using the same version of their software, known as drivers. These drivers help the NICs talk to Windows and send data over the network. If the drivers are too old or mismatched, it can cause slow performance or connection problems, so Network HUD warns you if a driver is over 2 years old and flags a problem if it’s over 3 years old.
- LLDP Operation Status
Validates whether the Link Layer Discovery Protocol (LLDP) is running properly, which is critical for detecting fabric misconfigurations. A non-operational LLDP service prevents other Network HUD scenarios from functioning accurately.
- Misconfigured VLANs
Detects inconsistencies in VLAN advertisements across NICs, switches, and hosts. Ensures that VLANs required for management, compute, and storage traffic are consistently available and properly configured.
Let’s say you have VLAN 710 configured on your VMs, and everything looks good on the host side. But if the switch connected to one of your nodes isn’t actually advertising VLAN 710 on the right port, your workloads could suddenly lose connectivity. This kind of mismatch is easy to miss manually, but Network HUD catches it instantly. It reads the switch information using LLDP packets, compares it with what’s expected from Network ATC on the host, and alerts you the moment something doesn’t line up, so you can fix it before your users ever notice.
- Inconsistent PFC Configuration
Checks Priority Flow Control (PFC) settings between the host and top-of-rack switches. Flags mismatches or missing PFC priorities that could lead to traffic congestion, packet loss, or storage crashes.
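Three of the scenarios above (driver age and consistency, VLAN availability, and PFC alignment) boil down to comparing what the host expects with what is actually observed. The Python sketch below illustrates only that comparison logic. It is not how Network HUD is implemented, and every name in it (`Nic`, `driver_age_status`, `drivers_consistent`, `missing_vlans`, `pfc_mismatches`), the status labels, and the sample values are hypothetical. The 2-year and 3-year thresholds are the ones quoted in the driver scenario.

```python
from dataclasses import dataclass
from datetime import date

# Thresholds quoted in the driver scenario: warn past 2 years, flag past 3.
WARN_YEARS = 2
FAULT_YEARS = 3


@dataclass(frozen=True)
class Nic:
    """Hypothetical record for one adapter; not a Network HUD data structure."""
    name: str
    driver_version: str
    driver_date: date


def driver_age_status(nic: Nic, today: date) -> str:
    """Classify a driver as 'ok', 'warning' or 'fault' by its age in years."""
    age_years = (today - nic.driver_date).days / 365.25
    if age_years > FAULT_YEARS:
        return "fault"
    if age_years > WARN_YEARS:
        return "warning"
    return "ok"


def drivers_consistent(nics: list[Nic]) -> bool:
    """True when every adapter reports the same driver version."""
    return len({nic.driver_version for nic in nics}) <= 1


def missing_vlans(expected: set[int], advertised: set[int]) -> set[int]:
    """VLANs the host expects that the switchport does not advertise."""
    return expected - advertised


def pfc_mismatches(host: set[int], switch: set[int]) -> set[int]:
    """Priorities that have PFC enabled on only one side of the link."""
    return host ^ switch


if __name__ == "__main__":
    today = date(2025, 6, 1)  # fixed date so the example is reproducible
    nics = [
        Nic("pNIC1", "1.2.3", date(2024, 9, 1)),
        Nic("pNIC2", "1.2.3", date(2022, 12, 1)),
        Nic("pNIC3", "1.1.0", date(2021, 1, 15)),
    ]
    for nic in nics:
        print(nic.name, driver_age_status(nic, today))
    print("drivers consistent:", drivers_consistent(nics))
    # Made-up sample: host expects VLANs 710 and 711, switch advertises 711 and 712.
    print("missing VLANs:", missing_vlans({710, 711}, {711, 712}))
    # Made-up sample: host enables PFC on priority 3, switch on priorities 3 and 4.
    print("PFC mismatches:", pfc_mismatches({3}, {3, 4}))
```

In a real deployment the inputs would come from the adapters, the Network ATC intents and the LLDP advertisements rather than from hard-coded values.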
Here’s a demonstration of what the end-to-end installation, enablement and health fault alerting for Network HUD looks like on a standard Windows Server 2025 cluster: https://youtu.be/hW47R5Knu2k.
We are very keen to receive customer feedback on Network HUD and all its scenarios. To try out Network HUD on your Windows Server 2025 cluster and participate in our private preview, please reach out to: edgenetfeedback@microsoft.com. If there are additional operational or diagnostic scenarios that you think Network HUD should alert you to, please let us know!