Forum Discussion
What happens to Cluster Shared Volumes during a network outage?
Based on your description, the network outage caused the Hyper-V hosts to lose connectivity to the CSV volumes, which stopped the VMs and interrupted the cluster services. CSV is a shared storage resource that depends on the cluster network: the nodes use it to coordinate access with the node that owns each volume (the coordinator), even when the disks themselves are reached over a separate storage fabric.
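Your understanding of the SameSubnetDelay and SameSubnetThreshold settings is correct. SameSubnetDelay is the interval between heartbeats sent to nodes in the same subnet, and SameSubnetThreshold is the number of consecutive heartbeats that can be missed before a node is treated as unavailable. Together they determine how long the cluster waits before taking action. In a complete network blackout these settings only postpone the reaction: once the threshold has been exceeded on every cluster network, the nodes can no longer see each other, and any node that cannot maintain quorum stops its cluster service.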
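As a rough worked example of how the two settings combine: the time a node can stay silent before it is treated as down is approximately the heartbeat interval multiplied by the number of tolerated misses. The sketch below is plain Python arithmetic, not a cluster API. The function name and the sample values (1000 ms, 10 and 5 missed heartbeats) are assumptions for illustration, so substitute the values configured on your own cluster.

```python
def heartbeat_tolerance_seconds(same_subnet_delay_ms: int,
                                same_subnet_threshold: int) -> float:
    """Approximate time a node may miss heartbeats before it is treated as down."""
    return same_subnet_delay_ms * same_subnet_threshold / 1000.0


# Sample values only (assumed); read the real ones from your cluster.
print(heartbeat_tolerance_seconds(1000, 10))  # 10.0
print(heartbeat_tolerance_seconds(1000, 5))   # 5.0
```

Any maintenance window that cuts all cluster networks for longer than this figure should be expected to trigger a cluster reaction.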
Regarding the CSV volumes, the volumes that were not lost were probably owned by hosts that were still up and running. When a host loses connectivity to a CSV volume, it drops the volume, and ownership of the volume is transferred to another host in the cluster. If the host that owns a CSV volume loses network connectivity, ownership is not transferred to another host until SameSubnetThreshold has been exceeded.
To prevent this from happening again during network maintenance, creating a separate network for heartbeat communication is a good idea. It adds a layer of redundancy and helps the cluster stay online when the other network is down. You may also want to consider a redundant storage network, such as a separate FC fabric, to give the CSV volumes additional resilience.
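The benefit of a second heartbeat network can be sketched as a toy model: a node is only treated as unreachable when heartbeats are missed on every cluster network. The network names, the counters, and the function below are hypothetical and only illustrate the reasoning. This is not how the cluster service is actually implemented.

```python
from typing import Dict


def node_reachable(missed_by_network: Dict[str, int], threshold: int) -> bool:
    """True if at least one network is still below the missed-heartbeat threshold."""
    return any(missed < threshold for missed in missed_by_network.values())


threshold = 10  # assumed SameSubnetThreshold value

# Only one network exists and it is down for maintenance: the node looks dead.
print(node_reachable({"production": 10}, threshold))  # False

# Same outage, but a dedicated heartbeat network is still delivering heartbeats.
print(node_reachable({"production": 10, "heartbeat": 0}, threshold))  # True
```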
Mark_Albin Thank you for your response and for confirming my thoughts. In the last line you mention a redundant FC network/fabric. How would this help without the extra Ethernet network? Even a second FC fabric would still rely on network communication being available, correct?
Currently the CSV volumes have 4 paths to each host.