What happens to Cluster shared volumes during network outage?
Mark_Albin, thank you for your response and for confirming my thoughts. In the last line you mention a redundant FC network/fabric. How would this help without having the extra Ethernet network? Even a second FC fabric would rely on the network communication being available, correct?
Currently the CSV volumes have 4 paths to each host.
Gabrie - we experienced the same issue last week when a network ARP event took one of my Hyper-V clusters down. We had data corruption because the CSV control information could not be communicated over Ethernet. The Fibre Channel was fine, and we do have redundant FC (Alpha/Omega) networks.
Did you ever deploy the out-of-band set of switches in your environment? If so, which networks are you routing across them: Management (538), Live Migration (3807), or Cluster Network (3808)? It can't be the Cluster Network, so would one of the others even prevent the underlying issue?
Any advice you can provide would be very much appreciated.
Nancy Freeman
- Taysol, Jan 18, 2024, Copper Contributor
This post is nearly a year old. I suggest you create a new discussion in order to increase visibility.
I'm going to make the following assumptions: your use of Alpha/Omega is what is usually called an A/B fabric, and the Management (538), Live Migration (3807), and Cluster Network (3808) networks you mention are VLANs that you have defined within your environment.
I do not see a Cluster Storage Network in your list. You may be depending upon cluster storage's failover behavior for this. The Cluster Storage network will use the Cluster Network if the Cluster Storage Network is unavailable.
In my humble opinion, Cluster Storage in Hyper-V using Fibre Channel is not ready for prime time. Essentially, it is a non-clustering filesystem that achieves clustering by coordinating writes through an out-of-band communication channel (the Cluster Storage Network). This leaves you with problems if EITHER the FC or the IP network has a brief outage. If your storage were over IP, this would make sense. It does not make sense for Fibre Channel. An actual clustering filesystem will have a mechanism for coordination over the same pipe as the storage.
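To make the "EITHER network" point concrete, here is a minimal Python sketch. It is a toy model of the argument above, not of how CSV is actually implemented: the names `PathState`, `out_of_band_write_ok`, and `in_band_write_ok` are made up for illustration, and the timeline is invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathState:
    fc_up: bool  # Fibre Channel data path is reachable
    ip_up: bool  # IP network carrying coordination traffic is reachable


def out_of_band_write_ok(state: PathState) -> bool:
    """Toy model: coordination travels over IP, data over FC, so both are needed."""
    return state.fc_up and state.ip_up


def in_band_write_ok(state: PathState) -> bool:
    """Toy model of a filesystem that coordinates over the storage path itself."""
    return state.fc_up


# Invented timeline: healthy, brief Ethernet outage, brief FC outage, healthy.
timeline = [
    PathState(fc_up=True, ip_up=True),
    PathState(fc_up=True, ip_up=False),
    PathState(fc_up=False, ip_up=True),
    PathState(fc_up=True, ip_up=True),
]

out_of_band = [out_of_band_write_ok(s) for s in timeline]
in_band = [in_band_write_ok(s) for s in timeline]

print(out_of_band)  # [True, False, False, True]
print(in_band)      # [True, True, False, True]
```

In this model the out-of-band design is unavailable during an outage of either network, while the in-band design is only affected by the FC outage.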
If you are using Hyper-V to support enterprise workloads or systems that are intolerant of storage fragility, do not use CSVs. You still have the option of using traditional storage or SMBv3.
In the environments I support, I place systems that do not require live migration (such as Domain Controllers, clustered app servers, clustered db servers, etc.) on traditional storage and have the rest use SMBv3 from a SAN/NAS.
You can certainly take steps to mitigate these types of issues by using a completely separate network on separate physical NICs, but careful architecture is required to make sure you do not simply move the problem from one place to another. A short outage on your isolated network will have the same impact as a short outage on your non-isolated network. My recommendation would be to use both.
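The reasoning behind "use both" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the network names `isolated` and `shared` are hypothetical, and the model simply assumes that coordination traffic can use any network that is currently up.

```python
from typing import Mapping


def coordination_available(networks: Mapping[str, bool]) -> bool:
    """Assumed model: coordination works if at least one candidate network is up."""
    return any(networks.values())


# Hypothetical scenarios; True means the network is up.
scenarios = {
    "isolated only, isolated down": {"isolated": False},
    "both, isolated down": {"isolated": False, "shared": True},
    "both, shared down": {"isolated": True, "shared": False},
    "both, both down": {"isolated": False, "shared": False},
}

results = {name: coordination_available(nets) for name, nets in scenarios.items()}

for name, ok in results.items():
    print(f"{name}: {'available' if ok else 'unavailable'}")
```

With a single isolated network, one short outage is enough to lose coordination. With both networks eligible, coordination is lost only if they fail at the same time, which is why the two should not share switches or NICs.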