Forum Discussion
AviGrinberg
Mar 16, 2024 Copper Contributor
SQL server 2019 alwayson problem
Hi, I have a SQL Server 2019 Standard Always On setup with 2 nodes and a file share witness (located on a third server). Yesterday we had a network issue: the main switch was down for 15 minutes. After the...
SivertSolem
Apr 03, 2024 Iron Contributor
We have experienced similar faults when the network is unstable.
From what I understand, the failover cluster services on both nodes end up in a state where they believe their copy of the cluster configuration is corrupt, leading to a failure state where the servers are both participating in a failover cluster and not. In this scenario no node can host the core cluster resources and everything's broken.
Once a critical failure of the cluster service has occurred, the availability group configuration can and will be dropped from SQL Server as well.
I have not had any luck attempting to recover the CLUSDB, and have in these cases resorted to reinstalling failover clustering services and reconfiguring the entire cluster and availability groups.
I believe part of the problem in this cluster setup is that the file share witness does not keep a copy of the cluster configuration. All it does is contribute a vote toward quorum, which decides which side of a split is allowed to stay online.
In any case it's technically not a SQL Server issue, but a failover clustering services issue.
I have been wondering if replacing the file share witness with a third cluster node (without SQL Server) would make the cluster configuration more robust.
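To make the voting part concrete, here is a rough Python sketch of the majority arithmetic as I understand it. It is a simplified static model, not how the cluster service is actually implemented: it ignores dynamic quorum and dynamic witness, and the names (`has_quorum`, `surviving_partitions`, `node1`, `node2`, `witness`) are made up for illustration.

```python
def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """A partition keeps quorum only with a strict majority of all votes."""
    return reachable_votes > total_votes // 2


def surviving_partitions(partitions, total_votes):
    """Return the partitions that still hold a majority of the votes."""
    return [p for p in partitions if has_quorum(len(p), total_votes)]


# Two nodes plus a file share witness: three votes in total (assumed layout).
TOTAL_VOTES = 3

# Switch outage where no voter can reach any other voter.
isolated = [("node1",), ("node2",), ("witness",)]

# Partial outage where node1 can still reach the witness.
partial = [("node1", "witness"), ("node2",)]

print(surviving_partitions(isolated, TOTAL_VOTES))
print(surviving_partitions(partial, TOTAL_VOTES))
```

In this model a full switch outage leaves every voter with one vote out of three, so nothing keeps quorum. Swapping the witness for a third node does not change that arithmetic, since the total is still three votes. The difference would be that a full node holds its own copy of the cluster configuration, which is the part I suspect matters for recovery.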