SOLVED

Cluster issues with Exchange 2016


Hi,

 

We recently rebuilt one of our Exchange servers and have come across an issue with Windows Failover Clustering, rather than with the Exchange side of things. Once the server had been rebuilt, we added that node back into the DAG via the Exchange console. We then re-seeded the passive database copies. All of that worked okay, but we get failures when we test the replication health.

 

It looks like the process added the clustering service without telling us that it was waiting for a server restart to complete, so we never restarted. I suspect that is why Windows Failover Clustering only shows a single node. When I attempt to add the newly built node to that cluster, it fails, stating that the node is already part of the cluster.

 

Running the following command shows:

 

cluster /cluster:DAG02 /add /node:SERVER1

 

Configuring node SERVER1
---------------------------------------
12% Validating cluster state on node SERVER1.This phase encountered an error for Cluster object 'Node SERVER1 appears to be a member of a cluster. It is either a member of an existing cluster or the node was not cleaned up after being evicted from a cluster. If you are sure this is not a member of a cluster run the Remove-ClusterNode cmdlet with the -Force parameter to clean up the cluster information from the node and then try to add it to the cluster again.' but will continue. The error status is 5065 (0x000013C9).
This phase has failed for Cluster object 'SERVER1' with an error status of 5065 (0x000013C9).
This phase has failed for Cluster object 'SERVER1' with an error status of 5065 (0x000013C9).
Cleaning up SERVER1.

System error 5065 has occurred (0x000013c9).
The cluster node is already a member of the cluster.

 

cluster node
Listing status for all available nodes:

Node           Node ID Status
-------------- ------- ---------------------
SERVER2              2 Up

 

Checking the database copy status on SERVER1:

 

Get-MailboxDatabaseCopyStatus -Server SERVER1

Name                Status   CopyQueueLength  ReplayQueueLength  LastInspectedLogTime  ContentIndexState
----                ------   ---------------  -----------------  --------------------  -----------------
EDB AC 01\SERVER1   Healthy  0                0                  16/03/2021 09:50:05   Healthy
EDB DG 01\SERVER1   Healthy  0                0                  16/03/2021 09:50:21   Healthy
EDB HJ 01\SERVER1   Healthy  0                0                  16/03/2021 09:49:47   Healthy
EDB KM 01\SERVER1   Healthy  0                0                  16/03/2021 09:49:11   Healthy
EDB NR 01\SERVER1   Healthy  0                0                  16/03/2021 09:47:09   Healthy
EDB SZ 01\SERVER1   Healthy  0                0                  16/03/2021 09:49:48   Healthy

 

And on SERVER2:

 

Get-MailboxDatabaseCopyStatus -Server SERVER2

Name                Status   CopyQueueLength  ReplayQueueLength  LastInspectedLogTime  ContentIndexState
----                ------   ---------------  -----------------  --------------------  -----------------
EDB DG 01\SERVER2   Mounted  0                0                                        Healthy
EDB AC 01\SERVER2   Mounted  0                0                                        Healthy
EDB HJ 01\SERVER2   Mounted  0                0                                        Healthy
EDB KM 01\SERVER2   Mounted  0                0                                        Healthy
EDB NR 01\SERVER2   Mounted  0                0                                        Healthy
EDB SZ 01\SERVER2   Mounted  0                0                                        Healthy

 

I'm not sure how to proceed here.

 

I don't know whether it would be safe to run the suggested command, "Remove-ClusterNode SERVER1 -Force", to clean up the metadata and then attempt to re-join it to the failover cluster, without upsetting anything else on the Exchange side.

1 Reply
best response confirmed by HowardGyton (Copper Contributor)
Solution
The answer was that, for whatever reason, the node was not automatically added back into the Failover Cluster. When I tried to add it to the Failover Cluster manually and was told the node was already part of a cluster, the real cause was that I had manually changed the clustering service from Disabled to Automatic while trying to find out why it wasn't working.

Switching the service back to Disabled, then attempting to add the node to the cluster fixed the issue.

Now both the DAG and the Failover Cluster are reporting as Healthy.
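
For anyone following along, the fix can be sketched roughly in PowerShell as below. This is only a sketch under assumptions: it assumes "the service" is the Windows Cluster service (service name ClusSvc), which the post does not state explicitly, and it reuses the DAG02 and SERVER1 names from the question. The first command is run on the rebuilt node in an elevated session, and the last one needs the Exchange Management Shell.

```powershell
# Assumption: the service in question is the Cluster service (ClusSvc) on SERVER1.
# Return its startup type to Disabled, the state it was in before the manual change.
Set-Service -Name ClusSvc -StartupType Disabled

# Load the failover clustering cmdlets and add the rebuilt node to the DAG's cluster.
Import-Module FailoverClusters
Add-ClusterNode -Cluster DAG02 -Name SERVER1

# Confirm that both nodes are now listed by the cluster.
Get-ClusterNode -Cluster DAG02

# From the Exchange Management Shell, re-run the replication health checks.
Test-ReplicationHealth -Identity SERVER1
```

If Get-ClusterNode still lists only SERVER2, the node has not joined and the replication health test is likely to keep failing.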