I have successfully upgraded a large environment to 2010 SP1, then RU1 and finally RU2. The latest change was the update to FPE on the HT & Edge servers. All updates went fine, following all known issues and their corresponding procedures.
The MB role servers do not run FPE, so no changes have been made there since RU2. Recently we saw a crash in the cluster service following a MAPI error on one MB server:
MB3 reports a MAPI error on MB4; then, 24 hours (+/- 10 minutes) later, MB3's cluster service terminates with no good explanation. MB3 and MB4 host a DAG with 12 active and 12 passive DBs each. The result was that all of the active DBs were activated on MB4, and the DAG copies are running fine on MB3. That is somewhat confusing: the cluster service terminated and moved all active DBs to another server, but the DB copy function was retained with no complaints...
Right now, I consider this environment to be in an "unknown state"...
Comments?