CCR, Site Resilience and sample decision making processes...

Published Oct 08 2007 12:44 PM

The ability to continue providing a full service to your user community after the loss of a data center is an increasingly common requirement. Cluster Continuous Replication (CCR) in Exchange 2007 is an obvious choice for providing some of the functionality needed to meet this requirement. One advantage of this approach is that reliance on expensive storage replication solutions might be reduced. In addition, a disaster recovery scenario is managed from within one team rather than several. In most cases the messaging team can manage the restoration of service without intervention from the storage team or from a remote third-party hardware vendor, for example. Using Exchange Server data replication rather than storage replication also gives us more options for using PowerShell scripts to help administrators simplify and control service and data recovery.

An example Exchange 2007 design using CCR, on which the flowcharts are based, is as follows.

Please click on thumbnails below to view the charts in their full size:

With any design, it is important to understand the processes and decision making that might be involved when certain scenarios present themselves. If we are designing for high availability, administrators need to understand what decisions might need to be made and what processes would be required should a particular set of circumstances occur. For example, what should the recovery strategy be in the event of the loss of a single mailbox database? Should the Exchange cluster group be moved to the passive node at this stage? If so, this would mean a temporary loss of service for all users on this server for the sake of those on one mailbox store.
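
To make the single-database trade-off concrete, here is a minimal sketch in Python that compares how many users are already without service against how many more would be briefly interrupted by moving the whole cluster group. The database names, user counts, and the `failover_tradeoff` helper are hypothetical and only illustrate the question; they are not taken from the flowcharts.

```python
def failover_tradeoff(users_per_database, failed_database):
    """Summarize who is affected if the cluster group is moved for one failed database.

    users_per_database: dict mapping a database name to its user count (hypothetical data).
    failed_database: name of the database that has failed.
    """
    total_users = sum(users_per_database.values())
    already_down = users_per_database[failed_database]
    return {
        # Users who have already lost service because their database failed.
        "already_down": already_down,
        # Users on healthy databases who would be interrupted by the move.
        "interrupted_by_failover": total_users - already_down,
        # Fraction of the server's users affected by the original failure.
        "share_already_down": already_down / total_users,
    }


# Hypothetical server with three mailbox databases.
example = failover_tradeoff({"MDB1": 500, "MDB2": 1000, "MDB3": 500}, "MDB1")
print(example)
```

In this made-up example, a quarter of the users are already down, and moving the cluster group would briefly interrupt the remaining three quarters. That is the kind of figure an administrator might weigh before choosing between a failover and a database-level recovery.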

The following flowcharts show the likely processes and decision making flow that might be involved in certain disaster recovery situations based on the above Exchange 2007 design.

Total Site Failure - Likely steps & decision making process in recovering from total physical site failure

Single Server Failure - Likely steps & decision making process in recovering from single server failure

Single Database Failure - Likely steps & decision making process in recovering from single active database failure

Of course, with the introduction of Standby Continuous Replication (SCR) in Service Pack 1, CCR or LCR within the same physical site combined with SCR to the remote location might be a preferred solution. Even so, it will still be important for administrators to understand what their recovery paths might be and what decisions will need to be made to properly control and expedite service and data recovery.

I wanted to thank Matt Richoux and Scott Schnoll for their reviews of this!

UPDATE: Per your request, we have made those available as a download for printing here.

- Doug Gowans

Not applicable
Looks awesome!

Would it be possible to add a hi-res printready XPS/PDF?
Not applicable
Lukas Beeler,

Done! I edited the post and pointed to the location in our Files section.
Not applicable
Wow, awesome! Thanks for posting this.
Not applicable
Speaking of CCR and SCR, I'm experiencing issues with enabling SCR in a clustered environment (CCR and SCC).
I can only seem to make it work in a standalone-to-standalone mailbox server scenario.

Is this an issue with the current beta release of SP1?

Has anyone else gotten SCR to work in a clustered environment?
Thanks a lot!
Not applicable
Thanks for a good post
Not applicable
Hi TomKern,

It does work.  What happens when you try?  Are the operating systems for the SCR source and the SCR target the same?  Are you getting an error message?  If so, what is it?
Not applicable

I'm having a similar problem...

I have two SCCs running W2K3 SP2, and I have the following "errors."

On the source, the storage groups show "NotConfigured" when I run Get-StorageGroupCopyStatus.  This happens even though I have enabled the SG and verified that the data is in AD by checking the value of msExchESEParamSystemPath in ADSIEdit.

On the target node nothing is ever created, and when I try to run Update-StorageGroupCopy, as the documentation states, I get the following error:

failed to find the source node <SOURCE> through RPC.

Not applicable
Scott, the config I have is an SCC source and a single mailbox server as the target,
all on Win2k3 SP2.

The errors I get: when I mount the DB at the target, I get "log*.tmp not found".
The DB is always in a dirty shutdown state, even after running eseutil /r.
The Exchange replication service stops running.

Stuff like that.
Not applicable
Note to the casual reader...

CCR cannot span sites with a Windows 2003 OS _UNLESS_ they are in the same AD site (i.e., a bridged network, not a routed one).  If you read the first Exchange 2007 design (and only that one), you will think it is possible.

Windows 2008 server will allow for a single AD site to span multiple subnets which will finally allow you to have a single CCR setup in two routed networks.

So, until then, if you need data center recoverability, make sure you read this very carefully and test your solution well before putting it into production.
Not applicable
Brian (Kronberg),

>>Windows 2008 server will allow for a single AD site to span multiple
>>subnets which will finally allow you to have a single CCR setup in two
>>routed networks.

What you're referring to is the limitation of Windows Server 2003 clusters to have all nodes on the same IP subnet, which requires stretching subnets to a remote location.

Windows Server 2008 allows cluster nodes to be in different IP subnets, making stretching the subnet unnecessary. However, Exchange Server 2007 (RTM and SP1) requires that all nodes in a cluster be in the same AD Site. 2 cluster nodes can't be in different Sites - both subnets need to be associated with the same AD Site.
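
The two rules above (same IP subnet on Windows Server 2003, same AD Site in all cases) can be sketched as a quick pre-deployment check. This is only an illustration in Python: the subnet-to-Site mapping, the IP addresses, the Site names, and the `check_ccr_nodes` helper are all hypothetical, and a real check would read the subnet definitions from Active Directory.

```python
import ipaddress

# Hypothetical subnet-to-AD-Site mapping (illustrative values only).
SITE_SUBNETS = {
    "10.1.0.0/16": "DataCenterSite",
    "10.2.0.0/16": "DataCenterSite",
    "10.9.0.0/16": "BranchSite",
}


def check_ccr_nodes(node_ips, site_subnets, windows_2008):
    """Return a list of problems with the proposed cluster node placement."""
    networks = {ipaddress.ip_network(cidr): site for cidr, site in site_subnets.items()}
    problems = []
    subnets_used = set()
    sites_used = set()
    for ip in node_ips:
        address = ipaddress.ip_address(ip)
        matches = [net for net in networks if address in net]
        if not matches:
            problems.append(f"{ip} is not in any known subnet")
            continue
        # Pick the most specific subnet that contains the address.
        subnet = max(matches, key=lambda net: net.prefixlen)
        subnets_used.add(subnet)
        sites_used.add(networks[subnet])
    # Exchange Server 2007 (RTM and SP1): all cluster nodes in one AD Site.
    if len(sites_used) > 1:
        problems.append("nodes are in different AD sites")
    # Windows Server 2003 clusters: all nodes on the same IP subnet.
    if not windows_2008 and len(subnets_used) > 1:
        problems.append("Windows Server 2003 cluster nodes must share one IP subnet")
    return problems


# Two nodes in different subnets that belong to the same AD Site.
print(check_ccr_nodes(["10.1.0.5", "10.2.0.5"], SITE_SUBNETS, windows_2008=True))
print(check_ccr_nodes(["10.1.0.5", "10.2.0.5"], SITE_SUBNETS, windows_2008=False))
```

With this made-up mapping, the first call reports no problems, while the second flags the subnet rule for Windows Server 2003.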

Not applicable
Download link to the files does not work
Not applicable

Do you get an error of some sort? I have just tried and it worked... it takes a little while, but then the XPS files open in the browser window.
Last update: Jul 01 2019 03:31 PM