Hi Nobody, Rod, Russ, and Michael Hysen,
Thanks very much for your additional comments and feedback.
I understand the points being raised, and you are correct that I did not address the separation of administration
scenario that you describe. In my experience (and in the experience of many others whom I've asked about this), the separation of administration between the cluster and Exchange is very rare.
It sounds like your experience differs from mine. For customers I've worked with that run Exchange clusters, the clustered mailbox servers and the cluster itself are managed by the messaging team. Often there are cluster experts on staff both inside and
outside of the messaging team (for example, on a Windows management team, or some other management/ops team), but these admins typically don't perform tasks such as moving clustered mailbox servers, or taking databases offline and doing maintenance. Of course,
there will always be exceptions, and I appreciate your pointing them out.
From a management perspective, I would strongly argue though that an administrator who is responsible solely
for the cluster and not for the clustered application should not be performing management tasks related to the clustered application. In other words, someone whose job it is to manage only the cluster should not be moving clustered mailbox servers. We did
do a tremendous amount of work to improve the Exchange cluster experience, which was driven in large part by customer feedback. But, having functionality that allows a pure cluster admin without any Exchange permissions to fully manage a clustered mailbox
server in Exchange 2007 or Exchange 2007 SP1 was not a goal of ours.
If any readers believe this separation of cluster versus clustered application administration is an important
scenario, I would like to hear more details about it, and I invite you to contact me offline to discuss this further. I'm particularly interested in where each line of separation is drawn, and why a non-Exchange administrator would be allowed to perform tasks
that directly affect Exchange (such as causing a brief interruption in service by moving a clustered mailbox server between nodes). Obviously it is too late to change anything in SP1, but certainly your feedback is something to think about for future releases
of Exchange Server.
Nobody, you wrote that "I really liked being able to do everything from CluAdmin. Without "Bad things Happening"...".
Let me reiterate that using CluAdmin to manage a CMS does not make bad things happen. The "bad things" I wrote about only apply to CCR environments; they do not apply to SCC. The "bad things" are not corruption or damage, and
nothing bad happens as a result of using the cluster tools.
The "bad things" I wrote about are bad things that have already happened. They are
not the result of using CluAdmin.
The reason we recommend using our tasks instead of the cluster tools is that they have built-in logic
that does some extra checks before the handoff is done. If the tasks detect that the passive node is not in a good state and, as a result, databases won't be mountable, they will block the move.
The Exchange Management tools provide you with an additional level of protection. The cluster tools do not.
It was never our intent to put this or any other intelligence into the cluster tools; rather it was just the opposite. Our intention was to get folks out of non-Exchange tools and into Exchange tools. And when you do use the Exchange
tools, be aware that they are using the same Cluster API and making the same API calls that the cluster tools use. In other words, our tool is an Exchange management layer on top of the cluster tools/API.
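As a rough illustration of that layering, here is a minimal Python sketch. It is not the actual Exchange implementation: `cluster_move_group`, `PassiveCopyState`, and `exchange_move_cms` are hypothetical names, and the real Cluster API is not modeled. It only shows the shape of the idea, which is an extra check followed by the same cluster-level move.

```python
from dataclasses import dataclass


class MoveBlocked(Exception):
    """Raised when the pre-move health check fails."""


@dataclass
class PassiveCopyState:
    # Hypothetical summary of the passive node; field names are illustrative.
    healthy: bool
    mountable: bool


def cluster_move_group(group, target_node, log):
    # Stand-in for the cluster-level move that both tool sets end up issuing.
    log.append(("move", group, target_node))
    return target_node


def exchange_move_cms(group, target_node, passive, log):
    # Extra Exchange-level check performed before the handoff.
    if not (passive.healthy and passive.mountable):
        raise MoveBlocked(f"passive copy on {target_node} is not ready")
    # Same cluster-level call the cluster tools would make directly.
    return cluster_move_group(group, target_node, log)
```

In this sketch, calling `cluster_move_group` directly always performs the move, while `exchange_move_cms` refuses when the passive side is not ready. That is the only difference between the two paths.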
Rod, you wrote that "Thankfully SQL and other Microsoft produts (sic) don't have an issue with CluAdmin/Cluster.exe.
Nor do they have any other way to manage them." I disagree with that, and so do the SQL Server folks I asked about this.
SQL Server and many other cluster-aware applications include application-specific tools for management purposes.
For example, you don't create a SQL Server database using the cluster tools; you create one with the SQL tools.
You don't configure SQL Server log shipping with the cluster tools; you use the SQL tools. Can you move a clustered SQL Server from one node to another using the cluster tools?
Yes! Can you do that with an Exchange 2007 clustered mailbox server?
Yes! Will it cause any harm to Exchange if you do?
No! Could it result in downtime because of external factors?
Yes! And this is true of nearly all clustered applications.
When it comes to management tasks related to the cluster, such as removing a node from a SQL Server cluster, SQL
Server documentation provides instruction on using the SQL tools to do this (http://msdn2.microsoft.com/en-us/library/ms191545.aspx).
They don't provide instructions to do this using cluster tools. For some tasks, such as changing the IP address of a clustered SQL Server, they provide instructions using the cluster tools (http://msdn2.microsoft.com/en-us/library/ms190460.aspx),
and not the SQL tools.
The cluster doesn't generally know anything about the health and state of an application running in the cluster.
For example, take SQL Server mirroring in a cluster. Mirroring and clustering work independently of each other. Mirroring knows nothing about clusters, and clusters know nothing about mirroring, just like clusters don't know anything about continuous replication.
Russ, you wrote that you "can't delegate the ability to move the cluster to an operations team without giving
them other permissions that are not required for their job." I agree that any operations team that is responsible for managing a cluster that contains Exchange should have the proper permissions.
I think we just disagree on what those permissions should be. I understand that you and others think the administrator should have only permissions to the cluster, and not have Exchange permissions.
I disagree, but if you feel strongly about this, please do contact me offline so that I can better understand these scenarios and take this feedback back to the team for consideration in future releases.
Rod and Russ, no doubt with your clustering experience it is natural to view an Exchange cluster from a cluster
perspective. What we tried to do with Exchange 2007 is abstract away the cluster as much as possible and make the experience of managing a clustered mailbox server not more like managing other clustered applications, but rather more
like managing a standalone Mailbox server. We don't think that Exchange administrators should have to be cluster experts in order to deploy and run Exchange in a cluster. In the case of CCR, I don't mind saying that I think we did a fantastic job of minimizing
the need for cluster knowledge, particularly in the area of hardware and storage configuration. Certainly some cluster experience will be helpful, as they still need to build a cluster before they can install Exchange, but we help them out there, too, by giving
them complete step-by-step instructions on how to do this (in RTM using GUI interfaces, and in the forthcoming SP1 content, using both GUI and command-line interfaces).
As we move beyond Windows Server 2003 and into Windows Server 2008, as we're doing with SP1, it is even more
important to abstract away the cluster because failover clusters have changed significantly in Windows Server 2008. In fact, there is what we call a "clean break" in the Cluster API in Windows 2008.
This means, among other things, that you can't call Cluster APIs between client levels. For administration purposes:
• Windows Vista/Windows 2008 can manage only Windows 2008 (and later) failover clusters
• Windows XP/Windows 2003 can manage only Windows 2003 and earlier failover clusters
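The two rules above can be expressed as a small lookup. This is only a sketch in Python: the "legacy"/"modern" labels and the `can_manage` helper are hypothetical, and the tables cover only the operating systems named above, not the "earlier" or "later" versions.

```python
# Hypothetical grouping by Cluster API generation; only the OS names
# mentioned in the text are listed.
CLIENT_GENERATION = {
    "Windows XP": "legacy",
    "Windows 2003": "legacy",
    "Windows Vista": "modern",
    "Windows 2008": "modern",
}
CLUSTER_GENERATION = {
    "Windows 2003": "legacy",
    "Windows 2008": "modern",
}


def can_manage(client_os, cluster_os):
    # A management client can administer a cluster only when both sit on
    # the same side of the API break. Unknown names raise KeyError.
    return CLIENT_GENERATION[client_os] == CLUSTER_GENERATION[cluster_os]
```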
In addition, given the substantial changes to failover clustering in Windows 2008, namely in the core areas
of security, storage, and networking, it makes even more sense to abstract away the cluster for Exchange administrators, particularly those who are clustering Exchange 2007 today. There's a lot of new and great stuff in Windows 2008 failover clusters, but
there's also a lot that has changed with respect to the management interfaces, as well. For the Exchange administrator, without the benefit of Exchange tools (including Exchange Setup) that lessen the need to be a cluster expert, there would be a new,
steep learning curve. They would know one way to move a resource group using Cluster Administrator, and they would have to learn the new way to move the group using the Failover Cluster Management tool in Windows 2008. As a result of the way in
which we provide management tasks now, the Exchange administrator does not need to become a cluster expert in order to run Exchange 2007 SP1 on a Windows 2008 failover cluster. Instead, the Move-ClusteredMailboxServer task and the Manage Clustered Mailbox
Server Wizard GUI will look, act, and operate the same no matter what operating system you're running. As a former, long-time messaging administrator in small, medium and very large organizations, I think this approach provides a lot of benefits for administrators.
Michael, you wrote "the fact that we now have to go to two places to perform a task that should be integrated
simply makes my suggestion to use CCR look stupid to the cluster team." You don't need to go to two places; you can manage CCR using the Exchange Management tools.
You also wrote that you get comments from others like "so Microsoft implements an application that does not support the clustering standard they have implemented?"
This comment makes no sense, as we appropriately leverage the Cluster API when we interface with and manipulate resources in a cluster.
In fact, that is one of the core points I've been trying to make. The Exchange tools fully leverage the Cluster API.
We are not doing anything inside of a cluster that is in any way different from what the cluster tools do.
Rather, the Exchange tools do extra tasks outside the cluster, such as checking on the state of replication in a CCR environment before calling the Cluster APIs to do the move.
CCR is new and different; it is not like a traditional Exchange cluster, and it is not like SCC. Thus, it
requires new approaches to management. There is a bit more going on than in your traditional Exchange cluster; namely, CCR uses log shipping and it does not use shared storage. These are very important differences. One of the reasons behind the naming of SCC
is to emphasize there is a single copy of the data in the cluster, and to differentiate it from CCR, where you have two copies of the data. In a CCR environment, you have an active node that contains the production copy of the database – the one that users/clients
are accessing. You also have a passive node, which contains a copy of the production database that is maintained and kept current through the use of log shipping and log replay. This second copy, the passive copy, is only useful if it is kept up-to-date.
In an SCC, you only have one copy of each database, and that copy moves between nodes when you move the clustered
mailbox server. In a CCR environment, each node has its own copy of each database. When the clustered mailbox server is moved from the active node to the passive node, it stops using one copy of the database and starts using the other copy of the database.
If the other copy is no good (perhaps because replication is not up-to-date), then the clustered mailbox server will not be able to mount the database, and this will result in downtime.
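The difference can be modeled in a few lines of Python. This is a deliberately simplified sketch, not how Exchange decides whether a database can mount: `DatabaseCopy`, `move_scc`, and `move_ccr` are hypothetical names, and "mountable" is reduced to a single `up_to_date` flag as an assumption.

```python
from dataclasses import dataclass


@dataclass
class DatabaseCopy:
    name: str
    up_to_date: bool = True  # simplifying assumption: stale means not mountable


def move_scc(shared_copy, target_node):
    # SCC: there is one copy, and it follows the clustered mailbox server.
    return target_node, shared_copy, True


def move_ccr(copies_by_node, target_node):
    # CCR: each node has its own copy; the move switches to the target's copy.
    copy = copies_by_node[target_node]
    mounted = copy.up_to_date  # a stale copy fails to mount, causing downtime
    return target_node, copy, mounted
```

For example, with `copies = {"NODE1": DatabaseCopy("DB1"), "NODE2": DatabaseCopy("DB1", up_to_date=False)}`, calling `move_ccr(copies, "NODE2")` returns the NODE2 copy with `mounted` set to `False`, which is the downtime case described above.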
In a CCR environment, you should never move a clustered mailbox server between nodes without knowing the state
and health of continuous replication. You need to know things are OK on the node that is about to take ownership because if they are not, then you don't want to perform the move because it would result in downtime. To obviate the need for an administrator
to have to manually check the state of replication before performing a move, we included checks in our Move-ClusteredMailboxServer task. Of course, an administrator could certainly run Get-StorageGroupCopyStatus, Get-ClusteredMailboxServerStatus, and (new
in SP1) Test-ReplicationHealth, to determine the health and status of continuous replication.
And if everything checks out as healthy and up-to-date, they can use the cluster tools to move the clustered mailbox server to another node. But as you implied, Michael, why use two tasks to do this when you can use one? This is why we have Move-ClusteredMailboxServer.
It combines the health check with the same move that the cluster tools do.
Hopefully this further clarifies the statements I made in my blog, and what we say in our documentation.
Please do feel free to contact me offline if you would like to discuss this further.
And, thank you all very much for your feedback. It is much appreciated!