For those who have been around databases for any length of time, the idea of putting a database you care about, from either a reliability or performance perspective, on an SMB (Server Message Block) file share seems crazy. But recent developments have made SMB-based file shares a very viable platform for production SQL Server databases, with some very interesting advantages.
Historically, the perspective has been:
File shares are slow.
The connections to the share may be unreliable.
The storage behind the file share may be flaky.
SMB consumes large amounts of CPU if you can get it running fast enough.
Over the past few years, all of these conditions have changed, and in particular the work which has been done on the 2.2 revision to the SMB protocol has produced some stunning results.
So let’s look at these one by one:
File Shares are slow
There are two components to this one:
The raw speed of Ethernet vs. Fibre Channel, and the speed/efficiency of the SMB protocol itself.
The transport layer has seen very significant improvement in recent years. Where Ethernet was once orders of magnitude slower than Fibre Channel, this is no longer the case. Current commodity Ethernet runs at up to 10 gigabit, with 40 gigabit being tested and on the near horizon. This puts Ethernet on par with Fibre Channel from a bit-rate perspective, and the projection is that the two technologies will leapfrog each other from here on, with neither being a clear leader.
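As a rough sanity check on the bit-rate comparison, here is a back-of-the-envelope conversion. This is a sketch: the numbers are nominal link rates and ignore encoding and protocol overhead, so real-world throughput is lower.

```python
# Convert nominal link speeds from gigabits/s to gigabytes/s.
# Nominal rates only; encoding and protocol overhead are ignored.
def gbits_to_gbytes(gbits: float) -> float:
    """Nominal gigabits per second -> gigabytes per second."""
    return gbits / 8

ethernet_10g = gbits_to_gbytes(10)  # 1.25 GB/s
ethernet_40g = gbits_to_gbytes(40)  # 5.0 GB/s
fibre_8g = gbits_to_gbytes(8)       # 1.0 GB/s for 8Gb Fibre Channel

print(ethernet_10g, ethernet_40g, fibre_8g)
```

Even at 10 gigabit, Ethernet's nominal byte rate already exceeds that of an 8Gb Fibre Channel link.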
On the protocol front, the original SMB 1.x protocol was chatty, inefficient, and slow. Over the last couple of years, the Windows file server team, while developing the 2.2 version of the protocol, has been using SQL Server, with a TPCC workload, as one of the performance benchmarks.
The benchmark configuration was to take a fibre-attached array, connect it to a server and run TPCC.
Then add an identical server connected to the first with a single 1Gb link, and run the TPCC database on the new server with the original server functioning as a file server.
When they started, TPCC over a file server ran at roughly 25% of the speed of direct-attached storage. The team discovered several performance problems in the stack, but one particular bug on the client side made a stunning difference. The current result is that TPCC running over an SMB link as described above performs at nearly the speed of direct-attached storage, and since the fix is on the client side, the result is not limited to Windows file servers.
So now, we have an SMB implementation running at speeds comparable to a fibre-attached array.
Connections to the share may be unreliable
Again, there are multiple parts to this one. One aspect is that the underlying networking hardware has gotten much more reliable in recent years. Consumers and enterprises alike just wouldn't put up with flaky network connections these days. The popularity of FCoE (Fibre Channel over Ethernet) is an indicator of how much confidence in Ethernet as a storage transport has grown.
The second aspect to this one again comes back to the work done in the 2.2 version of the SMB protocol.
With this version, SMB has a number of resiliency features built in. In the past, if a link momentarily dropped, the connection would be lost and the file handle broken. With the 2.2 version of the protocol, the link is automatically re-established, and the application never sees the event other than a momentary stall in outstanding IO.
If we take the configuration a step further, the file server itself can be clustered, and now has the capability to failover a share from one file server to the other without losing handle state. To clarify, SQL Server running an active workload can have the file server hosting the database files fail over, planned or unplanned, and SQL sees only a momentary drop in IO rates.
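The reconnect behavior described above happens transparently inside the SMB client, but the underlying pattern is easy to sketch. The following Python is purely illustrative, not a real SMB API; `operation` and `reconnect` are hypothetical stand-ins for an outstanding IO and for re-establishing the handle:

```python
import time

# Illustrative sketch of transparent reconnection: on a transient link
# drop, re-establish the connection and retry the IO, so the caller sees
# only a brief stall instead of a broken handle. Not a real SMB client.
def resilient_io(operation, reconnect, retries=3, backoff=0.01):
    for attempt in range(retries + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == retries:
                raise                            # give up after final retry
            time.sleep(backoff * (attempt + 1))  # momentary stall
            reconnect()                          # re-establish the handle
```

The application-visible behavior is exactly what the protocol delivers: a successful result after a short pause, rather than an error.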
The storage behind the file server may be flaky or unreliable.
While it is always possible to put together an unreliable server, the tools now exist to incorporate very sophisticated reliability features right in the box. Particularly with the advent of Windows 8 features recently announced, we have a pretty good toolset native in the OS. We can create pools of storage which can be dynamically expanded. Pools can be assigned a variety of RAID levels. Many of the features which were previously only available in Fibre-attached arrays are now available with direct-attached storage on a Windows File Server. When you add in the capability for failover and scale-out clustering, the reliability becomes very impressive.
SMB consumes large amounts of CPU if you can get it going fast enough.
This is actually a painful aspect of Ethernet which has hurt iSCSI as well as SMB and other protocols.
A recent development is RDMA (Remote Direct Memory Access), a transport which enables data to flow directly from the network wire into user space, without being copied through kernel memory buffers.
This produces a huge reduction in CPU utilization at high data rates. How much?
I’ve seen an Infiniband-based SMB connection sustaining > 3 gigaBYTES per second, while consuming around 7% CPU, using SQLIO 512K writes as a workload. We’ve seen prototype units performing at twice that rate in the lab.
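The zero-copy idea behind RDMA can be illustrated in miniature with Python's memoryview, which exposes a buffer without copying it. This is an analogy only; real RDMA happens in the NIC and driver, not in application code:

```python
# Analogy for zero-copy transfer (not RDMA itself): a memoryview slice
# references the original buffer, while a bytes() slice copies the data,
# much as the traditional network stack copies payloads between buffers.
buf = bytearray(1_000_000)

# Copying path: the data is duplicated.
copied = bytes(buf[:4096])

# Zero-copy path: the view shares memory with buf.
view = memoryview(buf)[:4096]
view[0] = 0xFF            # writes through to the underlying buffer
assert buf[0] == 0xFF     # no copy was made
assert copied[0] == 0x00  # the earlier copy is unaffected
```

Eliminating the copy is what lets throughput scale without CPU utilization scaling with it.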
Now that we’ve discussed why the factors which previously were blockers no longer are, let’s discuss some of the additional benefits:
Consider the steps required for a DBA to move a database from one server to another.

With SAN storage:
Take database offline/detach
File a request to remap LUNs
Meet with storage admin
LUNs are unmapped from original server
LUNs are mapped to new server
LUNs get discovered and mounted on new server
Database is attached to new server
Database is brought online

With an SMB file share:
Take database offline/detach
Database is attached to new server using UNC path
Database is brought online
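As a concrete sketch of the file-share flow, the helper below builds the T-SQL a DBA might run. The statement shapes (`sp_detach_db` and `CREATE DATABASE ... FOR ATTACH`) are standard SQL Server; the server name, paths, and database name are invented for illustration:

```python
# Sketch: build the detach/attach T-SQL for moving a database via a UNC
# path. Statement shapes are standard SQL Server; all names and paths
# below are hypothetical examples.
def detach_sql(db: str) -> str:
    return f"EXEC sp_detach_db @dbname = N'{db}';"

def attach_sql(db: str, data_unc: str, log_unc: str) -> str:
    return (
        f"CREATE DATABASE [{db}]\n"
        f"    ON (FILENAME = N'{data_unc}'),\n"
        f"       (FILENAME = N'{log_unc}')\n"
        f"    FOR ATTACH;"
    )

# Run detach_sql("Sales") on the old server, then on the new server:
print(attach_sql("Sales",
                 r"\\fileserver\sqldata\Sales.mdf",
                 r"\\fileserver\sqldata\Sales_log.ldf"))
```

No LUN remapping and no storage-admin involvement: the new server simply references the same UNC path.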
Additionally, you have one set of tools for configuring storage, as opposed to separate tooling for each SAN vendor you use.
Cost is always a concern, and with the capabilities this platform brings to bear, we can accomplish for a fraction of the cost what previously required a much more expensive solution, without sacrificing performance, reliability, or manageability.
As one example of the whole package, a reference configuration for a SQL deployment had originally been specified with Infiniband for communications and several small SAN arrays, one per server in a rack. Converting that configuration to a single clustered file server dropped the total cost of the solution dramatically: ~$50,000 in Fibre Channel hardware was eliminated, and the savings from replacing the multiple FC arrays with the clustered file server were also very substantial.
The kicker, though, was that the performance of the solution was better than the original configuration, as previously it had bottlenecked on the storage processors in the arrays.
So, the overall cost is substantially lower, the required features are delivered, and the performance is improved.