The other day I was chatting with one of our Supportability Program Managers, Nino Bilic, and he mentioned something that was rather alarming - the number one reason our Premier customers open Exchange 2010 critical situations is that Mailbox databases dismount after running out of disk space on the transaction log LUN.
I’ll let that sink in for a moment. Naturally I’m shocked…to be completely honest, I thought with the Mailbox Requirements Calculator and our guidance on TechNet, we’d have wiped out this issue by now. After sharing this information with me, Nino decided that I, not he, should write a blog article on the topic of transaction log capacity planning (gee, thanks Nino!).
In order to properly size a transaction log LUN, we need to understand a few things about the environment: the message profile of the mailboxes in each database, the rate at which mailboxes are moved, and the backup architecture that governs log truncation.
For the purposes of this discussion, let’s assume that each database will house 250 mailboxes. Each mailbox sends/receives 150 messages per day, with an average message size of 100KB. Based on the table in Understanding Mailbox Database and Log Capacity Factors, we know that a 150 message profile with a 75KB average message size generates 30 transaction logs per mailbox per day (24 hour period). Since our message size is greater than 75KB, we need to account for that in our per-mailbox transaction log generation. The guidance stipulates:
If the average message size doubles to 150 KB, the logs generated per mailbox increases by a factor of 1.9. This number represents the percentage of the database that contains the attachments and message tables (message bodies and attachments).
Therefore, we can determine the impact our 100KB average message size has with this formula:
150 / 1.9 = [average message size of profile] / x
x = (100 * 1.9) / 150
x = 1.266666666666667 ~ 1.27
So by having a message size that is 25KB larger than the baseline, the number of transaction logs generated per day per mailbox increases by a factor of 1.27. Therefore, 30 transaction logs * 1.27 = 38.1, which we round up to 39 transaction logs / day / mailbox. This means that for a database of 250 mailboxes, each database will generate 39 * 250 = 9,750 mailbox generated transaction logs / day / database.
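As a rough sketch, the same arithmetic can be written in a few lines of Python. The function and constant names are illustrative and do not come from the guidance. The sketch assumes the factor is rounded to two decimals and the per-mailbox count is rounded up, which is how the figures above come out to 1.27 and 39.

```python
import math

BASELINE_LOGS_PER_MAILBOX = 30  # 150-message profile at a 75KB average message size
DOUBLED_SIZE_KB = 150           # message size at which the guidance quotes the factor
DOUBLED_SIZE_FACTOR = 1.9       # log growth factor quoted for a 150KB average


def log_growth_factor(avg_message_kb):
    """Solve 150 / 1.9 = avg_message_kb / x for x, rounded to two decimals."""
    return round(avg_message_kb * DOUBLED_SIZE_FACTOR / DOUBLED_SIZE_KB, 2)


def logs_per_database(avg_message_kb, mailboxes):
    """Daily mailbox-generated logs for one database, rounding up per mailbox."""
    per_mailbox = math.ceil(BASELINE_LOGS_PER_MAILBOX * log_growth_factor(avg_message_kb))
    return per_mailbox * mailboxes


factor = log_growth_factor(100)
daily_logs = logs_per_database(100, 250)
print(factor, daily_logs)
```

Rounding up per mailbox errs on the side of provisioning slightly more space, which is the safer direction for a capacity estimate.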
Mailbox moves also generate transaction logs. Each mailbox moved to the destination database generates (at the destination, not the source) roughly enough logs to equal the size of the mailbox, including the contents of the Recoverable Items folders. For example, moving 1% of the mailboxes per day means that 2.5 mailboxes are moved into a database each day. If each mailbox is 5.4GB in size on average (including 14 day deleted item retention with Single Item Recovery enabled), then, with each transaction log being 1MB in size, 2.5 * 5.4GB * 1024 ≈ 13,888 mailbox move transaction logs / day / database.
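The mailbox move estimate can be sketched the same way. This is a minimal illustration with made-up names, and it assumes 1MB transaction log files (the Exchange 2010 log file size). With the rounded 5.4GB average it produces 13,824 logs, a little below the 13,888 quoted above; the difference is probably down to rounding of the mailbox size, though that is only an inference.

```python
LOG_FILE_MB = 1    # Exchange 2010 transaction log files are 1MB each
MB_PER_GB = 1024


def move_logs_per_day(mailboxes, move_fraction, avg_mailbox_gb):
    """Logs generated at the destination database by one day of mailbox moves."""
    moved_mailboxes = mailboxes * move_fraction
    moved_mb = moved_mailboxes * avg_mailbox_gb * MB_PER_GB
    return round(moved_mb / LOG_FILE_MB)


# 250 mailboxes per database, 1% moved per day, 5.4GB average mailbox
move_logs = move_logs_per_day(250, 0.01, 5.4)
print(move_logs)
```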
From a backup/restore perspective, we need to take into account the type of backup architecture we are leveraging. With each backup scenario, there is a recommended number of additional days you should provision from a capacity perspective for your mailbox generated transaction logs. By provisioning extra space, you can survive multiple failures without suffering an outage event. For more information on transaction log truncation, see Understanding Backup, Restore and Disaster Recovery.
Backup Architecture | Transaction Log Truncation | Recommended Backup Failure Protection |
Daily Full Backup | Daily | 3 days |
Weekly Full Backup / Daily Incremental | Daily | 3 days |
Weekly Full Backup / Daily Differential | Weekly | 7 days |
Bi-Monthly Full Backup / Daily Incremental | Daily | 3 days |
Exchange Native Data Protection | As logs are no longer required | 3 days |
Of course, there are other scenarios that you may need to consider. For example, if you are deploying a stretched Database Availability Group (DAG) across two datacenters, log truncation will only occur if the network link between the two datacenters is operational and the database copies are healthy. If you know that an outage of the WAN link could take 5 days to repair, you should adjust your backup failure protection to take that into account.
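One simple way to fold such a scenario into the plan is sketched below. Taking the larger of the table value and the longest expected outage is my assumption about how to "adjust", not a rule from the guidance, and the function name is illustrative.

```python
# Recommended backup failure protection, in days, copied from the table above.
PROTECTION_DAYS = {
    "Daily Full Backup": 3,
    "Weekly Full Backup / Daily Incremental": 3,
    "Weekly Full Backup / Daily Differential": 7,
    "Bi-Monthly Full Backup / Daily Incremental": 3,
    "Exchange Native Data Protection": 3,
}


def protection_days(architecture, longest_outage_days=0):
    """Days of logs to provision for.

    Assumption: cover whichever is longer, the table recommendation or the
    longest outage that would prevent log truncation (for example a WAN repair).
    """
    return max(PROTECTION_DAYS[architecture], longest_outage_days)


days = protection_days("Exchange Native Data Protection", longest_outage_days=5)
print(days)
```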
For our scenario, let’s assume we only need to ensure we can survive 3 days of truncation failure events. This means that we need 9,750 / 1024 * 3 = 28.5GB of disk space for our mailbox generated transaction logs.
In addition, we need to account for the amount of disk space required for our mailbox move events for the entire week: 13,888 / 1024 * 7 days = 94.9GB of disk space for our mailbox move operations.
All told, this means that each database needs 123GB of disk space for transaction logs. We should also include a data overhead factor to account for any unexplained phenomena that may occur: 123GB * 1.2 = 148GB of disk space for transaction logs.
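Putting the pieces together, the per-database requirement can be sketched as follows. The names are illustrative, and the sketch assumes 1MB log files. Because it does not round the intermediate values, the subtotal comes out at about 123.5GB rather than 123GB, but the final figure still rounds to 148GB.

```python
LOG_FILE_MB = 1    # Exchange 2010 transaction log files are 1MB each
MB_PER_GB = 1024


def log_space_gb(logs_per_day, days):
    """Disk space, in GB, consumed by the given number of days of logs."""
    return logs_per_day * LOG_FILE_MB / MB_PER_GB * days


mailbox_log_gb = log_space_gb(9750, 3)    # mailbox-generated logs, 3 days of protection
move_log_gb = log_space_gb(13888, 7)      # mailbox move logs, a full week
subtotal_gb = mailbox_log_gb + move_log_gb
required_gb = subtotal_gb * 1.2           # 20% data overhead factor

print(f"{subtotal_gb:.1f}GB before overhead, {required_gb:.0f}GB with overhead")
```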
If we are deploying a dedicated LUN for the transaction logs, we would not provision a LUN of 150GB as that would mean that we could consume all of the disk space if we were having backup failures and excessive mailbox moves. Typically you want to ensure that each LUN is provisioned such that only 80% of the disk capacity is utilized. The formula is:
LUN Space = [projected disk space utilization] / (1 – [desired free space percentage])
LUN Space = 148GB / (1 – .2) = 148GB / .8 = 185GB LUN Space for Dedicated Transaction Log Volume
If you are deploying the transaction logs on the same LUN as the database, you would simply combine the transaction log disk space requirements with the database disk space requirements for the [projected disk space utilization] value.
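The LUN formula is easy to wrap in a small helper. This is a sketch with illustrative names; the 1000GB database size in the shared-LUN example is a hypothetical value, not a figure from this scenario.

```python
def lun_size_gb(projected_gb, free_fraction=0.2):
    """LUN Space = projected utilization / (1 - desired free space percentage)."""
    if not 0 <= free_fraction < 1:
        raise ValueError("free_fraction must be at least 0 and less than 1")
    return projected_gb / (1 - free_fraction)


# Dedicated transaction log LUN
dedicated_gb = lun_size_gb(148)
print(f"dedicated log LUN: {dedicated_gb:.0f}GB")

# Logs on the same LUN as the database: combine both requirements first.
hypothetical_database_gb = 1000
shared_gb = lun_size_gb(148 + hypothetical_database_gb)
print(f"shared LUN: {shared_gb:.0f}GB")
```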
First and foremost you need to obtain a baseline of your environment to determine your typical log generation rate per day. In addition, you must set up monitoring and take action on any alerts that are generated. Monitoring should cover the following scenarios:
My friend, Mike Lagase, wrote a great article on how to troubleshoot this scenario - http://blogs.technet.com/b/mikelag/archive/2009/07/12/troubleshooting-store-log-database-growth-issu... (please note that the article was written with Exchange 2007 in mind, so several of the tools and/or recommendations may no longer apply with Exchange 2010). In addition to the steps Mike mentions, you can utilize the following in Exchange 2010 to help determine the cause of unexplained transaction log growth (thanks to Todd Luttinen for putting this list together):
[PS] C:\>$stats = Get-StoreUsageStatistics -Database <Database Name>
[PS] C:\>$stats | ? {$_.DigestCategory -eq 'LogBytes'} | group MailboxGuid |sort count -Descending | Select -first 1 -ExpandProperty Group | sort SampleTime | ft -a MailboxGuid,Sample*,Log*
MailboxGuid | SampleID | SampleTime | LogRecordCount | LogRecordBytes |
c007c87a-e030-4414-b741-9cf61e88b9de | 5 | 11/7/2011 4:25:05 PM | 237 | 274163 |
c007c87a-e030-4414-b741-9cf61e88b9de | 4 | 11/7/2011 4:35:05 PM | 451 | 387362 |
c007c87a-e030-4414-b741-9cf61e88b9de | 3 | 11/7/2011 4:45:06 PM | 483 | 144999 |
c007c87a-e030-4414-b741-9cf61e88b9de | 2 | 11/7/2011 4:55:06 PM | 734 | 293433 |
c007c87a-e030-4414-b741-9cf61e88b9de | 1 | 11/7/2011 5:05:06 PM | 933 | 411485 |
c007c87a-e030-4414-b741-9cf61e88b9de | 0 | 11/7/2011 5:15:06 PM | 247 | 209987 |
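To put numbers like these in perspective, it can help to total the log bytes and compare them with the 1MB log file size. The sketch below hard-codes the six samples shown above. Treating LogRecordBytes divided by 1MB as an approximate count of log files is my simplification, and the variable names are illustrative.

```python
LOG_FILE_BYTES = 1024 * 1024  # Exchange 2010 transaction log files are 1MB each

# (SampleID, LogRecordCount, LogRecordBytes) copied from the output above
samples = [
    (5, 237, 274163),
    (4, 451, 387362),
    (3, 483, 144999),
    (2, 734, 293433),
    (1, 933, 411485),
    (0, 247, 209987),
]

total_bytes = sum(log_bytes for _, _, log_bytes in samples)
peak_id, _, peak_bytes = max(samples, key=lambda sample: sample[2])
approx_log_files = total_bytes / LOG_FILE_BYTES

print(f"total log bytes across samples: {total_bytes}")
print(f"busiest sample: {peak_id} ({peak_bytes} bytes)")
print(f"approximate 1MB log files: {approx_log_files:.2f}")
```

A mailbox whose total is far above what your baseline message profile predicts is a good candidate for closer investigation.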
Starting from <date/time> service <name> has performed this activity on the server:
RPC Operations: 24168.
Database Pages Read: 1329 (of which 629 pages preread).
Database Pages Updated: 12418 (of which 11555 pages reupdated).
Database Log Records Generated: 13906.
Database Log Records Bytes Generated: 660331.
Time in Server: 19142 ms.
Time in User Mode: 6100 ms.
Time in Kernel Mode: 63 ms.
I think all of us understand how critical it is to provision enough capacity that database availability is not affected. Hopefully this information helps in planning your transaction log capacity.
Ross Smith IV
Principal Program Manager
Exchange Customer Experience