Exchange 2007 Service Pack 1 introduces several changes in the Extensible Storage Engine (ESE). In my first blog article on the subject, I discussed the removal of page dependencies and the disabling of partial merges. In this blog, I discuss other changes we made to ESE which enhance Exchange.
- Passive Node I/O improvements
- Online Defragmentation
- Checksumming databases
- Page Zeroing
The extra I/O on the passive node was the result of the design of the replay function. In RTM, when log replay starts on the passive node, an instance of ESE is started and used to replay the replicated logs. During this replay activity, pages are read from the database, which in turn populates the ESE database cache. When log replay has finished, the Replication service stops and discards the database instance, thereby discarding the database cache that was built up during the replay process. As a result, we see spikes in read activity. When log replay is not behind, the cache stays small, and thus more read I/Os occur against the passive node's disks. When log replay gets behind, the instance of ESE remains active longer and builds a larger database cache, which decreases read I/Os.
By itself, the additional disk I/O on the passive node is not a problem. But there are two scenarios where this additional I/O can have a significant impact: backups and storage design. Consider the scenario where you are performing VSS backups on the passive node. The additional disk I/O on the passive node can interfere with your ability to take a backup during the core user activity window, thus forcing you to schedule backups at off-hours. This negates the advantage of being able to back up storage groups on the passive node.
The other scenario where the additional I/O matters is when storage is shared. Consider these scenarios:
- You are sharing disk spindles with multiple mailbox servers.
- You are sharing the storage controller with multiple mailbox servers.
- Allows the checkpoint to advance during recovery.
- Keeps the database cache "warm", which improves failover times.
- Allows the database cache to grow, which reduces read I/Os.
- Ensures there is no competing I/O that affects when you perform backups against the passive node.
Online Defragmentation
If you read my previous blog article on the ESE SP1 changes, you know that we reduced database churn significantly by disabling partial merges during online defragmentation (OLD). However, one other challenge remained: determining how often you should run online defragmentation. Because Exchange had no metrics that could determine how often OLD should be run, our guidance has always been to make sure that online defragmentation completes every week or every two weeks.
Because each environment is different, this guidance was not optimal for every organization. We could not say with absolute certainty that completing OLD every week or every two weeks would be acceptable in every environment. Furthermore, there was always the question of how much time you should allocate to online maintenance.
To address this in SP1, we added logic that allows you to determine how often you need to complete OLD. SP1 now includes new performance counters you can collect to determine this:
- MSExchange Database -> Online Defrag Pages Freed/sec
- MSExchange Database -> Online Defrag Pages Read/sec
Event Type: Information
Event Source: ESE
Event Category: Online Defragmentation
Event ID: 703
Description: MSExchangeIS (19052) SG06: Online defragmentation has completed the resumed pass on database 'e:\MDB06\priv06.edb', freeing 42794 pages. This pass started on 6/16/2007 and ran for a total of 124919 seconds, requiring 7 invocations over 4 days. Since the database was created it has been fully defragmented 14 times over 73 days.
The revised event now provides the following new information:
- When the OLD pass started.
- How many passes it took to complete.
- How many times the database has actually had OLD complete since database creation.
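As a rough illustration of how the new Event 703 fields can be turned into a defragmentation cadence, here is a minimal Python sketch that parses the sample description above. The regular expression and the function name `summarize_old_event` are assumptions made for this example; they are not part of Exchange, and the pattern only fits the wording of the sample event shown here.

```python
import re

# Sample Event 703 description, copied from the event shown above.
EVENT_703 = (
    "Online defragmentation has completed the resumed pass on database "
    "'e:\\MDB06\\priv06.edb', freeing 42794 pages. This pass started on "
    "6/16/2007 and ran for a total of 124919 seconds, requiring 7 invocations "
    "over 4 days. Since the database was created it has been fully "
    "defragmented 14 times over 73 days."
)

# Hypothetical pattern written for this example; it assumes the English
# wording of the sample event.
PATTERN = re.compile(
    r"freeing (?P<pages_freed>\d+) pages.*?"
    r"total of (?P<seconds>\d+) seconds, requiring (?P<invocations>\d+) "
    r"invocations over (?P<pass_days>\d+) days.*?"
    r"defragmented (?P<passes>\d+) times over (?P<lifetime_days>\d+) days",
    re.DOTALL,
)


def summarize_old_event(description):
    """Extract the numeric fields and derive two simple averages."""
    match = PATTERN.search(description)
    if match is None:
        raise ValueError("description does not match the expected Event 703 text")
    fields = {name: int(value) for name, value in match.groupdict().items()}
    # Average number of days between completed OLD passes over the database lifetime.
    fields["days_per_completed_pass"] = fields["lifetime_days"] / fields["passes"]
    # Average running time of each invocation within this pass, in hours.
    fields["hours_per_invocation"] = fields["seconds"] / fields["invocations"] / 3600
    return fields


summary = summarize_old_event(EVENT_703)
print(summary)
```

For the sample event, 14 completed passes over 73 days works out to a completed pass roughly every five days, which is the kind of figure you can compare against the weekly or biweekly guidance.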
- You can utilize the streaming backup API to back up the active copy.
- You can utilize the Volume Shadow Copy Service (VSS) to back up the active copy.
- You can utilize VSS to back up the passive copy.
- Perform a handoff (CCR) or activate the passive copy (LCR) and then perform a backup. While this sounds easy, it is operationally a mess because you have to constantly perform handoffs / activations in order to check each copy.
- Utilize an Exchange-aware VSS requestor. We'll discuss the ramifications of this in a later section.
- Take a snapshot of the passive copy.
- Suspend continuous replication for all storage groups hosted on the volume containing the databases to be checksummed.
- Use vssadmin.exe (which is included in Windows Server 2003) to create a shadow copy of the volume containing the databases to be checksummed. e.g. "vssadmin create shadow /for=<volume>"
- Resume continuous replication for all storage groups hosted on the volume.
- Run eseutil /k /p against the database(s) on the shadow copy of the volume. e.g. "eseutil /k /p20 <Path for VSS Shadow Copy of Database>"
- After verification has completed successfully, delete the volume shadow copy. e.g. "vssadmin delete shadows /for=<volume>"
- Perform a handoff/activation and back it up.
- Perform a streaming copy backup.
- Perform an offline backup and utilize ESEUTIL.
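The snapshot-based verification steps listed above (suspend replication, create the shadow copy, resume replication, checksum, delete the shadow copy) can be collected into an ordered plan. The sketch below only builds the command strings quoted in the list; it does not run anything. The function name, its parameters, and the example path are hypothetical, and the suspend and resume steps are left without a command because the list does not give one.

```python
def snapshot_checksum_plan(volume, shadow_db_path, p_value=20):
    """Return the steps as (description, command) pairs, in order.

    command is None where the list above names no command line.
    p_value is passed to the eseutil /p switch, as in the /p20 example.
    """
    return [
        ("Suspend continuous replication for storage groups on the volume", None),
        ("Create a shadow copy of the volume",
         f"vssadmin create shadow /for={volume}"),
        ("Resume continuous replication for storage groups on the volume", None),
        ("Checksum the database on the shadow copy",
         f'eseutil /k /p{p_value} "{shadow_db_path}"'),
        # vssadmin names this subcommand "delete shadows" (plural).
        ("Delete the shadow copy after verification succeeds",
         f"vssadmin delete shadows /for={volume}"),
    ]


# Hypothetical volume and shadow-copy path, for illustration only.
plan = snapshot_checksum_plan("E:", r"X:\shadow\priv06.edb")
for number, (description, command) in enumerate(plan, start=1):
    print(f"{number}. {description}: {command or '(manual step)'}")
```

Keeping the steps as data rather than executing them makes it easy to review the order before running anything against a production volume.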
| DBs Checksummed in Parallel | DB Pages Read/s | DB Read MB/s | DB Read Latency (ms) | % Processor |
|---|---|---|---|---|
| 1 | 30000 | 250 | 3.5 | 7 |
| 2 | 8200 | 67 | 7.2 | 2.5 |
Event Type: Information
Event Source: ESE
Event Category: Online Defragmentation
Event ID: 721
Description: MSExchangeIS (6584) Third Storage Group: Online Maintenance Database Checksumming background task has completed for database 'J:\sg3\priv3.edb'. This pass started on 6/19/2007 and ran for a total of 208 seconds, requiring 2 invocations over 1 days. Operation summary: 5850768 pages seen, 0 bad checksums, 72682 uninitialized pages.
In addition, we have added two new performance counters as well (which is how we obtained the data for the previous table):
- MSExchange Database -> Online Maintenance (DB Scan) Pages Read
- MSExchange Database -> Online Maintenance (DB Scan) Pages Read/sec
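As a quick sanity check, the figures in the sample Event 721 can be converted into a scan rate and compared with the counters. The sketch below assumes the 8 KB database page size used by Exchange 2007 and treats 1 MB as 1024 x 1024 bytes; the function name `scan_throughput` is made up for this example.

```python
PAGE_SIZE_BYTES = 8 * 1024  # assumption: 8 KB ESE pages in Exchange 2007


def scan_throughput(pages_seen, seconds, page_size=PAGE_SIZE_BYTES):
    """Return (pages per second, MB per second, scanned size in GB)."""
    pages_per_sec = pages_seen / seconds
    mb_per_sec = pages_per_sec * page_size / (1024 * 1024)
    size_gb = pages_seen * page_size / (1024 ** 3)
    return pages_per_sec, mb_per_sec, size_gb


# Values taken from the sample Event 721 above: 5850768 pages in 208 seconds.
pages_per_sec, mb_per_sec, size_gb = scan_throughput(5850768, 208)
print(f"{pages_per_sec:.0f} pages/s, {mb_per_sec:.0f} MB/s, {size_gb:.1f} GB scanned")
```

Under those assumptions the sample pass works out to roughly 28,000 pages per second, or about 220 MB/s, over a database of roughly 45 GB, which is in the same range as the single-database row in the table.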
Registry Hive: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\MSExchangeIS\ParametersSystem
DWORD Key: Online Maintenance Checksum
DWORD Value: 1 (enabled), 0 (disabled)
DWORD Key: Throttle Checksum
DWORD Value: <number of milliseconds to sleep between sequential read batches>
For more information, please see: http://technet.microsoft.com/en-us/library/bb676537.aspx
Online Maintenance Checksum vs ESEUTIL
There is a difference between the checksum process implemented in the Online Checksum task (streaming backup API or the online maintenance task) and the process used by ESEUTIL /K. Fundamentally, they both do the same thing in terms of how they check the pages. But there is a difference in their performance characteristics. The Online Checksum task uses a method known as JetDatabaseScan(). JetDatabaseScan() has a loop that issues a pre-read for 320KB of pages, scans the pages, and optionally sleeps. ESEUTIL /K, on the other hand, issues 1024 64KB read I/Os, and when a read completes, the buffer is checksummed and another read is issued. Like the Online Checksum task, ESEUTIL can also optionally sleep for a configurable amount of time after issuing a certain number of reads. But the net difference here is that the Online Checksum task performs very well, and it is kinder on the disk subsystem than ESEUTIL.
Page Zeroing
Page Zeroing is a security option that allows empty pages to be overwritten using a pattern based on where the page sits within the B+ tree, so that deleted data cannot be recovered. With Exchange 2007 RTM and all previous versions, page zeroing operations happened during the streaming backup process. In addition, because they occurred during the streaming backup process, they were not logged operations (that is, page zeroing did not result in the generation of log files). This poses a problem in Exchange 2007 when you are using continuous replication, or when you are performing VSS backups.
In the continuous replication scenario, the passive copy would never get its empty pages zeroed, and the active copy would only have its pages zeroed if you performed a streaming backup. In the VSS backup scenario, the database (active or passive copy) would never get its pages zeroed.
To address these scenarios in SP1, we have introduced a new online maintenance task, Zero Database Pages During Checksum. It is an optional task that is disabled by default, as it could affect server performance. This is a logged operation that will get replicated to the passive copy, thereby ensuring that both database copies are updated.
Online Page Zeroing, like the Online Maintenance Checksum task, performs large sequential reads (320KB), but is different from the Online Maintenance Checksum process in that it also generates random database writes (160KB). Fortunately, even if you enable both tasks, there is only a single database scan task in which both page zeroing and Online Maintenance Checksumming are done when either one is enabled (and incidentally, you have to enable the Online Maintenance Checksum task in order to enable Online Page Zeroing). This one task will retrieve the page from disk and perform both operations.
As with Online Maintenance Checksumming, Online Page Zeroing performs better when you stagger online maintenance so that only one database is checked per LUN. However, there are a few things to keep in mind.
- When you initially enable Online Page Zeroing, the scan can place tremendous pressure on the database cache. To ensure this does not affect your server's performance, we recommend that you either implement the Throttle Checksum registry entry (mentioned in the previous section) or stagger your online maintenance window. Once the initial pass is completed, subsequent passes are much less intensive and will not impact the database cache significantly. Therefore, as a best practice, if you require page zeroing, consider enabling page zeroing on the database at creation time so that you will never have this first pass performance spike.
- Online Page Zeroing is very similar to a streaming backup (with page zeroing enabled). It reads from and writes to the database. Reads are sequential, but the writes are random. In addition, there is a slight processor use increase as a result, as well as about a 20% increase in RPC average latency while the database scan is occurring. As always, the best practice is to not execute online maintenance during the peak user activity window, and this is still true for the Online Page Zeroing task.
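To make the shape of the combined scan more concrete, here is a toy Python sketch of a single pass that reads pages in 320KB batches, verifies a checksum, optionally zeroes empty pages, and sleeps between batches. It is not how ESE is implemented: `zlib.crc32` stands in for the ESE page checksum, all-zero bytes stand in for the position-based zeroing pattern, the 8 KB page size is assumed, and every name in the block is hypothetical.

```python
import time
import zlib

PAGE_SIZE = 8 * 1024                        # assumption: 8 KB pages
BATCH_BYTES = 320 * 1024                    # pre-read size mentioned in the article
PAGES_PER_BATCH = BATCH_BYTES // PAGE_SIZE  # 40 pages per batch


def scan_database(pages, stored_checksums, is_empty, zero_pages=False, throttle_ms=0):
    """Toy single-pass scan: checksum every page and optionally zero empty ones."""
    stats = {"pages_seen": 0, "bad_checksums": 0, "pages_zeroed": 0}
    for start in range(0, len(pages), PAGES_PER_BATCH):
        end = min(start + PAGES_PER_BATCH, len(pages))
        for i in range(start, end):  # one sequential batch of pages
            stats["pages_seen"] += 1
            if zlib.crc32(pages[i]) != stored_checksums[i]:
                stats["bad_checksums"] += 1
            elif zero_pages and is_empty[i] and any(pages[i]):
                # Stand-in for zeroing: overwrite and refresh the stored checksum.
                pages[i][:] = bytes(PAGE_SIZE)
                stored_checksums[i] = zlib.crc32(pages[i])
                stats["pages_zeroed"] += 1
        if throttle_ms:
            # Plays the role of the Throttle Checksum sleep between read batches.
            time.sleep(throttle_ms / 1000)
    return stats


# Build a small fake database: 100 pages, every tenth page marked empty.
pages = [bytearray(b"\x01" * PAGE_SIZE) for _ in range(100)]
checksums = [zlib.crc32(page) for page in pages]
empty = [i % 10 == 0 for i in range(100)]
pages[5][0] = 0xFF  # simulate corruption on one page after its checksum was stored

result = scan_database(pages, checksums, empty, zero_pages=True)
print(result)
```

The point of the sketch is only that one pass over the pages can serve both tasks, and that a sleep between batches trades scan speed for less pressure on the disks and cache.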
| DBs Page Zeroed in Parallel | DB Pages Zeroed/s | DB Read MB/s | DB Write MB/s | DB Read Latency (ms) | % Processor |
|---|---|---|---|---|---|
| 1 | 8100 | 68 | 66 | 3.4 | 7.5 |
| 2 | 6800 | 65 | 50 | 7.2 | 2.5 |
Event Type: Information
Event Source: ESE
Event Category: Online Defragmentation
Event ID: 722
Description: MSExchangeIS (6544) Third Storage Group: Online Maintenance Database Zeroing background task has completed for database 'J:\sg3\priv3.edb'. This pass started on 6/20/2007 and ran for a total of 369 seconds, requiring 1 invocations over 1 days. Operation summary: 5850768 pages seen, 0 bad checksums, 72681 uninitialized pages, 4379723 pages unchanged since last zero, 33759 unused pages zeroed, 1210764 used pages seen, 57214 deleted records zeroed, 0 unreferenced data chunks zeroed.
In addition, we have added two new performance counters as well (which is how we obtained the data for the previous table):
- MSExchange Database -> Online Maintenance (DB Scan) Pages Zeroed
- MSExchange Database -> Online Maintenance (DB Scan) Pages Zeroed/sec
Location: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\MSExchangeIS\ParametersSystem
DWORD Key: Online Maintenance Checksum
DWORD Value: 1 (enabled), 0 (disabled)
DWORD Key: Zero Database Pages During Checksum
DWORD Value: 1 (enabled), 0 (disabled)
DWORD Key: Throttle Checksum
DWORD Value: <number of milliseconds to sleep between sequential read batches>
For more information, please see: http://technet.microsoft.com/en-us/library/bb676537.aspx
Conclusion
To summarize, SP1 enhances Microsoft Exchange manageability by allowing you to ensure that all databases are healthy, to measure when you should perform online defragmentation, and to ensure that page zeroing activity is replicated to each copy of every database.
- Ross Smith IV
Updated Jul 01, 2019
Version 2.0
The_Exchange_Team
Exchange Team Blog
You Had Me at EHLO.