EDIT 8/6/2008: Please see this post for additional information related to Hub throughput and different message sizes.
Large message size: effect of transport database cache size on throughput.
Recently one of our support engineers came to us requesting performance data for a client deploying Exchange 2007 SP1 (E2K7 from now on).
The client wanted to know what level of steady state throughput was achievable by a Hub Transport server receiving 4 widely different average message sizes: 25KB, 1MB, 5MB, and 10MB.
We had some of the data but needed to complete the table, so we employed the test bed used to measure transport performance for E2K7 and E2K7 SP1.
Hub Hardware: 2 processors x 2 cores, 2.2 GHz, 800 MHz FSB, 1MB L2 cache per core, 4 GB RAM, 400 MHz memory, Ultra 3 SCSI disk controller ("entry level") with 128 MB Write-Back Cache, 3 x Ultra320 Universal SCSI 15K RPM disks.
Optimized E2K7 transport database queue configuration (queue database and its transaction logs on separate volumes; see the EdgeTransport.exe.config fragment later in this post).
Transport dumpster is not being used in this environment: 1 Hub and 1 Mailbox without replication.
Mailbox Hardware: A "good" Mailbox server with enough CPU cycles and storage bandwidth to accept message delivery without slowing down the Hub.
Gigabit network.
The battery of tests was based on the benchmarking automation we used during Exchange 2007 development, changing the "message mix" for each test to inject a different average message size (25KB, 1MB...).
The benchmarking infrastructure is designed to inject messages into transport through an SMTP receive connector, at constant speed, seeking a steady state throughput, while monitoring baseline performance counters.
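The harness itself isn't something we can share, but the idea is straightforward. Here is a minimal, hypothetical Python sketch of constant-rate SMTP injection against a receive connector; the host name, addresses, rate, message size, and duration are placeholders, not the settings used in these tests.

```python
import smtplib
import time
from email.message import EmailMessage

# Hypothetical lab values -- not the actual test-bed configuration.
HUB_HOST = "hub01.contoso.local"   # Hub Transport server with an open SMTP receive connector
RATE_PER_SEC = 2                   # target injection rate (messages per second)
MESSAGE_BYTES = 5 * 1024 * 1024    # payload size, e.g. roughly 5MB per message
DURATION_SEC = 20 * 60             # length of the run

def build_message(size_bytes: int) -> EmailMessage:
    """Build a message whose body is roughly size_bytes of filler."""
    msg = EmailMessage()
    msg["From"] = "loadgen@contoso.local"
    msg["To"] = "user01@contoso.local"
    msg["Subject"] = "transport load test"
    msg.set_content("X" * size_bytes)
    return msg

def inject_at_constant_rate() -> None:
    sent = 0
    interval = 1.0 / RATE_PER_SEC
    deadline = time.time() + DURATION_SEC
    with smtplib.SMTP(HUB_HOST, 25) as smtp:
        while time.time() < deadline:
            start = time.time()
            smtp.send_message(build_message(MESSAGE_BYTES))
            sent += 1
            # Sleep off the remainder of the interval to hold the rate steady.
            time.sleep(max(0.0, interval - (time.time() - start)))
    print(f"Injected {sent} messages in {DURATION_SEC} seconds "
          f"({sent / DURATION_SEC:.2f} msg/sec)")

if __name__ == "__main__":
    inject_at_constant_rate()
```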
Ideally the test stabilizes after a few minutes of "warm up" flow, once the DB cache reaches a stable size (128MB if using the default DatabaseMaxCacheSize setting). Steady state is confirmed by watching the throughput and queue counters listed at the end of this post.
Yes, I said "ideally," but sometimes the test doesn't stabilize: throughput oscillates, frequently dropping to 0, or a queue builds up (Remote Delivery, Mailbox Delivery, or Submission queues).
Then you have to work a bit to understand why. Start the investigation by looking at the server event logs.
One possibility is heavy resource pressure, in which case transport applies back pressure to the system, indicated by Event Log ID 15004; the event details tell you which resource is under strain.
You can see an example of this in the 3rd test of the suite shown below, where we then had to diagnose why the server went into back pressure. At the end of the post you'll find some more pointers on what to look for when analyzing performance bottlenecks.
| | Test 1 | Test 2 | Test 3 | Test 4 | Test 5 |
|---|---|---|---|---|---|
| Cache Size | 128MB DB Cache | 128MB DB Cache | 128MB DB Cache | 512MB DB Cache | 512MB DB Cache |
| Limiting Resource | CPU Bound | IO Bound | Configuration Bound* | IO Bound | IO Bound |
| Message Size | 25KB | 1MB | 5MB | 5MB | 10MB |
| SMTP Receive Throughput (msg/sec) | 159.32 | 14.05 | 0.40 | 2.03 | 1.34 |
| Aggregate Queue Length (MAX) | 329 | 63 | 29 | 27 | 2 |
| Queue Size in MB (MAX) | 8.65 | 64.51 | 148.48 | 138.24 | 20.48 |
| % CPU | 69.86 | 56.03 | 15.37 | 48.00 | 40.03 |
| Msg Cost (MCyc/msg) | 38.68 | 351.07 | 3131.76 | 2068.69 | 2591.67 |
| Msg Cost (Cyc/byte of msg) | 1470.72 | 342.84 | 611.67 | 404.04 | 253.09 |
| Disk Writes/sec (log) | 92.80 | 185.00 | | 133.00 | 181.00 |
| Disk Writes/sec (queue) | 35.30 | 729.00 | | 876.00 | 622.00 |
| Disk Write KB/sec (log) | 9,876 | 32,800 | | 23,279 | 30,796 |
| Disk Write KB/sec (queue) | 1,086 | 23,900 | | 19,788 | 23,556 |
| Disk Writes/msg (log) | 0.58 | 13.17 | | 64.53 | 138.17 |
| Disk Writes/msg (queue) | 0.22 | 51.89 | | 425.04 | 474.81 |
| Disk Write KB/msg (log) | 61.99 | 2,335 | | 11,295 | 23,508 |
| Disk Write KB/msg (queue) | 6.82 | 1,701 | | 9,601 | 17,982 |
| Disk Reads/sec (log) | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| Disk Reads/sec (queue) | 0.00 | 0.00 | 567 | 0.00 | 0.00 |
*Back pressure, High Version Buckets: Event Log ID 15004
In the 3rd test, with the transport service rapidly transitioning into and out of back pressure, the disk counters show a heavily serrated pattern, so perfmon does not compute meaningful averages; those values were left out of the table above (the blank disk-write cells for Test 3).
Nevertheless, throughput on that test is computed as (Total Messages Received)/(Test Duration), so it is accurate. See below for summary data comparing the two 5MB runs.
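As a quick sanity check, here is that ratio computed from the raw numbers of the 128MB-cache run (475 messages over a 20-minute run, taken from the summary table further down):

```python
# Throughput = Total Messages Received / Test Duration
messages_received = 475      # 128MB cache, 5MB messages (from the summary table below)
duration_sec = 20 * 60       # 20-minute run

print(messages_received / duration_sec)   # ~0.40 msg/sec, matching the table
```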
After testing the first 2 message sizes (25KB and 1MB), we couldn't reach steady state throughput on the 3rd and 4th message sizes (5MB and 10MB) with default server settings.
Attempting to inject a steady flow of the large messages (5MB) triggered back pressure, with the well-known Event Log ID 15004 reporting that version buckets were above the high watermark.
The first suspect to examine when version buckets are high is disk I/O performance. We quickly discovered that the flow of large messages builds up a large amount of queued data. In this case the queue "only" contained 29 messages, but at this message size that translates to 149MB sitting in the queue, overflowing the default database cache size of 128MB.
In the table above, notice that the queue size (in MB) never approached the DB cache size in the previous tests. Looking at the disk counters, we found that overflowing the cache triggered a large number of disk reads, which don't appear in the regular steady state tests.
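The arithmetic behind that observation is simple; here is a quick sketch of the check, using the message count and nominal size from the table above:

```python
# Queued data vs. transport DB cache: once the queued bytes exceed the cache,
# ESE has to start reading pages back from disk.
queued_messages = 29
message_size_mb = 5
default_cache_mb = 128

queued_mb = queued_messages * message_size_mb   # 145MB nominal (the table shows a 148.48MB peak,
                                                # since the average message is a bit over 5MB)
print(queued_mb, queued_mb > default_cache_mb)  # 145 True -> cache overflow
```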
To avoid overflowing the cache and triggering back pressure, we decided to experiment with increasing the transport DB cache size. Initially we tested with a 1GB cache, but found that 512MB (up from the default 128MB) was enough to eliminate the overhead of additional disk reads associated with the flow of very large messages.
Here is a fragment from the EdgeTransport.exe.config file that shows the changes made:
<configuration>
<runtime>
<gcServer enabled="true" />
</runtime>
<appSettings>
<!-- Optimized Transport DB storage -->
<add key="QueueDatabasePath" value="e:\data\"/>
<add key="QueueDatabaseLoggingPath" value="c:\logfiles\"/>
....
<!-- For the very large message test, the default 128MB DB cache is commented out -->
<!-- <add key="DatabaseMaxCacheSize" value="134217728" /> -->
<!-- Using a 512MB DB cache: -->
<add key="DatabaseMaxCacheSize" value="536870912" />
...
</appSettings>
</configuration>
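As a side note, DatabaseMaxCacheSize is specified in bytes, so the two values above are just 128MB and 512MB written out:

```python
# DatabaseMaxCacheSize is specified in bytes
print(128 * 1024 * 1024)   # 134217728 -> the default 128MB cache
print(512 * 1024 * 1024)   # 536870912 -> the 512MB cache used for the very large message runs
```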
Additionally, here are a few more interesting statistics for the test that triggered back pressure: the server spends 31% of the time not receiving messages, and throughput during the non-back-pressure windows is only 0.57 msg/sec, compared to a steady 2.03 msg/sec when back pressure is avoided by using a bigger DB cache.
5MB message size stats: Back pressure vs. Steady state

| Database Cache Size (MB) | 128 | 512 |
|---|---|---|
| Duration (min) | 20 | 20 |
| Total Messages Received | 475 | 2436 |
| # of Transitions into Back Pressure | 41 | 0 |
| Total Minutes in Back Pressure | 6.17 | 0.00 |
| % of Time in Back Pressure | 31% | 0% |
| Max Back Pressure Window (sec) | 65 | 0 |
| Average Throughput (msg/sec) | 0.40 | 2.03 |
| Throughput for the Non-Back-Pressure Intervals (msg/sec) | 0.57 | 2.03 |
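The percentage and the non-back-pressure throughput follow directly from the raw numbers; here is the arithmetic for the 128MB-cache run reproduced in a couple of lines:

```python
# 128MB-cache run, 5MB messages (values from the table above)
messages_received = 475
duration_min = 20.0
back_pressure_min = 6.17

print(back_pressure_min / duration_min)                                # ~0.31 -> 31% of time in back pressure
print(messages_received / ((duration_min - back_pressure_min) * 60))   # ~0.57 msg/sec outside back pressure
```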
Bill Thompson, from the Exchange Center of Excellence, has the official guidance on which DatabaseMaxCacheSize settings to use in his blog post New maximum database cache size guidance for Exchange 2007 Hub Transport Server role.
A disclaimer: storage is key for transport performance, and all of the above data applies only to a Hub server with at least an "entry level" SCSI controller with 128 MB of BBWC (battery-backed write-back cache), which optimizes the IO pattern transport produces during steady state flow: continuous writes with very few or no reads.
Some useful counters when doing E2K7 transport benchmarking (a sample collection sketch follows the list):
1. Throughput counters
MSExchangeTransport SmtpReceive(_total)\Average bytes/message
MSExchangeTransport SmtpReceive(_total)\Messages Received/sec
MSExchangeTransport SmtpSend(_total)\Messages Sent/sec
MSExchange Store Driver(_total)\Inbound: MessageDeliveryAttemptsPerSecond
MSExchange Store Driver(_total)\Inbound: Recipients Delivered Per Second
MSExchangeTransport Queues(_total)\Messages Queued for Delivery Per Second
MSExchangeTransport Queues(_total)\Messages Completed Delivery Per Second
2. Queue counters, others
MSExchangeTransport Queues(_total)\Aggregate Delivery Queue Length (All Queues)
MSExchangeTransport Queues(_total)\Active Remote Delivery Queue Length
MSExchangeTransport Queues(_total)\Active Mailbox Delivery Queue Length
MSExchangeTransport Queues(_total)\Submission Queue Length
MSExchangeTransport DSN(_total)\Failure DSNs Total
MSExchangeTransport Dumpster\Dumpster Size
MSExchange Database(edgetransport)\Database Cache Size (MB)
MSExchange Database(edgetransport)\Version buckets allocated
3. Accessory counters to diagnose whether the server is CPU, disk, or network bound; see Bottleneck-Detection Counters
PhysicalDisk(_Total)\Current Disk Queue Length
PhysicalDisk(_Total)\Disk Writes/sec
PhysicalDisk(_Total)\Disk Reads/sec
PhysicalDisk(_Total)\Avg. Disk sec/Write
PhysicalDisk(_Total)\Avg. Disk sec/Read
Processor(_Total)\% Processor Time
Process(Edgetransport)\% Processor Time
Process(Edgetransport)\Private Bytes
Memory\Available MBytes
Network Interface\ Bytes Total/sec
.....
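There are many ways to capture these; as one hypothetical example, the sketch below writes a handful of the counter paths to a file and hands them to the built-in typeperf tool (logman or the Perfmon UI work just as well). The file names and sample interval are arbitrary choices, not part of our test procedure.

```python
import subprocess

# A few of the counters listed above, in typeperf path format; extend the list as needed.
COUNTERS = [
    r"\MSExchangeTransport SmtpReceive(_total)\Messages Received/sec",
    r"\MSExchangeTransport Queues(_total)\Aggregate Delivery Queue Length (All Queues)",
    r"\MSExchange Database(edgetransport)\Database Cache Size (MB)",
    r"\MSExchange Database(edgetransport)\Version buckets allocated",
    r"\PhysicalDisk(_Total)\Disk Writes/sec",
    r"\PhysicalDisk(_Total)\Disk Reads/sec",
    r"\Processor(_Total)\% Processor Time",
]

def collect(sample_interval_sec: int = 5, output_csv: str = "hub_perf.csv") -> None:
    """Sample the counters with typeperf until interrupted (Ctrl+C)."""
    with open("counters.txt", "w") as f:
        f.write("\n".join(COUNTERS))
    # typeperf options: -cf counter file, -si sample interval (sec), -o output file
    subprocess.run(["typeperf", "-cf", "counters.txt",
                    "-si", str(sample_interval_sec), "-o", output_csv])

if __name__ == "__main__":
    collect()
```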
If you're wondering how the results differ for other average sizes, we'll be posting more data on some other sizes (40KB, 70KB) later, so stay tuned.
We are currently testing servers with different storage: SATA disk, 7200 RPM, without the advantage of BBWC. More data on this scenario will be coming in a future blog post.