
Exchange Team Blog

Some more thoughts on disk IO and calculations….

Nov 03, 2004

Thanks to everyone who's posted questions and comments on my previous blog post! Since I've had so many questions on calculating IO, I thought I'd go into it in a bit more detail.
 
First, there are many ways to calculate the disk IO per user.  
 
Measuring IO:  What to measure:
In my blog, I recommended that you measure PhysicalDisk\Disk Transfers/sec. Depending on your hardware configuration, you may want to use LogicalDisk\Disk Transfers/sec instead; if the LogicalDisk counters aren't enabled, you can fall back to the PhysicalDisk counters. The important thing, of course, is that you are measuring the number of IO reads and writes per second to the database drives.
 
I also recommend you measure IOs per second on the SMTP queue drives, log drives, and temp drive, because any of these drives may be a bottleneck. Most of the literature focuses on the database drives, because these tend to have the greatest rate of IO, but it’s important to remember the little guys too and make sure all the drives used by the Exchange server have enough throughput capacity to meet your company’s needs.
 
Measuring IO:  When to measure:
The question of when to measure keeps coming up. The important thing here is that the disks can support the "maximum" sustained peak load. What do I mean by this? Practically speaking, sustained peak load is the load generated during the busiest time of the day, on the busiest day of the week. As I mentioned in the last blog, this is 9am to 11am on a Monday. When I monitor the servers, I calculate the average IOs per second during that busy 2-hour window. I am not interested in the maximum value that occurs during that time, because I expect to see peaks and dips in the IO rate - as long as these peaks are short in duration, it is relatively safe to ignore them.
 
So, if you have disk data from a measurement spanning many hours, I recommend looking at it in Perfmon. Perfmon allows you to select a time window and find the average between two points. Find the largest sustained peak, and use the average IO rate from that peak in the calculations.
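If you've exported the counter samples, you can also hunt for the sustained peak programmatically. Here is a minimal Python sketch, assuming you have a plain list of Disk Transfers/sec samples taken at a fixed interval; the function name and the numbers are illustrative, not an official tool:

# Minimal sketch: find the busiest sustained window in a series of
# exported Perfmon samples (one Disk Transfers/sec value per interval).
def sustained_peak_iops(samples, window):
    """Return the highest average IOPS over any contiguous run of `window` samples."""
    if len(samples) <= window:
        return sum(samples) / len(samples)
    window_sum = sum(samples[:window])
    best = window_sum
    for i in range(window, len(samples)):
        window_sum += samples[i] - samples[i - window]
        best = max(best, window_sum)
    return best / window

# Example: with 15-second samples, a 2-hour window is 480 samples.
# peak_iops = sustained_peak_iops(samples, window=480)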
 
Measuring IO: How many users is that again?
Determining how many users are active can be a tricky problem. It helps if you know a bit about the company’s email usage. For example, if you are lazy, and know that most of the users use mail, then you can use the total number of mailboxes for the number of users. Using the total number of mailboxes is what the Exchange 2003 performance and scalability guide recommends:
 
IOPS per mailbox = (average Disk Transfers/sec) ÷ (number of mailboxes).
   
This calculation works well in most cases. For example, people at Microsoft use mail a lot, and during peak hours about 80% of the users are active, so this is a reasonable estimate for a company whose users have a passion for email. On the other hand, some companies don't use email as frequently, or have users who work in shifts, so the number of concurrently active users might be much smaller. In those cases, you may choose to use MSExchangeIS\Active User Count, or count the number of unique logons in ESM during peak hours.
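To make the arithmetic concrete, here is a small sketch with made-up numbers - divide by total mailboxes as the scalability guide suggests, or by concurrently active users if that better reflects your usage:

# Sketch of the per-user arithmetic; the numbers are purely illustrative.
avg_disk_transfers_per_sec = 1100   # measured during the sustained peak
mailboxes = 2000                    # total mailboxes on the server

iops_per_mailbox = avg_disk_transfers_per_sec / mailboxes
print(round(iops_per_mailbox, 2))   # 0.55 IOPS per mailbox

# Alternative: divide by concurrently active users instead
# (for example, MSExchangeIS\Active User Count at the peak).
active_users = 1600                 # e.g. roughly 80% of mailboxes active
iops_per_active_user = avg_disk_transfers_per_sec / active_users
print(round(iops_per_active_user, 2))   # 0.69 IOPS per active user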
 
In the last blog, I recommended that you use the MSExchangeIS\Active User Count counter. This is generally a good counter to use, but it does have a few gotchas I should mention. This counter gives the number of unique users that have logged on to the server and been active in the last 10 minutes. However, this number can be larger than the number of mailboxes on the server, primarily for two reasons. First, if the server is a public folder server, the count includes users who are logged on to public folders, and those can be users homed on other servers. Second, it includes users who are logged on to other users' mailboxes (such as when checking calendar details, if the users have shared their calendars). This second reason usually has negligible impact - in most cases, it doesn't account for many logons.
 
As an alternative, if you want to be more accurate about the number of users accessing a particular mdb, you can look in Exchange System Manager and count the number of unique logons for each mdb (drill down to Administrative Groups\<administrative group name>\Servers\<server name>\<storage group name>\<mailbox store name>\Logons). Make sure you don't count the same user twice.
 
When it comes down to how many disks you need, you only need to know the expected maximum disk throughput, which you can measure without knowing how many users are on the server. However, if you are planning to build a new server and want to estimate how many IOPS to plan for, then it's useful to know the IOs per user. As long as you are consistent in how you measure users, and know how many users you will have on the new server, you should be able to estimate how much IO you will need - which is why we care about this in the first place. It might look like I've danced around the question "what is the best way to measure the number of users?" I suppose that's because the answer depends on the situation. For simplicity, the method recommended in the Exchange 2003 performance and scalability guide is probably the best. I think MSExchangeIS\Active User Count is also good, especially if you have dedicated mailbox servers (with the public folders on another server), and it's easy to measure quickly. And finally, if you are really detail-oriented, you can measure unique logons via ESM.
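If it helps, here is the same arithmetic run in the other direction when sizing a new server - a rough sketch with illustrative numbers only:

# Sketch: turning the per-user figure into a sizing estimate for a new server.
iops_per_user = 0.55        # measured on an existing server, as above
planned_users = 3000        # users expected on the new server

required_peak_iops = iops_per_user * planned_users
print(required_peak_iops)   # 1650 IOPS the disk subsystem must sustain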
 
IO:  Where did the estimated maximum throughput numbers come from?

A couple people asked about the calculations, so I’ll put my little formula here:
 
Estimated maximum throughput =    D * F * T
 
D = disk speed. The maximum rate of IOPS measured by the disk manufacturer.
F = fudge factor. I used 0.8 (80% usage) to build in some overhead. This is necessary to plan for enough IO to handle occasional extremely high loads.
T = raid factor. This depends on the type of raid and the read/write ratio.
 
For no raid or raid 0, the raid factor is
 
T  = 1                                                               (no raid or raid 0)
 
For raid 10, the raid factor is:
 
T =   (R + W)/(R + 2W)                                   (Raid 10)
 
This ratio comes about because in RAID 10 every logical write results in 2 disk IOs - one to each disk in the mirrored pair. Thus, the throughput is reduced by the ratio (R + W)/(R + 2W).
 
For raid 5, the raid factor is:
 
T = (R + W)/(R + 4W)
 
This ratio comes about because there are 4 disk IOs for every write in a Raid 5 configuration: read the existing data block, read the existing parity block, write the new data, and write the new parity.
 
R and W are the number of reads and writes per second to the drives. You can calculate your own raid factor by measuring the reads and writes with the LogicalDisk\Disk Reads/sec and LogicalDisk\Disk Writes/sec counters. For example, if the database drives see 700 reads per second and 400 writes per second, then the raid factor for Raid 5 would be
 
T = (700 + 400)/(700 + 4*400) = 1100/2300 ≈ 0.48
 
Now I’ll walk through one calculation from my table. Let’s calculate the estimated maximum throughput per disk in a Raid 5 configuration for a R:W ratio of 3:1. Let’s use disks with a maximum raw throughput of 180 IOs per second, so D is 180. 
 
For a R:W ratio of 3:1, Raid 5, the raid factor is
 
T = (3 + 1)/(3+4*1) = 4/7 = 0.57
 
Thus,
 
Estimated maximum throughput =    D * F * T = 180 * 0.8 * 0.57 = 82.
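If you'd like to play with the numbers yourself, here is a short Python sketch that ties the pieces together. The function names are mine, and the 0.8 fudge factor simply mirrors the formula above:

# Sketch of the D * F * T calculation. The write penalty is 1 for
# no RAID / RAID 0, 2 for RAID 10, and 4 for RAID 5.
def raid_factor(reads_per_sec, writes_per_sec, write_penalty):
    """T = (R + W) / (R + write_penalty * W)."""
    r, w = reads_per_sec, writes_per_sec
    return (r + w) / (r + write_penalty * w)

def max_throughput_per_disk(disk_iops, t_factor, fudge=0.8):
    """Estimated maximum usable IOPS per spindle: D * F * T."""
    return disk_iops * fudge * t_factor

# The Raid 5 example above: 700 reads/sec and 400 writes/sec.
print(round(raid_factor(700, 400, write_penalty=4), 2))   # 0.48

# The table example: 180 IOPS disks, Raid 5, 3:1 read/write ratio.
t = raid_factor(3, 1, write_penalty=4)                    # ~0.57
print(round(max_throughput_per_disk(180, t)))             # 82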
 
Phew! Thanks for sticking with me this long - hopefully I haven't introduced too many new questions in your minds. But if I have, feel free to post more questions.
 
- Nicole Allen

  • How do a SAN and its write-back cache affect these numbers? Do I still need to allocate spindles according to I/O, or does the cache raise the number of I/Os per disk significantly?

  • Do you have metrics for write latency when using SAN synchronous replication for the log and database drives?
  • David - The number of spindles backing your databases needs to be based on throughput (the IOs per second). You should not depend on write-back cache to reduce the number of IOs. Since the database writes are random, you can't expect coalescing to significantly reduce the number of writes. The advantage of write-back caching is that writes to the cache return to the OS immediately - this reduces the write latency, but does not increase the overall throughput. The data still needs to be written to disk, which means that if you overburden the disks, the cache will simply fill up because the disks can't keep up. For normal IO on a healthy disk set, you should see write latencies around 10-15 ms (factoring in some queue delay, head seeking, and data transfer time). With write-back caching, you'll see 1-5 ms write times. In summary: write cache reduces latencies on a well-performing system, but does not buy any reduction in the number of IOs.
  • David - I was unable to post a response to your previous question on the earlier blog because the blog is closed for comments...so I'm posting a reply here:

    Generally, read cache isn't as helpful as write-back cache. However, given the extent to which your disks are overused, it may not be sufficient. Have you considered going to Raid 10?
    -Nicole
  • Sorry, we don't have anything ready for publication at this time.