Great article, thank you for the information.
In recent years I have had the pleasure of migrating a messaging system from Exchange 2003 to Exchange 2010. We employed a 3rd party stubbing and archiving solution in Exchange 2003, and have since brought the same solution online for our Exchange 2010 environment.
With Exchange 2003 we saw large space savings from stubbing. As mentioned above, the 4 KB pages in Exchange 2003 and the "old online defrag" process must have contributed to these savings.
In Exchange 2010 we lose much of those space savings over time. In our environment thousands of emails are stubbed daily. We must stub items because our environment is highly consolidated and our system carries a high load.
Databases grow consistently over time, and the space we are calling "Unaccountable" grows as well.
One example of an Exchange 2010 database that is 8 months old:
Database-A:
File Size: 170GB
Sum of Mailboxes: 88.6GB
Whitespace: 1.1GB
Total Version/Dumpster store: 2.6GB
Items stubbed during 8 months: Hundreds of thousands
Unaccountable Space: 77GB
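For anyone who wants to check the arithmetic on their own databases, the "Unaccountable" figure is simply the file size minus everything we can account for. Here is a minimal Python sketch using the Database-A numbers above; the function and variable names are my own and purely illustrative, not part of any Exchange tooling:

```python
def unaccounted_gb(file_size_gb, mailboxes_gb, whitespace_gb, dumpster_gb):
    """Return the space in the database file not explained by known contents."""
    # Everything we can measure: mailbox data, whitespace, version/dumpster store.
    accounted = mailboxes_gb + whitespace_gb + dumpster_gb
    return file_size_gb - accounted


# Figures for Database-A, all in GB.
database_a = unaccounted_gb(
    file_size_gb=170.0,
    mailboxes_gb=88.6,
    whitespace_gb=1.1,
    dumpster_gb=2.6,
)
print(f"Unaccounted: {database_a:.1f} GB")
```

With these inputs the result is about 77.7 GB, which is the roughly 77GB quoted above.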
If a new database is created and the mailbox data is moved to it, the size of the new database will be close to 90-95GB. This is probably due to the mailbox moves placing data in a contiguous manner.
Our Exchange database environment totals 4.2TB in files, however our mailbox data totals only 2.1TB. We often recreate databases manually to keep our storage under a certain total size. Constant database recreation (although easier in Exchange 2010) is not one of our desired weekly routines.
Although Exchange 2010 has implemented a native archiving solution, there are many customers that have legacy 3rd party repositories that would be hard to break from. The size of our native Exchange Archive mailboxes would require us to keep tens of terabytes of databases online to serve what we are required to offer our users in real-time. The 3rd party solution we use allows us to provide even lower cost storage than an Exchange database requires for this old data, and gives us back single instance storage, which was lost in Exchange 2010 (for understandable design reasons).
If it is the redesign of the nightly Online Maintenance routine that took away this intense form of mailbox content processing, there may be some enterprise customers who still require this routine to function as before, or at least close to it.
If it were possible to set a level of intrusiveness for the Online Maintenance routine, a Microsoft Exchange 2010 customer could set the balance between disk I/O and database space reclamation that their system could handle.
Due to the high cost of requiring double the amount of space in our enterprise environment, we are working our way through our second Microsoft Professional Support case on this topic. I thank you for the above information. Hopefully our case will lead to an enterprise-level solution.