Loading the Active Directory Database Into RAM
Here’s another question we get asked occasionally: is there a way to load the entire Active Directory database into RAM?
The idea behind this question is that having the sought-after data in physical RAM would avoid the seek-time and paging delays that even the fastest hard drives have to a greater or lesser degree. This question is much more likely to be asked by a company that has been reading our Server 2003 and later marketing information, which states:
By moving to Windows Server 2003 x64 Editions, even quite large Active Directory implementations can be entirely memory resident, greatly improving the speed of queries and enabling a significant server consolidation of domain controllers supporting query-intensive applications such as Microsoft Exchange Server 2003.
So the motivation for loading the database into memory is to answer database queries more quickly and perhaps save some money on the number of domain controllers deployed.
But how do you do it? And how do you tell when you have your database in memory?
The answer to the first question is simple: there is nothing the administrator needs to do to load the AD database contents into memory. On x64 Windows Server, just as on x86, the Active Directory data is loaded into memory on demand. In other words, as a client requests a set of data, that data is loaded into memory.
In the x64 case there is no practical upper limit on what can be loaded into physical memory, which is a benefit of the x64 architecture. What this means is that once the entire set of data has been accessed at least once it will be entirely resident in memory, provided the server has enough RAM to hold it. The x86 architecture can never reach this extreme of performance, though it will do an excellent approximation by keeping the most frequently used sets of data in memory and paging out what is least used when needed.
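To make the difference concrete, here is a toy model of on-demand loading. It is only a sketch: the PageCache class, its capacity_pages parameter and the simple least-recently-used eviction are assumptions made for illustration, and the real database cache is far more sophisticated than this.

```python
from collections import OrderedDict


class PageCache:
    """Toy model of an on-demand page cache with least-recently-used eviction."""

    def __init__(self, capacity_pages=None):
        # capacity_pages=None models "no practical upper limit" (the x64 case).
        self.capacity_pages = capacity_pages
        self.pages = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, page_number):
        if page_number in self.pages:
            self.hits += 1
            self.pages.move_to_end(page_number)  # mark as most recently used
            return
        self.misses += 1  # first touch: the page has to come from disk
        self.pages[page_number] = True
        if self.capacity_pages is not None and len(self.pages) > self.capacity_pages:
            self.pages.popitem(last=False)  # evict the least recently used page


def replay(cache, requests):
    """Feed a sequence of page requests to a cache; return (hits, misses)."""
    for page in requests:
        cache.read(page)
    return cache.hits, cache.misses


# The same request pattern against an unbounded cache and a 3-page cache.
requests = [1, 2, 3, 4, 1, 2, 3, 4]
unbounded = replay(PageCache(), requests)
bounded = replay(PageCache(capacity_pages=3), requests)
```

With the unbounded cache, every page is a miss the first time and a hit from then on. With the 3-page cache, this particular request pattern keeps evicting the page that is needed next, so every request goes back to disk.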
The logical next question we typically hear from smart IT people is “could I write a script to access all the data so that all of that data is then residing in RAM?” The answer to that question, for a good and practical reason, is no. In order for that idea to work you would need to run a query or queries which touch each object and attribute in the Active Directory database. This, by definition, is an inefficient and expensive query set and would bring the strongest server to its knees.
Keep in mind that the most frequently accessed data is most likely to be in memory early, and will stay there in memory until the server is rebooted. Over time the less frequently accessed data is more likely to be in memory as well. This results in the recommendation to not reboot your x64 domain controllers if at all possible, since queries will be quicker if the data being requested is already in memory, and that data is more likely to be in memory if the server has a lot of uptime. You would not necessarily expect to see a problem in the initial query response times following a reboot, but they certainly wouldn’t be as quick as they would be if the data set were already in memory.
That brings us to the question of how you can tell how much of your database is in memory.
There is no gauge or meter for this like the gas gauge on a car. The reason there is no such mechanism is that there is simply no need for one: AD will load into memory as needed, on demand.
Keep in mind that each Active Directory database is different. When I say that, I imagine people thinking of different companies’ AD implementations and visualizing the difference between a big company’s AD, like Microsoft’s, and that of a smaller company of, say, 200 users. You should expect to see differences between the databases of DCs in the same domain as well. One reason for that is obvious: some replicas may be global catalogs as well. Other reasons may come from database usage and maintenance. The most common one is whitespace in datatables within the AD database, which can be alleviated by defragmentation, or other database issues.
Which leads us to the practical part of looking to see how much of your AD is in memory.
To do that you can do several things. The obvious thing to do is to compare the total size of the NTDS.DIT on disk with the memory size of the lsass.exe process. This is a good guideline, but not the most granular thing to use, since you will be using the size of the DIT file as the 100% marker and then looking to another marker to approximate how much of that is in memory.
The Active Directory database runs within the lsass.exe process on your domain controllers, so viewing the memory usage for lsass.exe in Task Manager can give you an idea of how much memory is being used. The working set is the amount of memory kept in RAM for the process at that time. This appears simply as the “Mem Usage” column in Task Manager for Windows 2000 Server and Windows Server 2003 but appears as “Working Set (Memory)” in Server 2008.
But can we use that for an accurate idea of how much of the database is in RAM? The answer is no. Compare the lsass.exe memory usage in the Task Manager picture above with the size of that same server’s DIT file in the picture below.
Significant difference there, right? The reason for this is that lsass.exe does other things in addition to running the code for AD. The netlogon service, for example, runs in lsass.exe. However, for larger databases this “gap” between how much of lsass.exe memory consumption is being used for AD specifically and how much for other local server tasks will be much less dramatic. The point here is to keep in mind it is very approximate.
From a more granular perspective there are other things you can do to get a good idea of your specific database size. Active Directory is composed of datatables and indices, and is stored in sets of data called pages. These are the specific items an administrator should be thinking of when he or she considers getting AD into RAM for rapid query responsiveness.
Some general guidelines: AD is arranged in 8K pages (unlike Exchange which, I understand, uses 4K pages). One 8K page may store 2 users if the users are not large ones (4K each or so), meaning they do not contain a large set of data. If you calculate how many users you have in that domain you can get an idea of how much memory may be needed for them in general.
Likewise, knowing the size of your long values tables, index tables and the like can go a long way to understanding how much memory can be used as well. To find out the size of these data sets you can use the command below, run in Directory Services Restore Mode:
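The back-of-the-envelope math for users can be sketched like this. The function name and the 200-user domain are hypothetical, and the 2-users-per-page figure is only the rough guideline given above, so treat the result as a ballpark rather than a measurement.

```python
import math

PAGE_SIZE_KB = 8  # AD page size, per the guideline above


def estimate_user_memory_kb(user_count, users_per_page=2):
    """Rough memory estimate, in KB, for the pages holding user objects.

    users_per_page=2 assumes small users of about 4K each; use 1 (or
    fewer pages per user) if your user objects carry a lot of data.
    """
    pages = math.ceil(user_count / users_per_page)
    return pages * PAGE_SIZE_KB


# Hypothetical 200-user domain: 100 pages of 8K each.
estimate_kb = estimate_user_memory_kb(200)
```

This only covers the user objects themselves. Indices, long values and every other object class in the database need their own pages on top of that.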
esentutl /ms "path to the ntds.dit" > dump-file.txt
A sample result (above) gives us the size of the entire datatable as 1126 pages. 1126 multiplied by the page size of 8K gives us 9008K, or about 8.8 megabytes. Did I mention this is a test environment database? Of course it’s unlikely most people have a datatable that small in their production environment. The point here is that if I see less than about 8.8 MB of working set memory being used by lsass.exe (this is a general example here) then I can be reasonably assured that the datatable is not entirely in memory.
Note: Do not use esentutl.exe or eseutil.exe for other database management related activities. The supported tool for general database management (defrags, repairs) is ntdsutil.exe, which is an AD-aware wrapper for the aforementioned tools.
This technique can be used in any environment by combining the esentutl database information with a SPA AD report or with LDAP performance data from the Field Engineering events I described previously in this blog post.
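The page arithmetic is simple enough to sketch in a few lines. The function names here are made up for illustration, and the only inputs are a page count read from the esentutl output and the lsass.exe working set in bytes.

```python
PAGE_SIZE_KB = 8  # AD page size


def table_size_kb(pages):
    """Size of a table in KB, given its page count from the esentutl dump."""
    return pages * PAGE_SIZE_KB


def could_be_fully_resident(pages, working_set_bytes):
    """False means the table cannot be entirely in memory.

    True only means it is possible, since lsass.exe uses memory for
    other things besides the AD database.
    """
    return working_set_bytes >= table_size_kb(pages) * 1024


size_kb = table_size_kb(1126)      # the sample datatable
size_mb = size_kb / 1024           # convert KB to MB
fits = could_be_fully_resident(1126, 8 * 1024 * 1024)  # 8 MB working set
```

Note that the check is only conclusive in one direction: a working set smaller than the table rules out full residency, while a larger one proves nothing by itself.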
Further, an AD admin with a little time on his or her hands can gather a baseline of what to expect for their environment (and this kind of thing is subjective enough to need this) by watching the lsass.exe process’s working set memory over time in Perfmon. This is good data to have on hand in order to understand what your environment looks like right now, so that you can understand how it’s changing over time, or when you have a problem.
So to sum all of this up, it is unlikely that you will ever see all of your Active Directory database loaded into RAM, though over time and usage it is possible to get every last bit of infrequently accessed data into memory. The performance gain will appear the second time an index or set of data is requested, and on an x64 AD install it will last as long as the server is not rebooted.
Let’s see... loading AD into memory requires no user intervention and just works. That’s the kind of thing I like: no work for me, better performance.
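If you save those working set samples to a CSV file, a few lines of script can boil them down to a baseline. This is a minimal sketch: the summarize_working_set function, the column names and the sample values are all made up for illustration, and a real Perfmon export will have different headers that you would need to substitute.

```python
import csv
import io
import statistics


def summarize_working_set(csv_text, column):
    """Return (min, mean, max) in MB for one counter column of a CSV log."""
    reader = csv.DictReader(io.StringIO(csv_text))
    samples = []
    for row in reader:
        value = (row.get(column) or "").strip()
        if value:  # skip blank samples
            samples.append(float(value) / (1024 * 1024))  # bytes to MB
    if not samples:
        raise ValueError(f"no samples found for column {column!r}")
    return min(samples), statistics.mean(samples), max(samples)


# Hypothetical log: header names and values are invented for this example.
sample_log = """time,lsass_working_set_bytes
09:00,104857600
10:00,125829120
11:00,146800640
"""
low, average, high = summarize_working_set(sample_log, "lsass_working_set_bytes")
```

Comparing the low, average and high figures from week to week is a quick way to see whether the working set is still climbing after a reboot or has settled.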
That leaves a question for those out there who aren’t running x64 DCs. Why aren’t you?