First published on TechNet on Mar 09, 2014
Hello all,
Jesse Esquivel here again with another post I hope you find useful. This post is a great complement to Jerry Devore's post on diagnosing a leak in nonpaged pool using Event Viewer, poolmon, and perfmon! Today I’m going to talk about analyzing a leak in paged pool kernel memory and the tools and methods used to diagnose the issue and find its root cause.
A memory leak occurs when software (typically a driver) makes kernel memory allocations and never frees them; over time this can deplete kernel memory. Paged pool depletion takes considerably more effort on an x64-based system, but it’s not impossible or unheard of for it to happen and cause a server to go down or into a hard hang state. A low virtual memory condition can also cause the operating system to become unstable and hang.
The Victim
The server we are looking at here is a virtual machine running Windows Server 2008 R2 SP1. After about 90 hours, or 4-5 days, the server would become unresponsive and go into a hard hang state, and the services it was hosting would be unavailable, necessitating a reboot to restore functionality. Rinse and repeat until doomsday. Like clockwork, this would happen every 5-7 days and the server would need to be rebooted again.
First things first. Since the server is in a hard hang state, you can’t actually get into it until after it has been rebooted, so the investigation started when the server wasn’t actually exhibiting the problem… yet. All we had to go on was the fact that the box would go belly up almost weekly like clockwork. Task Manager is a great place to start and get a very quick, at-a-glance view of the health of the server. Having seen this strange behavior before, I suspected a leak in kernel memory, but alas we are not in the business of speculation. My suspicion zeroed me in on kernel memory consumption and handle count, as seen here (this is just a shot of a random VM for reference):
I took note of the values for kernel memory and the overall handle count of the system. Since everything appeared to be operating normally at the moment, I decided to do some post-mortem investigation. I reviewed the event logs around the time just before the server was rebooted. What I found was an entry in the System event log from Resource-Exhaustion Detector, event ID 2004. This indicated that the server was low on virtual memory (kernel) during the time it was in the hard hang state, which backed up my suspicion that kernel memory on the box was being depleted.
A few hours later I checked back with the server and saw that paged pool kernel memory had increased, as had the overall handle count for the system. To see which process has the highest handle count, we can add the “Handles” column to Task Manager by clicking View | Select Columns and then sort by it.
Lsass.exe had the highest number of handles of any process, but it was still a fairly low count. As a general rule of thumb, any process with a handle count higher than 10k should be investigated for a possible leak. I took note of the system time and the number of handles for lsass.exe. At this point we have a server with increasing paged pool consumption, increasing overall handle count, and Event ID 2004s in the System event log during the time the server was in a hard hang. These are all classic indicators of an issue with kernel memory depletion.
Now to find out what is consuming paged pool. There are two tools that we will use to analyze kernel memory consumption: Poolmon and Windows Performance Recorder.
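The rule of thumb above can be turned into a quick check. What follows is only a minimal sketch, assuming the per-process handle counts have already been collected (for example, read off the Handles column in Task Manager). The function name and the sample counts are hypothetical; only the 10,000 threshold comes from the rule of thumb.

```python
# Hypothetical helper: flag processes whose handle count exceeds a threshold.
HANDLE_THRESHOLD = 10_000  # rule of thumb: investigate anything above 10k


def suspicious_processes(handle_counts, threshold=HANDLE_THRESHOLD):
    """Return (process, handles) pairs above the threshold, highest first."""
    flagged = [(name, count) for name, count in handle_counts.items()
               if count > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Illustrative numbers only, not taken from the server in this post.
sample = {"example_service.exe": 14_200, "svchost.exe": 1_850, "System": 900}
print(suspicious_processes(sample))
```

A single reading above the threshold is only a prompt to look closer; what matters for a leak is a count that keeps climbing between readings.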
Tools of the Trade
Poolmon
Poolmon is a tool that can be used to view kernel memory pools and their consumption. A great explanation of poolmon and how it works can be found here. In the aforementioned poolmon link, note the explanation of how pool tags work: a pool tag is a four-letter string that’s used to label the pool allocation and should be unique to each driver that is making the allocation (keep this in mind, more on this later). Poolmon is included in the Windows Driver Kit. Essentially, using poolmon we can identify which pool tags are making paged pool allocations and not freeing them. Here is a sample screen shot of poolmon:
Windows Performance Recorder
The Windows Performance Recorder (WPR) is part of the Windows Performance Toolkit, whose latest version is available in the Windows 8.1 SDK. It’s a very powerful tool that can be used for performance analysis, debugging, and even for diagnosing memory leaks! Windows Performance Recorder captures event trace logs based on profiles. The pool usage profile can be used to log pool consumption in WPRUI:
Data gathering and Analysis
Now armed with poolmon and WPR, we can set up the logging we need to investigate further. I set poolmon to continuously log on the server in question. Here is what the poolmon data looks like for this scenario at the start of the logging. We can see that the system has only been up for a few minutes, has an overall handle count of 25k, and is at 120MB of paged pool consumption, so all is well at the moment.
Now we wait for some time to pass as we log the increases in paged pool kernel memory. After some time, while the server is still responsive, I log in and see that the handle count for the lsass.exe process has increased steadily along with the paged pool consumption; we’ll say it had ~26,000 when I checked it. Doomsday arrives and the server goes into a hard hang. It’s rebooted and we review the poolmon logs after it comes back up. After 89 hours of uptime we have 64,592 handles on the system, and wait for it… we are at 2.9GB of paged pool consumption! Note that the pool tag with the highest paged pool consumption is “Toke”, with a note in the mapped driver column of “nt!se Token objects.” These are security token objects, which are created and managed by the lsass.exe process.
There is a lot going on in this data, so here’s how to make sense of the columns in Poolmon. We aren’t interested in legitimate allocations that are eventually freed; what we are interested in is outstanding allocations, meaning allocations that have not yet been freed. The Diff column gives us exactly that: it tells us how many outstanding allocations we have, in this case 1,394,724 to be exact. The “Diff” column is simply the “Allocs” column minus the “Frees” column, so Allocs (12,166,567) - Frees (10,771,843) = Diff (1,394,724). For the Toke row, the Bytes column shows that this pool tag has consumed 2.3GB of paged pool. So of the 2.9GB of total paged pool that is depleted on this system, 2.3GB is outstanding allocations made under the “Toke” pool tag.
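To put the two readings in perspective, the average growth rate can be worked out from the start and end values. This is a rough sketch under stated assumptions: the first reading is treated as hour zero (the system had only been up a few minutes), 2.9GB is taken as 2900MB, and `hourly_growth` is a hypothetical helper name.

```python
def hourly_growth(start_value, end_value, hours):
    """Average growth per hour between two readings."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return (end_value - start_value) / hours


# Readings quoted in the text; 2.9 GB is taken as 2900 MB (an assumption).
pool_mb_per_hour = hourly_growth(120, 2900, 89)
handles_per_hour = hourly_growth(25_000, 64_592, 89)

print(f"paged pool: ~{pool_mb_per_hour:.1f} MB/hour")
print(f"handles:    ~{handles_per_hour:.0f} per hour")
```

With these figures the averages come out at roughly 31 MB of paged pool and about 445 handles per hour, which is why the server lasted only a few days between reboots.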
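The column arithmetic can be sketched in a few lines. The helper names below are hypothetical, the first call uses made-up counters purely to show the subtraction, and the byte estimate assumes every outstanding Toke allocation is 1664 bytes, which is the size the WPA trace reports later in this post.

```python
def outstanding(allocs, frees):
    """Poolmon's Diff column: allocations that have not been freed."""
    return allocs - frees


def outstanding_bytes(diff, bytes_per_alloc):
    """Approximate bytes held if every outstanding allocation has the same size."""
    return diff * bytes_per_alloc


# Small illustrative counters (hypothetical, not from the poolmon log).
print(outstanding(allocs=1_000, frees=400))

# Diff for the Toke tag from the text, assuming 1664 bytes per allocation.
toke_bytes = outstanding_bytes(1_394_724, 1664)
print(f"{toke_bytes:,} bytes (~{toke_bytes / 1e9:.1f} GB)")
```

Under that assumption the estimate lands at about 2.3GB, in line with the Bytes column for the Toke row.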
The plot thickens
So now we know we have a problem with paged pool kernel memory depletion. We know that the Toke pool tag is the top consumer of paged pool. But why? We know that the lsass.exe process has a very large number of leaked handles. We can certainly draw the line and connect the Toke pool tag to the leaked handles in the lsass.exe process (since they are security token objects), but how can we verify this? Thanks to Mark Russinovich for creating the very useful handle.exe utility. Handle.exe will display all open handles for all processes on the system. As an example, if we run handle -s it will give us a count summary of all open handles; note the Token handles.
So let’s confirm that all of these leaked handles in lsass.exe are in fact for security token objects. Again we wait until lsass has accumulated enough leaked handles to investigate while the server is still operational. You can do a summary count by process with handle.exe; here it shows we have 24,238 handles in lsass.exe to security token objects:
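If the summary is saved to a text file, it can be ranked with a short script. This is a sketch only: the layout it expects (one "Type : count" line per handle type plus a trailing total line) is an assumption about the summary output, and the sample text and numbers are hypothetical.

```python
import re

# Assumed layout of the summary: "  <Type> : <count>" on each line.
SUMMARY_LINE = re.compile(r"^\s*(?P<type>[A-Za-z][\w ]*?)\s*:\s*(?P<count>\d+)\s*$")


def parse_handle_summary(text):
    """Return {handle type: count} from summary text in the assumed layout."""
    counts = {}
    for line in text.splitlines():
        match = SUMMARY_LINE.match(line)
        if match and not match.group("type").lower().startswith("total"):
            counts[match.group("type")] = int(match.group("count"))
    return counts


# Hypothetical sample text, not output captured from the server in this post.
sample = """\
Handle type summary:
  Event           : 5120
  File            : 3410
  Key             : 2875
  Token           : 20000
Total handles: 31405
"""

counts = parse_handle_summary(sample)
top_type = max(counts, key=counts.get)
print(top_type, counts[top_type])
```

Comparing two such summaries taken a few hours apart shows which handle type is actually growing, rather than which one merely happens to be large.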
We can also use handle.exe to dump all of the handles for the lsass process:
As you can see, we have a large number of handles to security token objects, which confirms what these leaked handles are for. So we have now confirmed that the leaked handles in lsass are for security token objects, which matches up with the Toke pool tag as the top consumer of paged pool. At this point we need to find out why lsass is leaking so many handles for security token objects. Note the identity for the Token objects in the above screen shot: this is a domain service account for third party software that runs under this identity. Turn off the software and the leak goes away. Is the third party the culprit? First instinct says yes, but there’s more than meets the eye here. This software is running against every server in the enterprise, but only this one leaks. This is enough to rule out the software for now, and we keep digging.
Switch gears to Windows Performance Recorder
Since we have this powerful tool in our arsenal, we fire it up and begin logging pool usage with it. To start an ETW trace, we first need to install the Windows Performance Toolkit on the server. Optionally, install it on another machine or workstation, copy the “C:\Program Files (x86)\Windows Kits\8.0\Windows Performance Toolkit\Redistributables” folder to the server, and install the appropriate MSI file. Then we launch WPRUI as an administrator, click the pool usage profile, set the options like so and start logging:
Here is an excellent post on how to set up WPR to log pool and create custom profiles; in fact it’s the one I used to fire up the tracing! Now we wait for some time to pass as we log the increases in paged pool kernel memory. Since we are now armed with a lot more data and have confirmed there is a leak, I stop the trace after about 20 minutes. The data is in the location where you saved the ETL file:
Note: The Save and Cancel buttons in WPRUI behave differently. The Save button above saves the ETL, but the trace continues to run. To actually stop the trace, you need to click Cancel on the screen above.
We use the Windows Performance Analyzer (WPA) to view the data in the event trace log (ETL) file. You can copy the file off of the server to a machine that has internet access and load WPA to analyze it there. Be sure to click Trace | Load Symbols in order to view stack data. Please see the MSDN WPR blog post on how to configure symbols; your analysis machine will need internet access.
There is a lot going on in WPA when you open an ETL, and it can be overwhelming at times. It’s easy to get lost in the sheer amount of data in there! I’ve arranged my data according to the good folks over on the MSDN blog. Arranging your data in the table is key to analyzing it in WPA. In the graph explorer, drag the cumulative count by paged, Tag graph over to the right pane for analysis. Everything that is important to you should be to the left of the gold bar in the table and in the order you want to sort it by. Here we have Type, Paged, Pool Tag, Stack, and Size.
A couple of things here. First, note the graphical representation of paged pool allocations: a steady increase that never drops. Now we start from the left inside the table. AIFO = Allocated Inside, Freed Outside. This is what we want; basically these are outstanding memory allocations that are never freed or are freed outside of the WPR trace. The leak is in paged pool so we move forward. The next column is Pool Tag, and thanks to poolmon we know it’s “Toke” so we key on that. The Stack column is empty for some reason in this trace, so we won’t be able to see the functions that are called to allocate pool (we’ll look at a different graph later). Then I added the Size column; take note of the highlighted size value of 1664. If you remember our poolmon data from earlier, we know that the outstanding allocations are 1664 bytes:
So this data correlates to past behavior that we have seen with poolmon. Back to the WPA graph. Notice the count column and row 4. We have 6,189 outstanding allocations. Count column row 8 shows that we have 6,043 allocations at a size of 1664 bytes. Essentially the majority of the outstanding allocations are of size 1664 bytes, which indicates that these are likely the leaked security token objects. Let’s look at some more data in the analyzer pane. Expand the System Activity collection in the graph explorer pane on the left. Click on the “Stacks” graph and drag and drop it onto the right pane:
In the right pane at the top, click the “Display graph and table” button:
In the middle pane just to the left of the gold bar right click and select the following columns: Provider Name, Process Name, Stack, Event Name, Count Pool: Allocate, Count Pool: Free, Version.
Now to the left of the gold bar arrange the columns in the following order starting from left to right: Provider Name, Process Name, Stack, Event Name, Count Pool: Allocate, Count Pool: Free, Version. Remember we want everything that is important to us to the left of the gold bar so that we can sort on it the way we like. This is a very important rule of using WPA and xperfview! So now we have everything arranged like so:
From the left we have Provider Name, Process Name, Stack, Event Name, and Count Pool: Allocate. We expand into the stack to find ntkrnlmp.exe!SepDuplicateToken and ntkrnlmp.exe!SepDuplicateToken <itself>, followed by 580 pool allocations in the event column. Walking down into the stack you see the vicious pattern repeating itself with no end in sight:
Indeed, we can see that instead of re-using existing security token objects, a new one is created each time something connects to the machine. Correlating this with the data we found by dumping all of the lsass.exe handles, we know that each time the service account authenticates to the server a new security token object is created. Couple that with a frequent connection interval and you can see how this can get out of hand quickly!
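The AIFO idea used in the pool table earlier can be illustrated with a small sketch. This is not how WPA is implemented; it only shows the bookkeeping, assuming a simplified event list of (kind, address, tag, size) tuples. The function name and the sample events are hypothetical.

```python
from collections import Counter


def classify_pool_events(events):
    """Split pool events seen during a trace into AIFO and AIFI counts.

    events: iterable of (kind, address, tag, size) in the order they occurred,
    where kind is "alloc" or "free". Returns two Counters keyed by (tag, size).
    """
    live = {}         # address -> (tag, size) for allocations not yet freed
    aifi = Counter()  # allocated inside the trace, freed inside the trace
    for kind, address, tag, size in events:
        if kind == "alloc":
            live[address] = (tag, size)
        elif kind == "free" and address in live:
            aifi[live.pop(address)] += 1
        # A free with no matching alloc was allocated before the trace began.
    aifo = Counter(live.values())  # still outstanding when the trace ended
    return aifo, aifi


# Hypothetical events for illustration only.
trace = [
    ("alloc", 0x1000, "Toke", 1664),
    ("alloc", 0x2000, "File", 256),
    ("free", 0x2000, "File", 256),
    ("alloc", 0x3000, "Toke", 1664),
    ("free", 0x9000, "Ntfx", 96),  # allocated before the trace started
]
aifo, aifi = classify_pool_events(trace)
print(aifo.most_common())
```

Grouping the leftovers by tag and size is what makes a leak stand out: one (tag, size) pair accounts for nearly all of the outstanding allocations.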
The Culprit
Lsass.exe binaries up to date. Check. Third party software up to date. Check. Increasing the third party software connection interval only puts off the inevitable. The third party software is agentless; it merely authenticates to the box to gather information at an interval. We know that some driver is making these pool allocations, so we set out to review all software installed on the system, specifically asking questions about any security-focused software, since after all we have leaked token objects via lsass.exe.
After talking with the administrators for a bit, it was revealed that they were running some security software. They turned off the software and the leak vanished, even with the high connection rate of the third party software. We engaged the vendor and provided the data to them. Don’t be afraid to engage third party vendors and send them your ETL file so that they can take a look at it with their private symbols. They found that one of their drivers was causing the leak and some time later provided an updated driver.
Due to the nature of this type of software, it can be difficult to find the smoking gun or the driver that is bringing the pain. To the untrained eye it surely looked like a leak in the lsass.exe process, especially since the Toke pool tag was the top consumer and ntkrnlmp.exe was making the pool allocations. So if debugging dumps is not your thing, there are plenty of tools to put in your arsenal to help you diagnose a pool memory leak. Having a diverse toolset allows you to gather different types of data, correlate and reinforce data points, and uncover things that you normally wouldn’t with just one tool. Using the right tools to uncover forensic data can sometimes lead you to the culprit without actually having to debug a dump!
Until next time!
Jesse "Hit-Man" Esquivel
Updated Feb 20, 2020
Core Infrastructure and Security Blog