Welcome to Part Four of our Server Hang troubleshooting series. Today we are going to discuss PTE depletion and Low Physical Memory conditions, and how those two issues can lead to server hangs. In our post on the /3GB switch we mentioned that, in general, a system should always have around 10,000 free System PTEs. Although we normally see PTE depletion issues on systems using the /3GB switch, that does not necessarily mean that using the /3GB switch is going to cause issues. What we said was that the /3GB switch is intended to be used in very specific instances. Tuning the memory further by using the USERVA switch in conjunction with the /3GB switch can often stave off PTE depletion issues.
The problem with PTE depletion is that no entries are logged in the Event Viewer to indicate a resource issue. This is where Performance Monitor comes into play: it lets us determine whether a system is experiencing PTE depletion. However, Performance Monitor may not identify why PTEs are being depleted. When a process has a continually rising handle count that mirrors the rate of PTE depletion, it is fairly straightforward to identify the culprit. More often than not, though, we have to turn to a complete dump file to analyze the problem.
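To illustrate the "handle count mirrors PTE depletion" check, here is a minimal Python sketch. It assumes the Performance Monitor log has already been exported into plain lists of samples taken at the same intervals (for example from the Memory\Free System Page Table Entries and Process\Handle Count counters). The function names, process names, and sample values are hypothetical and exist only for this example.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sample lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with at least 2 samples")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series cannot mirror anything
    return cov / (var_x * var_y) ** 0.5


def rank_suspects(free_ptes, handle_counts):
    """Rank processes by how closely rising handles track falling free PTEs.

    handle_counts maps a process name to its list of handle count samples.
    A score near 1.0 means handles rise almost exactly as free PTEs fall.
    """
    scores = {
        name: -pearson(free_ptes, series)
        for name, series in handle_counts.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical samples, one per logging interval.
free_ptes = [24000, 21000, 18000, 15000, 12000, 9000]
handle_counts = {
    "leaky.exe": [1000, 2500, 4000, 5500, 7000, 8500],
    "steady.exe": [800, 805, 798, 802, 801, 799],
}

for name, score in rank_suspects(free_ptes, handle_counts):
    print(f"{name}: {score:.2f}")
```

A high score only points to a process worth a closer look. It does not prove that the process is the source of the leak, which is why a dump file is still often needed.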
Below is what we might see in a dump file in a PTE depletion scenario when we use the command to get an overview of Virtual Memory Usage:
In this particular instance we can clearly see that we have a low PTE condition. Looking at the Virtual Memory Usage summary, we can see that the server is most likely using the /3GB switch, since the NonPaged Pool Maximum is only 130 MB. In this scenario we would want to investigate using the USERVA switch to fine-tune the memory and recover some more PTEs. If USERVA is already in place and set to 2800, then it is time to think about scaling the environment to spread the server load.
For more granular troubleshooting, where we suspect a PTE leak that we cannot explain using Performance Monitor data, we can modify the registry so that we can track down the PTE leak. The registry value that we need to add to the
HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management
key is as follows:
Value Type: REG_DWORD
Value Data: 1
Once we implement this registry modification we need to reboot the system to enable the PTE Tracking. Once PTE Tracking is in place, we would need to capture a new memory dump the next time the issue occurs and analyze that dump to identify the cause of the leak.
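As a rough sketch of the registry change described above, the following Python snippet builds the text of a .reg file for a REG_DWORD value under the Memory Management key. The value name does not appear in the text above, so the example passes `TrackPtes` purely as an assumption that should be confirmed against Microsoft's documentation before use. The helper name `reg_file_text` is also made up for this example.

```python
MEMORY_MANAGEMENT_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control"
    r"\Session Manager\Memory Management"
)


def reg_file_text(value_name, dword_data, key=MEMORY_MANAGEMENT_KEY):
    """Return .reg file text that sets one REG_DWORD value under `key`."""
    if not 0 <= dword_data <= 0xFFFFFFFF:
        raise ValueError("REG_DWORD data must fit in 32 bits")
    return "\n".join([
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        f'"{value_name}"=dword:{dword_data:08x}',
        "",
    ])


# "TrackPtes" is an assumed value name, not taken from the text above.
print(reg_file_text("TrackPtes", 1))
```

The snippet only generates text and does not touch the registry. Importing the resulting file, and the reboot that follows, remain manual steps.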
To wrap up our post, we are going to take a quick look at a dump file of a server that is experiencing a low physical memory condition. Below is the output of the command (with a couple of comments that we’ve added in):
In this particular instance, the server simply did not have enough memory to keep up with the demands of the processes and the OS. Paged and NonPaged Pool resources are not experiencing any issues. The number of available PTEs is somewhat lower than our target of 10,000. However, if you recall from our earlier posts, the number of free PTEs may drop below 10,000 temporarily when a server is under load. In this case, the low memory condition left several threads in a WAIT state, which caused the server to hang. The solution for this particular issue was to add more physical memory to the server.
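The difference between a temporary dip below 10,000 free PTEs and a sustained shortage can be sketched in a few lines of Python. This assumes a list of free System PTE samples exported from a Performance Monitor log. The 10,000 threshold comes from the guideline above, while the function names and the cutoff of five consecutive samples are arbitrary choices made for illustration.

```python
def longest_low_run(samples, threshold=10_000):
    """Length of the longest run of consecutive samples below the threshold."""
    longest = current = 0
    for value in samples:
        current = current + 1 if value < threshold else 0
        longest = max(longest, current)
    return longest


def classify_free_ptes(samples, threshold=10_000, sustained_samples=5):
    """Label a series of free PTE samples as ok, a temporary dip, or sustained."""
    run = longest_low_run(samples, threshold)
    if run == 0:
        return "ok"
    return "sustained" if run >= sustained_samples else "temporary dip"


# Hypothetical samples: a brief dip under load, then a steady decline.
brief_dip = [15000, 12000, 9500, 9800, 13000, 14000]
steady_decline = [12000, 9900, 9500, 9000, 8600, 8200, 7900]

print(classify_free_ptes(brief_dip))
print(classify_free_ptes(steady_decline))
```

How many samples count as "sustained" depends on the logging interval, so the cutoff should be tuned to the data at hand.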
And with that, we come to the end of this post. Hopefully you’ve found the information in our last few posts useful.