Procedure cache which is also called as plan cache is primarily a cache of the query plans to improve performance of query execution. By caching the compiled and execution plans we don’t have compile query each time they are executed. This is a huge performance boost and is therefore a very critical and frequently used component in sql server. In addition to plans being cached, the procedure cache also contains algebrized views, xps (extended procs) and memory for inactive cursors.
The following caches are included in procedure cache –
Compiled Plans (CP)
Execution plans (MXC)
Algebrizer tree (ProcHdr)
Extended Procs (XPs)
The CPs, ProcHdrs and XPs are top level objects and are stored in separate cachestores. Cachestore is a new mechanism for caching things in Yukon. It has the benefits of having a uniform caching policy (in terms of costing, ageing and eviction policy) for all things that are cached using this infrastructure. Execute sys.dm_os_memory_cache_hash_tables to get a list of all the cachestores in the system. Execute sys.dm_Exec_Cached_Plans to get the contents of the plan cache.
MXCs and inactive cursors are cached as dependent objects in the respective compiled plans. They are stored in linked lists inside the compiled plans. Execute sys.dm_exec_cached_plan_dependent_objects to get a list of dependent MXCs and Cursors for a given CP.
When we compile a query, we generate a compiled plan (CP) for the query. We generate a compiled plan for the entire batch (not per statement). So if a batch has multiple queries in it then the CP will have the compiled query plans for all the queries in the batch. Each individual statement in the batch is represented by a CStmt and so a CP has an array of CStmts. Each CStmt actually contains the query plan for that particular statement. This array of CStmts is actually stored in a plan skeleton. A CP contains the plan skeleton, symbol table, parameter collection, top level memory object and MXC list among other things. Generally a CP is very valuable since the cost to compile a query is large. So we want to cache them and also want to ensure that under memory pressure these are not amongst the 1
entries to be evicted.
There are 2 types of CPs based on the type of query. If the query is dynamic sql or prepared, we have “sql CP”. For objects like triggers, store procs, functions etc we have the “obj CP”. We store the 2 types of CPs in separate cachestores. Thus we have 2 CP cachestores – one for sql CP and other for obj CP. A CP can be shared by multiple users.
During execution of a query, we generate an execution plan (MXC) from the CP for the entire batch. The individual compiled query plans in the CStmts get converted to the runtime query plans that are stored in the XStmts. Just like the MXCs being stored in a linked list in a CP, the XStmts are stored in a linked list inside a CStmt.
Execution plans (MXCs) are runtime objects and are associated with a CP. They contain runtime parameter and local variable info, runtime state like the currently executing stmt, object ids for objects created at runtime, top level memory object for holding the runtime information among other things. MXCs are single user i.e. if there are 3 users executing the same batch simultaneously then we will have 3 MXCs for the same CP. Thus there is a strict one to many relationship between the CPs and MXCs. MXCs cannot exist without CPs. Corresponding to the CPs, there are 2 types of MXCs – “sql MXC” and “obj MXC”. Since MXCs are dependent objects on the CPs, they don’t live in a separate cachestore but stay in their CP. There are 2 linked lists for MXCs. The 1
one is a lookup list and it stores the free (or not in use) MXCs for a CP. The 2
list is the enum list and it stores all the MXCs (both free and in use) for a CP. The lookup list is used to get the MXC memory to execute a batch. The enum list is used to enumerate all the MXCs associated with a compiled plan. This is useful for figuring out the total memory used by a batch among other things.
During query execution we 1
lookup a CP. If it doesn’t exist then we compile the query. Once we have the CP (either from cache or a newly generated one), we check to see if it has any free MXCs in the lookup list. If there aren’t any entries in the lookup list then we create one. If we have atleast one entry in the looup list then we remove it from the lookup list and use it. At the end of execution, we either destroy the MXC or return it to the free list depending on whether it needs to be cached or not.
The ProcHdr is created for procs, functions, triggers, constraints, defaults and views and contains the algebrized (or normalized) tree. However only prochdrs for views, constraints and defaults are cached. ProcHdrs are stored in a separate cachestore. The memory object for each prochdr is an incremental pmo of 8K. The hash table size of the prochdr cachestore is about 1/10
the hash table size of the CP cachestore.
XPs are predefined sps like sp_ExecuteSql, sp_CursorOpen, sp_TraceCreate etc and they contain the function name and the dll name of the implementation. They are stored in a separate cachestore. Since the size of the cache entry is very small, the memobj for each entry is 256 bytes and is incremental. The hash table size of the XP cachestore is 127.
Inactive cursors are cached in a compiled plan so that concurrent uses of cursors can reuse memory. Say we have a batch which declares and uses a cursor without deallocating it. If there are 2 users executing the batch then we will have 2 active cursors. Once the cursors are deallocated (potentially in different batch), we will cache the memory used to store the cursor instead of freeing it. This list of inactive cursors is kept in the CStmtCursorSelect (declare cursor stmt) in the CP. The next time someone executes the batch, we will reuse the cached cursor memory and initialize it appropriately to be used as an active cursor.
As mentioned before, inactive cursors are stored in a linked list in the corresponding CStmtCursorSelect in a CP. Just like MXCs, there are 2 lists here. The lookup list and the enum list. The lookup list only stores the inactive cursors. The enum list stores all cursors (both active and inactive) for the stmt and is used to enumerate all cursors for a given stmt (or CP).
We have the following memory objects for a batch –
Top level (8K) compiled plan pmo (CP_pmo). This also contains memory for the SE accessor cache and the TDS metadata cache. For single statement batches, there is no sub pmo created for the CStmt and the stmt memory allocations are done from the CP_pmo. The accessor cache and the TDS metadata cache are created lazily during execution, if required. Thus the CP_pmo also contains execution time memory and one can see its size grow after execution.
Top level (8K) MXC pmo (MXC_pmo). One pmo for each MXC associated with the CP. This pmo also contains the non recompilable XStmts in the batch.
Top level (8K) QueryContextForSE pmo. One pmo for each XStmt that is a query.
Sub pmo (512 bytes) per CStmt (except for single stmt batches). This is created from the CP_pmo.
Sub pmo (512 bytes) per recompilable XStmt. This is created from the CP_pmo. For NUMA machines, each XStmt pmo is a top level 8K pmo instead of being a sub pmo.
Sub pmo (512 bytes) for the plan skeleton associated with the CP. This is created from the CP_pmo.
Sub pmo (512 bytes) per cursor. This is created from the CP_pmo.
Each compiled plan contains a page allocator that is created from the cachestore. This page allocator is used to allocate all the top level 8K pmos associated with a batch. Thus both the compile time and execution time memory comes from the CP cachestore. The attached diagram (memory_layout.vsd) shows the memory layout that we have discussed so far.
Stay tuned for other posts which will build on this information.