Azure SQL Managed Instance is a fully managed SQL Server database engine that lets you choose the instance size and the amount of resources you want to use. Sometimes it is hard to determine what instance size your workload needs, and one of the criteria might be the amount of memory required. This article shows you how to check whether you have enough memory on your Azure SQL Managed Instance.
The amount of memory available to a Managed Instance is proportional to the number of cores assigned. As an example, in the Gen5 architecture you get 5.1 GB of memory per vCore, meaning that an 8-core instance will have about 41 GB of memory. Periodically you should check whether this amount of memory is the right match for your workload.
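The vCore-to-memory relationship above can be sketched as a quick calculation (assuming the Gen5 ratio of 5.1 GB per vCore stated above; the constant and function name are illustrative, not part of any Azure API):

```python
# Approximate memory on a Gen5 Managed Instance: ~5.1 GB per vCore
GB_PER_VCORE = 5.1

def approx_memory_gb(vcores: int) -> float:
    """Return the approximate memory (in GB) for a Gen5 instance
    with the given number of vCores."""
    return vcores * GB_PER_VCORE

print(round(approx_memory_gb(8), 1))  # 40.8, i.e. roughly the 41 GB mentioned above
```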
Do not be tempted to just monitor the memory usage of the Managed Instance: it will be close to 100% of available memory. Some people believe this is an issue (like hitting 100% CPU), but it is expected and desired behavior. A Managed Instance should use as much memory as possible to cache pages from disk in the buffer pool. The only case where you will not see near 100% memory usage is when your databases are much smaller than the available memory, so all of them fit in memory.
A better metric for memory monitoring is the Page Life Expectancy (PLE) performance counter, monitored per NUMA node. Page Life Expectancy indicates how many seconds a page stays in the buffer pool before it is evicted. The higher this value, the better, but you need to determine what the lower limit should be.
Some people believe that the PLE value should be greater than 300, but this is an old rule created for computers with 4 GB of memory, and it has been criticized by people like Jonathan Kehayias. In modern architectures with more memory, you should raise this limit and find a value proportional to the buffer pool size. As an example, if you have an 8-core Managed Instance (with ~40 GB of memory) and the buffer pool uses all 40 GB (which is highly unlikely, because memory must also be allocated to other processes and components), your minimal PLE might go up to 3000 instead of 300.
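The scaled rule of thumb above (300 seconds of PLE per 4 GB of buffer pool) can be expressed as a small calculation; the function name is illustrative:

```python
def min_ple(buffer_pool_gb: float) -> float:
    """Minimal acceptable Page Life Expectancy, using the rule of thumb:
    300 seconds for every 4 GB of buffer pool memory."""
    return (buffer_pool_gb / 4) * 300

print(min_ple(40))   # 3000.0 seconds for a ~40 GB buffer pool
print(min_ple(4))    # 300.0 seconds -- the classic 4 GB-era threshold
```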
There is an article written by David Williams and Matthew Robertshaw explaining how to determine limits and calculate minimal page life expectancy using performance counters. The formula they provided is used in this SQL query:
select
	bp_mem_gb = (l.cntr_value*8/1024)/1024,
	ple = v.cntr_value,
	min_ple = (((l.cntr_value*8/1024)/1024)/4)*300
from sys.dm_os_performance_counters v
join sys.dm_os_performance_counters l on v.object_name = l.object_name
where v.counter_name = 'Page Life Expectancy'
and l.counter_name = 'Database pages'
and l.object_name like '%Buffer Node%'
This query will return the current value of PLE, the amount of buffer pool memory, and the minimal PLE value calculated based on the buffer pool memory value.
If you run this query on your Managed Instance and PLE frequently drops below the minimal PLE value, you might consider scaling your instance up to get more memory.
NOTE: Try to identify the queries that use a lot of memory before you add more cores. It is possible that some queries cause memory pressure or use more memory than usual. If you identify and fix these queries, you might not need to scale up your instance.
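One way to spot such queries is to look at the currently active memory grants, for example via the sys.dm_exec_query_memory_grants DMV (a sketch; the exact columns you need may vary with your diagnostic goals):

```sql
-- Currently executing queries with the largest memory grants
select top 10
	mg.session_id,
	mg.granted_memory_kb,
	mg.used_memory_kb,
	t.text as query_text
from sys.dm_exec_query_memory_grants mg
cross apply sys.dm_exec_sql_text(mg.sql_handle) t
order by mg.granted_memory_kb desc;
```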
You don't need to upgrade the service tier and add more memory as soon as you identify this issue. Low PLE is not definitive proof that you need more memory, but it is a good indicator that you should investigate. You should also check wait statistics using DMVs, as shown in the following code:
DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR);
-- wait some time
select top 10 wait_type, wait_time_ms from sys.dm_os_wait_stats order by wait_time_ms desc
In addition, you can check the memory distribution and see whether some components use more memory than expected. It is possible that another component, such as the plan cache or the columnstore pool, is using more memory and evicting pages from the buffer pool. There are other parameters that you can monitor, and you can find more information here.
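One way to see that distribution is to group memory by clerk type using the sys.dm_os_memory_clerks DMV (a sketch; clerk names such as MEMORYCLERK_SQLBUFFERPOOL and CACHESTORE_SQLCP identify the buffer pool and the plan cache, respectively):

```sql
-- Top memory consumers by clerk type
select top 10
	type,
	sum(pages_kb) / 1024 as memory_mb
from sys.dm_os_memory_clerks
group by type
order by memory_mb desc;
```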