See how dynamic concurrency works in Azure Function App with a simple test

Published Jun 14 2022 12:25 AM
Microsoft

 

In May, dynamic concurrency (referred to as DC in this blog) in Azure Functions became generally available. With dynamic concurrency enabled, the platform adjusts the concurrency of functions dynamically, as long as the worker instance is healthy (for example, CPU and thread utilization are within healthy limits). You can find more information about dynamic concurrency at this link: https://docs.microsoft.com/en-us/azure/azure-functions/functions-concurrency#dynamic-concurrency.

 

In this blog, I show how DC works in Azure function apps using a few simple tests.

 

How to enable Dynamic Concurrency?

By default, dynamic concurrency is disabled. With dynamic concurrency enabled, concurrency starts at 1 for each function, and is adjusted up to an optimal value, which is determined by the host.

 

You can enable dynamic concurrency in your function app by adding the following settings in your host.json file:

 

{
    "version": "2.0",
    "concurrency": {
        "dynamicConcurrencyEnabled": true,
        "snapshotPersistenceEnabled": true
    }
}

 

 

When snapshotPersistenceEnabled is true, which is the default, the learned concurrency values are periodically persisted to storage so that new instances start from those values instead of starting from 1 and having to redo the learning.

 

What kind of triggers does DC support?

Dynamic concurrency is currently supported only for the Azure Blob, Azure Queue, and Service Bus triggers, and it requires version 5.x of the storage extension and version 5.x of the Service Bus extension.

 

How to see dynamic concurrency’s adjustment logs?

You need to set the logLevel of ‘Host.Concurrency’ to ‘Trace’ in host.json to enable the logging of dynamic concurrency.

 

{
    "version": "2.0",
    "logging": {
        "logLevel": {
          "Host.Concurrency": "Trace"
        }
    },
    "concurrency": {
        "dynamicConcurrencyEnabled": true,
        "snapshotPersistenceEnabled": true
    }
}

 

 

You can then see the dynamic concurrency logs in the file system logs or the Application Insights logs, as shown below. In this example, the platform saw that the CPU load was low and decided to increase the concurrency value of function ‘SBQueueFunction1’ to 138.

 

 

2022-06-13T13:19:17.078 [Debug] [HostMonitor] Host process CPU stats (PID 6576): History=(18,48,29,39,37), AvgCpuLoad=34, MaxCpuLoad=48

2022-06-13T13:19:17.078 [Debug] [HostMonitor] Host aggregate CPU load 34

2022-06-13T13:19:17.078 [Debug] FunctionApp7.SBQueueFunction1.Run Increasing concurrency

2022-06-13T13:19:17.078 [Debug] FunctionApp7.SBQueueFunction1.Run Concurrency: 138, OutstandingInvocations: 135

 

 

 

Where is concurrency value stored?

The concurrency value of each function is stored in the storage account specified in the app setting ‘AzureWebJobsStorage’. The values are stored in the file ‘azure-webjobs-hosts / concurrency / functionApp_Name / concurrencyStatus.json’.


 

The file below shows that, as of the recorded timestamp, the concurrency value of function ‘SBQueueFunction1’ is 171 and the concurrency value of ‘BlobFunction1’ is 90.

 

 

{"Timestamp":"2022-06-13T13:19:55.6526064Z","NumberOfCores":1,"FunctionSnapshots":{"FunctionApp7.BlobFunction1.Run":{"Concurrency":90},"FunctionApp7.SBQueueFunction1.Run":{"Concurrency":171}}}

 

 

If you set ‘snapshotPersistenceEnabled’ to true in host.json, the platform reads the current concurrency value from this file. When the platform decides to increase or decrease the value, it writes the new value back to this file.
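To inspect the learned values programmatically, for example after downloading concurrencyStatus.json from the storage account, the snapshot can be parsed with System.Text.Json. The following is a minimal sketch: the class name ConcurrencySnapshot and the method Parse are hypothetical names chosen for illustration, and the JSON layout is assumed to match the example file shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper (not part of the Functions host): reads a
// concurrencyStatus.json payload and extracts the value per function.
public static class ConcurrencySnapshot
{
    // Returns a map of function name -> persisted concurrency value.
    public static Dictionary<string, int> Parse(string json)
    {
        var result = new Dictionary<string, int>();
        using JsonDocument doc = JsonDocument.Parse(json);
        JsonElement snapshots = doc.RootElement.GetProperty("FunctionSnapshots");
        foreach (JsonProperty fn in snapshots.EnumerateObject())
        {
            result[fn.Name] = fn.Value.GetProperty("Concurrency").GetInt32();
        }
        return result;
    }

    public static void Main()
    {
        // Snapshot content copied from the example file in this blog.
        const string json = @"{""Timestamp"":""2022-06-13T13:19:55.6526064Z"",""NumberOfCores"":1,""FunctionSnapshots"":{""FunctionApp7.BlobFunction1.Run"":{""Concurrency"":90},""FunctionApp7.SBQueueFunction1.Run"":{""Concurrency"":171}}}";

        foreach (KeyValuePair<string, int> entry in Parse(json))
        {
            Console.WriteLine($"{entry.Key}: {entry.Value}");
        }
    }
}
```

The sketch only reads the snapshot. The host owns this file, so treat any manual edits to it as an experiment rather than a supported way to configure concurrency.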

 

Testing dynamic concurrency:

Testing environment:

  1. An Azure function app in the B1 tier with only 1 instance. I wrote the C# code locally in Visual Studio and then published it to the function app.
  2. To avoid noise, the App Service plan contains only this function app, and the app contains only 1 test function.
  3. The function app has a simple Service Bus queue trigger, with the code shown below. It consumes some CPU and then waits 1.5 seconds before returning.

 

 

using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public class SBQueueFunction1
{
    [FunctionName("SBQueueFunction1")]
    public async Task Run([ServiceBusTrigger("queue2", Connection = "sbconnection")] string myQueueItem, ILogger log)
    {
        log.LogInformation($"C# ServiceBus queue trigger function processed message: {myQueueItem}");

        // Consume a little CPU with a simple floating-point loop.
        double a = 333.33;
        double b = 444.44;
        double c = 0;

        for (int i = 0; i < 10000; i++)
        {
            c = a * b;
            a = a + 0.1;
        }

        // Wait 1.5 seconds before returning.
        await Task.Delay(1500);
    }
}

 

 

  4. In each round, I sent a batch of 4000 messages to the Service Bus queue so that the function was triggered 4000 times. (Why 4000? Because the maximum size of my Service Bus batch is 4500.) If you don’t know how to send messages to a Service Bus queue, you can refer to: https://docs.microsoft.com/en-us/azure/service-bus-messaging/service-bus-dotnet-get-started-with-que....
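As a rough illustration of step 4, the sketch below sends 4000 messages to the queue with the Azure.Messaging.ServiceBus client library. It is not the exact sender used in these tests. The environment variable name SB_CONNECTION, the class name BatchSender, and the message bodies are assumptions made for the example. The queue name ‘queue2’ comes from the trigger code above. If a batch fills up before all messages are added, the sketch sends it and starts a new one.

```csharp
using System;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

// Hypothetical sender for illustration; requires the Azure.Messaging.ServiceBus package.
public static class BatchSender
{
    public static async Task Main()
    {
        // Assumption: the connection string is provided through an
        // environment variable named SB_CONNECTION (a made-up name).
        string connectionString = Environment.GetEnvironmentVariable("SB_CONNECTION")
            ?? throw new InvalidOperationException("Set SB_CONNECTION first.");

        await using var client = new ServiceBusClient(connectionString);
        await using ServiceBusSender sender = client.CreateSender("queue2");

        const int total = 4000;
        int sent = 0;

        while (sent < total)
        {
            using ServiceBusMessageBatch batch = await sender.CreateMessageBatchAsync();

            // Fill the batch until it is full or all messages are queued.
            while (sent + batch.Count < total)
            {
                var message = new ServiceBusMessage($"message {sent + batch.Count}");
                if (!batch.TryAddMessage(message))
                {
                    break; // The batch is full; send it and start another.
                }
            }

            if (batch.Count == 0)
            {
                throw new InvalidOperationException("A single message does not fit in an empty batch.");
            }

            await sender.SendMessagesAsync(batch);
            sent += batch.Count;
        }

        Console.WriteLine($"Sent {sent} messages.");
    }
}
```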

 

Tests:

 

Test scenario 1:

I set ‘snapshotPersistenceEnabled’ to false in host.json.

Finding:

In the function log, I saw that the concurrency value started at 1 and was increased many times. When the 4000 executions finished, the concurrency was 137. In total, the 4000 executions took 118 seconds to finish.

 

Test scenario 2:

I set ‘snapshotPersistenceEnabled’ to true in host.json.

Finding:

The result was almost the same as in test 1. The concurrency value increased from 1 to 137.

 

Test scenario 3:

I set ‘snapshotPersistenceEnabled’ to true in host.json. I then sent batches of 4000 messages to the Service Bus queue round after round, and recorded the change in concurrency value and the time taken for each round.

Finding:

The records for this test are shown below. The total time decreased with each round, and the concurrency value converged to 311.

 

 

Round     Starting value of concurrency   Ending value of concurrency   Time taken (seconds)
-------   -----------------------------   ---------------------------   --------------------
Round 1   1                               137                           118
Round 2   137                             171                           48
Round 3   171                             211                           40
Round 4   211                             233                           33
Round 5   233                             267                           31
Round 6   267                             289                           28
Round 7   289                             305                           27
Round 8   305                             311 (max value 312)           27
Round 9   311                             311 (no change)               26
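To put these numbers in perspective, the sketch below computes the throughput of each round and a rough lower bound on the round time. The lower bound is my own simplification, not something the platform reports: it assumes the 4000 invocations run in waves of exactly the ending concurrency, that each wave takes only the 1.5-second delay, and that the CPU loop and the time spent learning cost nothing. The class and method names are hypothetical.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class RoundStats
{
    public const int MessagesPerRound = 4000;
    public const double DelaySeconds = 1.5;

    // Messages processed per second in a round that took `seconds`.
    public static double Throughput(int seconds) =>
        (double)MessagesPerRound / seconds;

    // Idealized time: ceil(4000 / concurrency) waves of 1.5 seconds each.
    public static double IdealSeconds(int concurrency) =>
        Math.Ceiling((double)MessagesPerRound / concurrency) * DelaySeconds;

    public static void Main()
    {
        // Values copied from the results of rounds 1 to 9.
        int[] endingConcurrency = { 137, 171, 211, 233, 267, 289, 305, 311, 311 };
        int[] seconds = { 118, 48, 40, 33, 31, 28, 27, 27, 26 };

        for (int i = 0; i < seconds.Length; i++)
        {
            Console.WriteLine(
                $"Round {i + 1}: {Throughput(seconds[i]):F1} msg/s, " +
                $"idealized time at concurrency {endingConcurrency[i]}: " +
                $"{IdealSeconds(endingConcurrency[i]):F1} s");
        }
    }
}
```

For round 9, this gives about 154 messages per second, and an idealized 19.5 seconds at a concurrency of 311 against the 26 seconds measured. The gap is expected, because the simplified model ignores CPU work and message-handling overhead.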

 

Test scenario 4:

I set the starting concurrency value to 500 in ‘azure-webjobs-hosts / concurrency / functionApp_Name / concurrencyStatus.json’ while ‘snapshotPersistenceEnabled’ was true in host.json. Then I sent batches of 4000 messages to the Service Bus queue continuously (over 50000 messages in total) and again observed the concurrency value and the time taken.

Finding:

The concurrency value decreased because of high CPU or thread pool starvation, as shown in the logs below. The total time for 4000 executions was always longer than 26 seconds. In the end, the concurrency value again converged, this time to around 320.

 

 

2022-06-13T15:52:12.598 [Debug] [HostMonitor] Host process CPU stats (PID 5760): History=(84,88,89,89,89), AvgCpuLoad=88, MaxCpuLoad=89

2022-06-13T15:52:12.598 [Debug] [HostMonitor] Host aggregate CPU load 88

2022-06-13T15:52:12.598 [Warning] [HostMonitor] Host CPU threshold exceeded (88 >= 80)

2022-06-13T15:52:12.598 [Warning] Possible thread pool starvation detected.

2022-06-13T15:52:12.599 [Debug] FunctionApp7.SBQueueFunction1.Run Decreasing concurrency (Enabled throttles: CPU,ThreadPoolStarvation)

2022-06-13T15:52:12.599 [Debug] FunctionApp7.SBQueueFunction1.Run Concurrency: 427, OutstandingInvocations: 429

 

 

 

Conclusions:

  1. With dynamic concurrency enabled, the concurrency manager adjusts the concurrency value gradually by monitoring instance health metrics, such as CPU and thread utilization, and applies throttles as needed.
  2. If all of the instance’s health metrics are good and the concurrency value is low, the platform increases the value. When the instance is not healthy, the concurrency manager decreases the value. The concurrency value converges to a numerical range; in my tests, this range was 310 to 320.
  3. Setting the starting concurrency value too high is not a good choice. It might cause high CPU and make the executions slower.
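The convergence described in these conclusions can be pictured with a toy feedback loop. This is only an illustrative model, not the host's actual algorithm: the one-step increase and decrease, the linear CPU model, and the 0.25 CPU cost per invocation are all assumptions I made for the example. Only the CPU threshold of 80 comes from the logs above.

```csharp
using System;

// Toy model for illustration; the real concurrency manager is more sophisticated.
public static class ToyConcurrencyController
{
    // Threshold taken from the log line "Host CPU threshold exceeded (88 >= 80)".
    public const int CpuThreshold = 80;

    // Assumed load model: each in-flight invocation costs a fixed CPU share.
    public static double CpuLoad(int concurrency, double cpuPerInvocation) =>
        Math.Min(100.0, concurrency * cpuPerInvocation);

    // Decrease by 1 when the CPU throttle is active; otherwise increase by 1.
    public static int Step(int concurrency, double cpuLoad) =>
        cpuLoad >= CpuThreshold ? Math.Max(1, concurrency - 1) : concurrency + 1;

    // Runs the feedback loop for a number of steps and returns the final value.
    public static int Simulate(int start, int steps, double cpuPerInvocation)
    {
        int concurrency = start;
        for (int i = 0; i < steps; i++)
        {
            concurrency = Step(concurrency, CpuLoad(concurrency, cpuPerInvocation));
        }
        return concurrency;
    }

    public static void Main()
    {
        // Assumed cost per invocation, chosen only for illustration.
        const double cpuPerInvocation = 0.25;

        Console.WriteLine($"Start at 1:   {Simulate(1, 1000, cpuPerInvocation)}");
        Console.WriteLine($"Start at 500: {Simulate(500, 1000, cpuPerInvocation)}");
    }
}
```

Under these assumptions, both runs end at 319 or 320 whether they start from 1 or from 500, which mirrors the observation that the value settles into a narrow range regardless of the starting point.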

 

 

Last update: Jun 17 2022 01:34 AM