Microsoft Mission Critical Blog

🚀 Scaling Dynamics 365 CRM Integrations in Azure: The Right Way to Use the SDK ServiceClient

PravinT
Aug 25, 2025

When integrating Dynamics 365 CRM with Azure—whether through App Services or Azure Functions—developers often rely on the ServiceClient from the CRM SDK. The client is powerful and async-friendly, but misusing it in high-load environments can lead to serious scalability and reliability issues.

 

This blog explores common pitfalls and presents a scalable pattern using the .Clone() method to ensure thread safety, avoid redundant authentication, and prevent SNAT port exhaustion.

šŸ—ŗļø Connection Factory with Optimized Configuration

The first step to building a scalable integration is to configure your ServiceClient properly. Here's how to set up a connection factory that includes all the necessary performance optimizations: 

using System.Net;
using System.Threading;
using Microsoft.PowerPlatform.Dataverse.Client;

public static class CrmClientFactory {
    private static readonly ServiceClient _baseClient;

    static CrmClientFactory() {
        ThreadPool.SetMinThreads(100, 100); // Faster thread ramp-up
        ServicePointManager.DefaultConnectionLimit = 65000; // Avoid connection bottlenecks
        ServicePointManager.Expect100Continue = false; // Reduce HTTP latency
        ServicePointManager.UseNagleAlgorithm = false; // Improve responsiveness

        // connectionString is assumed to come from configuration (app settings, Key Vault, etc.)
        _baseClient = new ServiceClient(connectionString);
        _baseClient.EnableAffinityCookie = false; // Distribute load across Dataverse web servers
    }

    public static ServiceClient GetClient() => _baseClient.Clone();
}

 

āŒ Anti-Pattern: One Static Client for All Operations

A common anti-pattern is to create a single static instance of ServiceClient and reuse it across all operations: 

public static class CrmClientFactory {
    private static readonly ServiceClient _client = new ServiceClient(connectionString);
    public static ServiceClient GetClient() => _client;
}

This struggles under load: every thread contends for the same underlying connection, requests queue behind one another, and throttling or transient failures affect all operations at once.

āš ļø Misleading Fix: New Client Per Request

To avoid thread contention, some developers create a new ServiceClient per request. However, the code below does not actually create a separate connection unless the RequireNewInstance=True connection string parameter or the useUniqueInstance:true constructor parameter is used. These details are often missed, so the same connection ends up shared across threads, with long lock waits compounding the overall slowness.

public async Task Run(HttpRequest req) {
    var client = new ServiceClient(connectionString); // New connection and OAuth handshake on every request

    // Use client here
}

Even with those flags, a high-throughput integration still risks authentication failures and SNAT port exhaustion, because every ServiceClient created through the constructor repeats the OAuth authentication.
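
For reference, this is roughly how those two flags are applied. It is a minimal sketch: the URL, client ID, and secret are placeholders, and the client-secret constructor overload is assumed. Note that each construction still performs a full authentication, which is exactly the overhead the pattern below avoids.

// Option 1: force a separate connection via the connection string parameter (placeholder values).
var perRequestConnectionString =
    "AuthType=ClientSecret;Url=https://contoso.crm.dynamics.com;" +
    "ClientId=<app-id>;ClientSecret=<secret>;RequireNewInstance=True";
var client1 = new ServiceClient(perRequestConnectionString);

// Option 2: force a separate connection via the constructor parameter.
var client2 = new ServiceClient(
    new Uri("https://contoso.crm.dynamics.com"),
    "<app-id>",
    "<secret>",
    useUniqueInstance: true);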

✅ Best Practice: Clone Once, Reuse Per Request

The best practice is to create a single authenticated ServiceClient and use its .Clone() method to generate lightweight, thread-safe copies for each request: 

public static class CrmClientFactory {
    private static readonly ServiceClient _baseClient = new ServiceClient(connectionString);
    public static ServiceClient GetClient() => _baseClient.Clone();
}

Then, in your Azure Function or App Service operation: 

ā— Avoid calling the factory again inside helper methods. Clone once and pass the client down the call stack. 

public async Task HandleRequest() {
    var client = CrmClientFactory.GetClient(); // Clone once per request
    await DoSomething1(client);
    await DoSomething2(client);
}

public async Task DoSomething1(ServiceClient client) {
    await client.CreateAsync(new Entity("account")); // Reuse the passed-down client as-is; do not clone again
}

 

🧵 Parallel Processing with Batching

When working with large datasets, combining parallelism with batching using ExecuteMultiple can significantly improve throughput—if done correctly. 

🔄 Common Mistake: Dynamic Batching Inside Parallel Loops

Many implementations dynamically batch records inside Parallel.ForEach, assuming consistent batch sizes. But in practice, this leads to: 

  • Inconsistent batch sizes (1 to 100+) 
  • Unpredictable performance 
  • Difficult-to-analyze telemetry 

✅ Fix: Chunk Before You Batch

public static List<List<Entity>> ChunkRecords(List<Entity> records, int chunkSize)
{
    return records
        .Select((record, index) => new { record, index })
        .GroupBy(x => x.index / chunkSize)
        .Select(g => g.Select(x => x.record).ToList())
        .ToList();
}

public static void ProcessBatches(List<Entity> records, ServiceClient serviceClient, int batchSize = 100, int maxParallelism = 5)
{
    var batches = ChunkRecords(records, batchSize);

    Parallel.ForEach(batches, new ParallelOptions { MaxDegreeOfParallelism = maxParallelism }, batch =>
    {
        using var service = serviceClient.Clone(); // Clone once per thread
        var executeMultiple = new ExecuteMultipleRequest
        {
            Requests = new OrganizationRequestCollection(),
            Settings = new ExecuteMultipleSettings
            {
                ContinueOnError = true,
                ReturnResponses = false
            }
        };

        foreach (var record in batch)
        {
            executeMultiple.Requests.Add(new CreateRequest { Target = record });
        }

        service.Execute(executeMultiple);
    });
}
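
To show how these helpers fit together, here is a minimal, hypothetical usage; the entity and attribute names are illustrative:

// Hypothetical usage of the helpers above: create 10,000 contacts in fixed batches of 100,
// with at most 5 batches in flight at a time.
var records = Enumerable.Range(0, 10_000)
    .Select(i => new Entity("contact") { ["lastname"] = $"Contact {i}" })
    .ToList();

using var client = CrmClientFactory.GetClient(); // Clone once for this job
ProcessBatches(records, client, batchSize: 100, maxParallelism: 5);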

 

🚫 Avoiding Throttling: Plan, Don’t Just Retry

While it’s possible to implement retry logic for HTTP 429 responses using the Retry-After header, the best approach is to avoid throttling altogether. 
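
If you do implement a reactive retry, a minimal sketch could look like the following. It assumes throttling surfaces as a FaultException<OrganizationServiceFault> carrying a Retry-After TimeSpan in its error details; the helper name and defaults are illustrative, and production code should also verify the service protection error codes before retrying.

// Minimal sketch of reactive retry for service protection (HTTP 429-style) faults.
// Requires System.ServiceModel, Microsoft.Xrm.Sdk and Microsoft.PowerPlatform.Dataverse.Client.
public static async Task<OrganizationResponse> ExecuteWithRetryAsync(
    ServiceClient client, OrganizationRequest request, int maxRetries = 3)
{
    for (var attempt = 0; ; attempt++)
    {
        try
        {
            return await client.ExecuteAsync(request);
        }
        catch (FaultException<OrganizationServiceFault> ex) when (attempt < maxRetries)
        {
            // Honor the Retry-After hint when present; otherwise back off conservatively.
            var delay = ex.Detail.ErrorDetails.Contains("Retry-After")
                ? (TimeSpan)ex.Detail.ErrorDetails["Retry-After"]
                : TimeSpan.FromSeconds(30);

            await Task.Delay(delay);
        }
    }
}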

✅ Best Practices

  • Control the degree of parallelism (DOP) and batch size: Keep them conservative and telemetry-driven. 
  • Use alternate app registrations: Distribute load across identities without overloading the Dataverse organization.
  • Avoid triggering sync plugins or real-time workflows: These amplify load. 
  • Address long-running queries: Optimize slow operations (with Microsoft support assistance if needed) before scaling.
  • Relax time constraints: There’s no need to finish a job in 1 hour if it can be done safely in 3. 

🌐 When to Consider Horizontal Scaling

Even with all the right optimizations, your integration may still hit limits in the HTTP stack, such as: 

  • WCF binding timeouts 
  • SNAT port exhaustion 
  • Slowness not explained by Dataverse telemetry 

In these cases, horizontal scaling becomes essential. 

  • App Services: Easily scale out using autoscale rules. 
  • Function Apps (service model): Scale well with HTTP or Service Bus triggers. 
  • Scheduled Functions: Require deduplication logic to avoid duplicate processing. 
  • On-premises VMs: D365 SDK-based integrations hosted on VM infrastructure need to be scaled horizontally by adding servers.

🧠 Final Thoughts

Scaling CRM integrations in Azure is about resilience, observability, and control. Follow these patterns:

  • Clone once per thread
  • Pre-chunk batches
  • Tune with telemetry evidence
  • Avoid overload when you can
  • Scale horizontally when needed—but wisely 

Build integrations that are fast, reliable, and future-proof.

 

Updated Sep 10, 2025
Version 2.0