May 29 2024 02:58 PM
Hi Azure Cosmos DB Team,
We haven't explicitly set a retry policy for throttling, so the client uses the default throttling retry policy.
The diagnostics show the following:
throttlingRetryOptions=RetryOptions{maxRetryAttemptsOnThrottledRequests=9, maxRetryWaitTime=PT30S}
However, when we encountered actual throttling ("statusCode":429, "subStatusCode":3200), the diagnostics showed values increasing in multiples of 4 ("retryAfterInMs":4.0 with x-ms-retry-after-ms=4, then "retryAfterInMs":8.0 with x-ms-retry-after-ms=8), and the request eventually failed with: "Request rate is large. More Request Units may be needed, so no changes were made. Please retry this request later."
Can you please explain the difference between maxRetryWaitTime, as shown in throttlingRetryOptions, and retryAfterInMs, as seen in the diagnostics above when throttling occurs? Based on the throttlingRetryOptions setting, I was expecting a throttled request to be retried only after 30 seconds. The current behavior has a compounding effect under concurrent requests, which affects overall throughput. We need to customize the number of retries and the retry interval for throttling to suit our requirements. Which parameters should we use for that?
With Regards,
Nitin Rahim
May 31 2024 08:27 PM
Jun 03 2024 05:25 AM
Jun 03 2024 05:51 AM
Jun 03 2024 09:52 AM
@nitinrahim `maxRetryWaitTime` in `ThrottlingRetryOptions` is the maximum cumulative time the SDK will spend waiting across its internal retries before it stops retrying and returns the error to the client. It is not the delay before each retry. You can read about it in the Java docs for `setMaxRetryWaitTime()` - ThrottlingRetryOptions (Azure SDK for Java Reference Documentation) (azuresdkdocs.blob.core.windows....
What you see in the diagnostics, on the other hand, is the retry-after header. The backend service returns it to the client-side SDK with each throttled response to tell the client how long to wait before the next internal retry. The value is usually in milliseconds and usually increases in multiples of 2 as throttling continues. Applications and end users of the SDK cannot control this internal header.
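To make the relationship between the two settings and the retry-after hint concrete, here is a minimal sketch in plain Java. It is a simplified model, not the SDK's actual implementation: the class `ThrottleRetrySketch`, its methods, and the doubling hint sequence starting at 4 ms are all assumptions made for illustration. The model waits for each server hint and gives up when either the attempt limit is reached or the cumulative wait would exceed the maximum wait time.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Simplified model of client-side throttling retries; NOT the SDK's actual code.
class ThrottleRetrySketch {

    /** Result of one simulated request: retries performed and total time waited. */
    static final class Outcome {
        final int retries;
        final Duration totalWait;
        final String stoppedBecause;

        Outcome(int retries, Duration totalWait, String stoppedBecause) {
            this.retries = retries;
            this.totalWait = totalWait;
            this.stoppedBecause = stoppedBecause;
        }
    }

    /**
     * Each entry in retryAfterHints stands for one throttled (429) response
     * carrying that retry-after value. The loop retries until a limit is hit.
     */
    static Outcome simulate(int maxRetryAttempts, Duration maxRetryWaitTime,
                            List<Duration> retryAfterHints) {
        int retries = 0;
        Duration totalWait = Duration.ZERO;
        for (Duration hint : retryAfterHints) {
            if (retries >= maxRetryAttempts) {
                return new Outcome(retries, totalWait, "attempt limit reached");
            }
            if (totalWait.plus(hint).compareTo(maxRetryWaitTime) > 0) {
                return new Outcome(retries, totalWait, "cumulative wait limit reached");
            }
            totalWait = totalWait.plus(hint); // wait for the server-provided hint
            retries++;
        }
        return new Outcome(retries, totalWait, "no more throttled responses");
    }

    /** Hypothetical hint sequence that doubles each time: 4, 8, 16, ... ms. */
    static List<Duration> doublingHints(long firstMs, int count) {
        List<Duration> hints = new ArrayList<>();
        long ms = firstMs;
        for (int i = 0; i < count; i++) {
            hints.add(Duration.ofMillis(ms));
            ms *= 2;
        }
        return hints;
    }

    public static void main(String[] args) {
        // Limits taken from the diagnostics in the question: 9 attempts, PT30S.
        Outcome o = simulate(9, Duration.ofSeconds(30), doublingHints(4, 12));
        System.out.println(o.retries + " retries, " + o.totalWait.toMillis()
                + " ms waited, stopped: " + o.stoppedBecause);
    }
}
```

Under these assumed hints, the nine attempts are used up after 2044 ms of cumulative waiting, long before the 30-second ceiling comes into play. That would be consistent with seeing small retryAfterInMs values followed by a 429 returned to the application.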
For concurrent requests, if you would like the SDK to keep retrying for longer, you can increase `maxRetryAttemptsOnThrottledRequests`, `maxRetryWaitTime`, or both. However, if throttling happens across many concurrent requests, one way to resolve it is to increase the throughput of the database or container. Another option is the ThroughputControl mechanism built into the Java SDK. You can refer to the sample here - azure-cosmos-java-sql-api-samples/src/main/java/com/azure/cosmos/examples/throughputcontrol/async/Th...
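As a rough sketch of how the two retry limits can be customized when building the client: the class name `CustomThrottlingRetry`, the `endpoint` and `key` parameters, and the values 15 and 60 seconds are placeholders chosen for illustration, not recommendations. The snippet needs the azure-cosmos dependency on the classpath.

```java
import com.azure.cosmos.CosmosAsyncClient;
import com.azure.cosmos.CosmosClientBuilder;
import com.azure.cosmos.ThrottlingRetryOptions;
import java.time.Duration;

class CustomThrottlingRetry {

    // endpoint and key are placeholders supplied by the caller.
    static CosmosAsyncClient buildClient(String endpoint, String key) {
        ThrottlingRetryOptions retryOptions = new ThrottlingRetryOptions()
            // Illustrative value; the diagnostics in the question show a default of 9.
            .setMaxRetryAttemptsOnThrottledRequests(15)
            // Illustrative value; the diagnostics in the question show a default of PT30S.
            .setMaxRetryWaitTime(Duration.ofSeconds(60));

        return new CosmosClientBuilder()
            .endpoint(endpoint)
            .key(key)
            .throttlingRetryOptions(retryOptions)
            .buildAsyncClient();
    }
}
```

Raising these limits only lets the SDK retry for longer. The wait before each individual retry still comes from the retry-after value returned by the service.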
Jun 03 2024 02:40 PM