Forum Discussion
MasterRedduke
Jan 23, 2024 (Copper Contributor)
Python app on azure web app request concurrency
Kidd_Ip
Mar 16, 2025 (MVP)
There are several ways to let your application handle more requests at the same time:
1. Increase the Number of Threads
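- Gunicorn supports threaded workers, which let each worker process handle several requests concurrently. You can configure this in your app_command_line by adding the --threads option (with more than one thread, the default sync worker is switched to the gthread worker class):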
gunicorn --workers=5 --threads=4 app:app
- This configuration runs 4 threads in each of the 5 workers, so a single instance can process up to 20 requests at once (5 workers × 4 threads).
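To make the arithmetic explicit, here is a minimal sketch of the upper bound for one instance. The helper name max_concurrent_requests is hypothetical and not part of Gunicorn. It only multiplies the two settings, so it says nothing about how fast each request completes.

```python
def max_concurrent_requests(workers: int, threads: int = 1) -> int:
    """Upper bound on requests one Gunicorn instance can process at once.

    Hypothetical helper for illustration: each worker process runs
    `threads` threads, and each thread handles one request at a time.
    """
    if workers < 1 or threads < 1:
        raise ValueError("workers and threads must both be at least 1")
    return workers * threads


if __name__ == "__main__":
    # Mirrors: gunicorn --workers=5 --threads=4 app:app
    print(max_concurrent_requests(workers=5, threads=4))
```

Because of Python's global interpreter lock, extra threads mostly help when requests spend their time waiting on I/O. For CPU-heavy requests, adding workers tends to matter more than adding threads.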
2. Use Asynchronous Frameworks
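- If your application is I/O-bound (e.g., waiting for database queries or external API calls), consider an asynchronous (ASGI) framework such as FastAPI, served by an ASGI server. Because such frameworks use non-blocking I/O, one worker can keep many requests in flight while they wait. Note that Flask's async views do not give the same benefit when Flask runs under a WSGI server: each request still occupies a worker until it finishes.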
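The effect of non-blocking I/O can be sketched with the standard library alone. This is not FastAPI code: handle_request, serve_many and run_demo are hypothetical names, and asyncio.sleep stands in for a database query or an external API call.

```python
import asyncio
import time


async def handle_request(request_id: int, io_delay: float = 0.05) -> int:
    # asyncio.sleep is a stand-in for a non-blocking I/O wait
    # (database query, HTTP call); the event loop runs other
    # requests while this one is waiting.
    await asyncio.sleep(io_delay)
    return request_id


async def serve_many(n: int) -> list:
    # Start all n simulated requests and wait for every one to finish.
    return await asyncio.gather(*(handle_request(i) for i in range(n)))


def run_demo(n: int = 200) -> tuple:
    # Returns (number of completed requests, elapsed seconds).
    start = time.perf_counter()
    results = asyncio.run(serve_many(n))
    return len(results), time.perf_counter() - start


if __name__ == "__main__":
    count, elapsed = run_demo()
    print(f"{count} simulated requests finished in {elapsed:.2f}s")
```

Handled one after another, 200 requests of 0.05 s each would take about 10 s. Here they overlap on a single thread, so the total stays close to the duration of one request. The gain disappears if the handler calls blocking code, so database drivers and HTTP clients need to be async as well.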
3. Scale Out with Azure App Service
- Azure App Service allows you to scale out horizontally by adding more instances of your app. This can be configured in the Azure portal under Scale out (App Service plan). App Service then load-balances incoming requests across the instances.
4. Optimize Gunicorn Configuration
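- Fine-tune Gunicorn's configuration to match your workload. For example:
  - Increase the --timeout value (default: 30 seconds) if requests take longer to process.
  - Use the --worker-class option to choose an appropriate worker type. For example:
    - sync (default): each worker handles one request at a time. Best for CPU-bound tasks.
    - gevent or eventlet: cooperative, greenlet-based workers. Best for I/O-bound tasks. The corresponding package must be installed alongside Gunicorn.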
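These settings can also be kept in a Gunicorn configuration file, which is a plain Python module. The sketch below is one possible starting point, not a recommendation: the file name gunicorn.conf.py, the bind address and every value are assumptions to adjust for your own workload.

```python
# gunicorn.conf.py
# Illustrative values only; all of them are assumptions to tune per workload.

# Address and port Gunicorn listens on (assumed; match your app's setup).
bind = "0.0.0.0:8000"

# Number of worker processes.
workers = 5

# Threads per worker; only used by the threaded (gthread) worker.
threads = 4

# Worker type: "sync" (default), "gthread", "gevent" or "eventlet".
# "gevent" and "eventlet" require the matching package to be installed.
worker_class = "gthread"

# Seconds a worker may stay silent before it is killed and restarted.
# Raise this if legitimate requests run longer than the 30 s default.
timeout = 120
```

The startup command would then point at the file, for example: gunicorn -c gunicorn.conf.py app:app. Keeping the settings in a file makes them easier to review and version than a long command line.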