Forum Discussion
MasterRedduke
Jan 23, 2024 | Copper Contributor
Python app on azure web app request concurrency
Hey, I have a question about a Python app hosted on Azure; specifically, I have a problem with request concurrency. It is a Flask API, and in site_config I set app_command_line to start the app with gunicorn (from what I read there are more options, but I chose this one because it was the quickest for me to set up). I also read that the recommended number of workers is 2*cpu + 1, which is 5 in my case, while I would like my server to be able to handle 20 requests at the same time. Is there any remedy for this?
site_config {
  always_on        = true
  ftps_state       = "Disabled"
  app_command_line = "pip install -r requirements.txt && gunicorn --preload --bind=0.0.0.0 --timeout 600 app:app --workers=5 --threads=10 --worker-class=gthread"
  #app_command_line = "pip install -r requirements.txt && gunicorn --preload --bind=0.0.0.0 --workers=4 --timeout 600 app:app"
  application_stack {
    python_version = 3.11
  }
}

1 Reply
There are several ways to let your application handle more requests simultaneously:
1. Increase the Number of Threads
- Gunicorn supports threading, which allows each worker to handle multiple requests concurrently. You can configure this in your app_command_line by adding the --threads option:
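gunicorn --workers=5 --threads=4 app:app
- This configuration gives each worker 4 threads, so one instance can process up to 20 requests concurrently (5 workers × 4 threads). Note that the command line in your question already sets --threads=10 with the gthread worker class, which allows up to 50 concurrent requests (5 workers × 10 threads).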
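The same settings can also be kept in a Gunicorn configuration file, which is plain Python. The following is a minimal sketch, assuming the file is named gunicorn.conf.py and is passed to Gunicorn with --config; the values simply mirror the flags discussed in this thread and are not tuned recommendations.

```python
# gunicorn.conf.py (assumed file name; pass it with: gunicorn --config gunicorn.conf.py app:app)
# Each name below is a standard Gunicorn setting; the values mirror the flags in this thread.

bind = "0.0.0.0"          # same as --bind=0.0.0.0
workers = 5               # same as --workers=5 (2 * cpu + 1 for 2 cores)
threads = 4               # same as --threads=4
worker_class = "gthread"  # threaded worker, so each worker serves several requests
timeout = 600             # same as --timeout 600
preload_app = True        # same as --preload

# Per-instance capacity with these values: workers * threads = 5 * 4 = 20 requests in flight.
```

With this file in place, the start command only needs the config flag and the app:app target, which keeps app_command_line shorter.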
2. Use Asynchronous Frameworks
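- If your application is I/O-bound (e.g., waiting for database queries or external API calls), consider using an asynchronous framework such as FastAPI, served by an ASGI server. These frameworks can handle many concurrent requests with fewer workers by leveraging non-blocking I/O. Note that Flask's async views still occupy one worker or thread per request when the app runs under a WSGI server such as Gunicorn, so they do not raise request concurrency on their own.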
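To illustrate why non-blocking I/O helps, here is a small sketch using only the standard library. It is not a web server: handle_request and serve_all are hypothetical names, and asyncio.sleep stands in for a database query or external API call.

```python
import asyncio
import time


async def handle_request(request_id: int) -> str:
    # Stand-in for an I/O wait (database query, external API call).
    # While this coroutine waits, the event loop runs the other requests.
    await asyncio.sleep(0.1)
    return f"request {request_id} done"


async def serve_all(n: int) -> list:
    # Start n simulated requests at once and wait for all of them.
    return await asyncio.gather(*(handle_request(i) for i in range(n)))


def run_demo(n: int = 20):
    # Returns the results and the total wall-clock time in seconds.
    start = time.perf_counter()
    results = asyncio.run(serve_all(n))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_demo(20)
    print(f"{len(results)} requests finished in {elapsed:.2f}s")
```

All 20 simulated requests share a single thread, and the total time stays close to one request's wait rather than 20 times that, because the waits overlap.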
3. Scale Out with Azure App Service
- Azure App Service allows you to scale out horizontally by adding more instances of your app. This can be configured in the Azure portal under Scale Out (App Service Plan). With multiple instances, you can distribute the load across them.
4. Optimize Gunicorn Configuration
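- Fine-tune Gunicorn's configuration to match your workload. For example:
  - Increase the timeout value if requests take longer to process.
  - Use the --worker-class option to choose an appropriate worker type. For example:
    - sync (default): Each worker handles one request at a time; best for CPU-bound tasks. If --threads is set above 1, Gunicorn uses the gthread worker instead.
    - gthread: Each worker runs a pool of threads; this is the worker class that the --threads option applies to.
    - gevent or eventlet: Best for I/O-bound tasks. The corresponding package must be installed, for example by listing it in requirements.txt.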
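
When trying different worker settings, it helps to check how many requests are really served in parallel. The sketch below fires 20 calls at once from a thread pool and measures the total time. fake_request is a hypothetical stand-in that only sleeps; to test a deployed app, replace it with a function that performs a real HTTP call to your endpoint.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request(delay: float = 0.2) -> int:
    # Hypothetical stand-in for one HTTP request to the app.
    # Sleeps to simulate latency and returns a status code.
    time.sleep(delay)
    return 200


def fire_concurrently(send, n: int = 20):
    # Submit n calls at the same time, one client thread per call,
    # and return the collected results plus the total wall-clock time.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n) as pool:
        futures = [pool.submit(send) for _ in range(n)]
        statuses = [f.result() for f in futures]
    return statuses, time.perf_counter() - start


if __name__ == "__main__":
    statuses, elapsed = fire_concurrently(fake_request, 20)
    print(f"{len(statuses)} responses in {elapsed:.2f}s")
```

If the total time is close to the latency of a single request, the server handled the calls in parallel; if it grows roughly in proportion to the number of calls, requests are queuing behind too few workers or threads.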