TOC
- Introduction
- Environment Variable
- Build Time
- Compatible
- Memory
- Conclusion
1. Introduction
One of the most common issues during project development is the scenario where “the application runs perfectly in the local environment but fails after being deployed to Azure.”
In most cases, deployment logs will clearly reveal the problem and allow you to fix it quickly.
However, there are also more complicated situations where, because of the nature of the error itself, the relevant logs are difficult to locate.
This article introduces several common categories of such problems and explains how to troubleshoot them.
We will demonstrate them using Python and popular AI-related packages, as these tend to exhibit compatibility-related behavior.
Before you begin, we recommend reading "Deployment and Build from Azure Linux based Web App" (Microsoft Community Hub), which explains how Azure Linux-based Web Apps perform deployments, so that you have a basic understanding of the build process.
2. Environment Variable
Simulating a Local Flask + sklearn Project
First, let’s simulate a minimal Flask + sklearn project in any local environment (VS Code in this example).
For simplicity, the sample code does not actually use any sklearn functions; it only displays plain text.
app.py
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "hello deploy environment variable"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
We also preset the environment variables required during Azure deployment, although these will not be used when running locally.
.deployment
[config]
SCM_DO_BUILD_DURING_DEPLOYMENT=false
As you may know, the old package name sklearn has long been deprecated in favor of scikit-learn.
However, for the purpose of simulating a compatibility error, we will intentionally specify the outdated package name.
requirements.txt
Flask==3.1.0
gunicorn==23.0.0
sklearn
After running the project locally, you can open a browser and navigate to the target URL to verify the result.
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python app.py
Of course, you may encounter the same compatibility issue even in your local environment.
Setting the following variable in your shell before running pip install resolves it:
export SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True
We will revisit this error and its solution shortly.
For now, create a Linux Web App running Python 3.12 and configure the following environment variables.
We will define Oryx Build as the deployment method.
SCM_DO_BUILD_DURING_DEPLOYMENT=false
WEBSITE_RUN_FROM_PACKAGE=false
ENABLE_ORYX_BUILD=true
After deploying the code and checking the Deployment Center, you should see an error similar to the following.
From the detailed error message, the cause is clear:
The sklearn package name is deprecated and has been replaced by scikit-learn, so pip refuses to install it unless you explicitly opt in to the deprecated package.
The error message suggests the following solutions:
- Install the newer scikit-learn package directly.
- If your project is deeply coupled to the old sklearn package and cannot be refactored yet, enable compatibility by setting an environment variable to allow installation of the deprecated package.
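To catch this kind of problem before deploying, you could scan requirements.txt for deprecated package names. The following is a minimal sketch, not part of the original project: the DEPRECATED_NAMES mapping and the find_deprecated_requirements helper are hypothetical names introduced for illustration, and the mapping contains only the one rename discussed in this article.

```python
import re

# Hypothetical mapping for illustration: deprecated PyPI name -> maintained name.
DEPRECATED_NAMES = {"sklearn": "scikit-learn"}


def find_deprecated_requirements(requirements_text):
    """Return (line_number, old_name, new_name) for each deprecated package."""
    findings = []
    for number, raw in enumerate(requirements_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options (-r, -e)
            continue
        # The package name ends at the first specifier, extra, marker, or space.
        name = re.split(r"[<>=!~\[;\s]", line, maxsplit=1)[0].lower()
        if name in DEPRECATED_NAMES:
            findings.append((number, name, DEPRECATED_NAMES[name]))
    return findings


sample = "Flask==3.1.0\ngunicorn==23.0.0\nsklearn\n"
for number, old, new in find_deprecated_requirements(sample):
    print(f"line {number}: replace '{old}' with '{new}'")
```

Running a check like this in a pre-deployment step surfaces the outdated name locally, even when the deprecated package is already installed in your virtual environment.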
Typically, this type of “works locally but fails on Azure” behavior happens because the deprecated package was installed in the local environment a long time ago at the start of the project, and everything has been running smoothly since.
Package compatibility issues like this are very common across various languages on Linux.
When a project becomes tightly coupled to an outdated package, you may not be able to upgrade it immediately.
In these cases, compatibility workarounds are often the only practical short-term solution.
In our example, we will add the environment variable:
SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True
However, here comes the real problem:
This variable is needed during the build phase, but the environment variables set in Azure Portal’s Application Settings only take effect at runtime. So what should we do?
The answer is simple: move the build step from deployment time to runtime.
First, open Azure Portal → Configuration and disable Oryx Build.
ENABLE_ORYX_BUILD=false
Next, modify the project by adding a startup script.
run.sh
#!/bin/bash
export SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python app.py
The startup script works just like the commands you run locally before executing the application.
The difference is that you can inject the necessary compatibility environment variables before running pip install or starting the app.
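The same idea can be expressed in Python: build a copy of the environment that contains the compatibility variable and pass it to the pip subprocess. This is only a sketch under that assumption; build_install_env, pip_install_command, and install are hypothetical helper names, and the article's own solution remains the run.sh script.

```python
import os
import subprocess
import sys

FLAG = "SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL"


def build_install_env(base_env=None):
    """Return a copy of the environment with the compatibility flag set."""
    env = dict(os.environ if base_env is None else base_env)
    env[FLAG] = "True"
    return env


def pip_install_command(requirements="requirements.txt"):
    """Build the pip command for the interpreter that runs this script."""
    return [sys.executable, "-m", "pip", "install", "-r", requirements]


def install(requirements="requirements.txt"):
    """Run pip with the flag injected; raises CalledProcessError on failure."""
    subprocess.run(pip_install_command(requirements),
                   env=build_install_env(), check=True)


# Show what would run without installing anything.
print(" ".join(pip_install_command()))
print(f"{FLAG}={build_install_env()[FLAG]}")
```

Because the variable is set on the subprocess environment only, it affects the install step without changing the environment of the calling process.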
After that, return to Azure Portal and add the following Startup Command under Stack Settings.
This ensures that your compatibility environment variables and build steps run before the application starts.
bash run.sh
Your overall project structure will now look like this.
Once redeployed, everything should work correctly.
3. Build Time
Build-Time Errors Caused by AI-Related Packages
Many build-time failures are caused by AI-related packages, whose installation processes can be extremely time-consuming.
You can investigate these issues by reviewing the deployment logs on the app's Kudu (SCM) site at the following URL:
https://<YOUR_APP_NAME>.scm.azurewebsites.net/newui
Compatible
Let’s simulate a Flask + numpy project.
The code is shown below.
app.py
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    return "hello deploy compatible"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
We reuse the same environment variables from the sklearn example.
.deployment
[config]
SCM_DO_BUILD_DURING_DEPLOYMENT=false
This time, we simulate the incompatibility between numpy==1.21.0 and Python 3.10.
requirements.txt
Flask==3.1.0
gunicorn==23.0.0
numpy==1.21.0
We will skip the local execution part and move directly to creating a Linux Web App running Python 3.10.
Configure the same environment variables as before, and define the deployment method as runtime build.
SCM_DO_BUILD_DURING_DEPLOYMENT=false
WEBSITE_RUN_FROM_PACKAGE=false
ENABLE_ORYX_BUILD=false
After deployment, Deployment Center shows a successful publish.
However, the actual website displays an error.
At this point, you must check the deployment log files mentioned earlier.
You will find two key logs:
1. docker.log
- Displays real-time logs of the platform creating and starting the container.
- In this case, you will see that the health probe exceeded the default 230-second startup window, which caused the container startup to fail.
- This tells us that the root cause is a container startup timeout.
To determine why it timed out, we must inspect the second file.
2. default_docker.log
- Contains the internal execution logs of the container.
- Is not generated in real time; entries are usually delayed by around 15 minutes.
- Therefore, if docker.log shows a timeout error, wait at least 15 minutes to allow the logs to be written here.
In this example, the internal log shows that numpy was being compiled during pip install, and the compilation step took too long.
We now have a concrete diagnosis: numpy 1.21.0 has no prebuilt wheel for Python 3.10, so pip falls back to compiling it from source.
The compilation exceeds the platform’s startup time limit (230 seconds) and causes the container to fail.
We can verify this by checking numpy’s official site:
numpy 1.21.0 provides wheels only for cp37, cp38, and cp39, not for cp310 (Python 3.10).
Thus, compilation becomes unavoidable.
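The wheel check can also be done programmatically. Wheel filenames follow the pattern name-version[-build]-python-abi-platform.whl, so the Python tag is always the third field from the end. The sketch below relies on that convention; the helper names are hypothetical, and the filenames are illustrative entries modeled on the cp37, cp38, and cp39 wheels described above rather than a verified file listing.

```python
import sys


def python_tag(version_info=None):
    """Return the CPython wheel tag for a version, e.g. (3, 10) -> 'cp310'."""
    major, minor = (version_info or sys.version_info)[:2]
    return f"cp{major}{minor}"


def wheel_python_tags(filename):
    """Extract the set of python tags from a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    fields = filename[: -len(".whl")].split("-")
    return set(fields[-3].split("."))  # handles compressed tags like 'py2.py3'


def has_wheel_for(filenames, tag):
    """True if any wheel in the list targets the given python tag."""
    wheels = (name for name in filenames if name.endswith(".whl"))
    return any(tag in wheel_python_tags(name) for name in wheels)


# Illustrative filenames only; check the real release files for your package.
files = [
    "numpy-1.21.0-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl",
    "numpy-1.21.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl",
    "numpy-1.21.0-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl",
    "numpy-1.21.0.zip",
]
target = python_tag((3, 10))
print(target, "wheel available:", has_wheel_for(files, target))
```

If you would rather have pip fail fast than start a long compilation, pip's --only-binary=:all: option makes it refuse source builds, which turns a slow timeout into an immediate, readable error.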
Possible Solutions
- Set the environment variable WEBSITES_CONTAINER_START_TIME_LIMIT to increase the allowed container startup time.
- Downgrade Python to 3.9 or earlier.
- Upgrade numpy to a release newer than 1.21.0 that ships wheels for Python 3.10. In this example, we choose this option.
After changing the numpy version in requirements.txt to 1.25.0 (which supports Python 3.10) and redeploying, the issue is resolved.
requirements.txt
Flask==3.1.0
gunicorn==23.0.0
numpy==1.25.0
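If you instead choose to raise the startup time limit, it helps to know roughly how long the install step takes. The following sketch times an arbitrary command and compares it with the 230-second default mentioned above. The helper names are hypothetical, and timings measured on your own machine are only a rough guide, since App Service hardware will differ.

```python
import subprocess
import sys
import time

DEFAULT_START_LIMIT_SECONDS = 230  # default startup window cited in this article


def time_command(command):
    """Run a command and return the elapsed wall-clock seconds."""
    start = time.monotonic()
    subprocess.run(command, check=True)
    return time.monotonic() - start


def fits_startup_window(elapsed_seconds, limit_seconds=DEFAULT_START_LIMIT_SECONDS):
    """True if the measured time is below the startup limit."""
    return elapsed_seconds < limit_seconds


# Placeholder command; in practice you would time the pip install step,
# e.g. [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"].
elapsed = time_command([sys.executable, "-c", "pass"])
print(f"elapsed: {elapsed:.1f}s, fits default window: {fits_startup_window(elapsed)}")
```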
Memory
The final example concerns the App Service SKU.
AI packages such as Streamlit, PyTorch, and others require significant memory.
Any one of these packages may cause the build process to fail due to insufficient memory.
The error messages vary widely each time.
If you repeatedly encounter unexplained build failures, check Deployment Center or default_docker.log for Exit Code 137, which means the process was forcibly killed and typically indicates that the system ran out of memory during the build.
The only solution in such cases is to scale up.
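The exit code itself can be decoded: by shell convention, a code above 128 means the process was terminated by signal number (code - 128), and 137 - 128 = 9 is SIGKILL, the signal the Linux out-of-memory killer sends. A minimal sketch, assuming a POSIX system; describe_exit_code is a hypothetical helper name:

```python
import signal


def describe_exit_code(code):
    """Explain an exit code using the shell convention 128 + signal number."""
    if code > 128:
        number = code - 128
        try:
            name = signal.Signals(number).name
        except ValueError:  # no signal with this number on this platform
            return f"exit code {code}: terminated by unknown signal {number}"
        return f"exit code {code}: terminated by {name}"
    return f"exit code {code}: process exited on its own with status {code}"


print(describe_exit_code(137))
```

SIGKILL alone does not prove an out-of-memory condition, but combined with a memory-heavy install on a small SKU it is a strong hint.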
4. Conclusion
This article introduced several common troubleshooting techniques for resolving Linux Web App issues that arise during the build stage.
Most of these problems relate to package compatibility, although the symptoms may vary greatly.
By understanding the debugging process demonstrated in these examples, you will be better prepared to diagnose and resolve similar issues in future deployments.