Apps on Azure Blog

Using Scikit-learn on Azure Web App

theringe
Microsoft
Nov 16, 2024

With the increasing adoption of AI frameworks in cloud environments, many users are now exploring ways to deploy Scikit-learn projects on Azure Web Apps. This tutorial provides a step-by-step guide to help you deploy your Scikit-learn project on an Azure Web App, covering everything from resource setup to troubleshooting common issues.

TOC

  1. Introduction to Scikit-learn
  2. System Architecture
    • Architecture
    • Focus of This Tutorial
  3. Setup Azure Resources
    • Web App
    • Storage
  4. Running Locally
    • File and Directory Structure
    • Training Models and Training Data
    • Predicting with the Model
  5. Publishing the Project to Azure
    • Deployment
    • Configuration
  6. Running on Azure Web App
    • Training the Model
    • Using the Model for Prediction
  7. Troubleshooting
    • Missing Environment Variables After Deployment
    • Virtual Environment Resource Lock Issues
    • Package Version Dependency Issues
    • Default Binding
    • Missing System Commands in Restricted Environments
  8. Conclusion
  9. References

 

1. Introduction to Scikit-learn

Scikit-learn is a popular open-source Python library for machine learning, built on NumPy, SciPy, and matplotlib. It offers an efficient and easy-to-use toolkit for data analysis, data mining, and predictive modeling.

Scikit-learn supports a variety of machine learning tasks, including classification, regression, clustering, and dimensionality reduction, with algorithms such as SVM, Random Forest, and K-means. Its preprocessing utilities handle tasks like scaling, encoding, and missing data imputation. It also provides tools for model evaluation (e.g., accuracy, precision, recall) and pipeline creation, enabling users to chain preprocessing and model training into seamless workflows.
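As a rough illustration of the pipeline and evaluation tools mentioned above, the following sketch chains a scaler and a logistic regression classifier and reports three metrics. The data is synthetic (generated with make_classification), and the step names "scale" and "clf" are arbitrary choices for this example, not something taken from the tutorial's repository.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary-classification data (a stand-in, not the tutorial's dataset).
X, y = make_classification(n_samples=300, n_features=13, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Chain preprocessing and the estimator so both are fitted in one call.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
pipeline.fit(X_train, y_train)

# Evaluate on the held-out split.
y_pred = pipeline.predict(X_test)
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
}
print(metrics)
```

Because the scaler lives inside the pipeline, it is fitted only on the training split, which avoids leaking test-set statistics into preprocessing.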

 

2. System Architecture

Architecture

 

Development Environment

  • OS: Windows 11
  • Version: 24H2
  • Python Version: 3.7.3

Azure Resources

  • App Service Plan: SKU - Premium Plan 0 V3
  • App Service: Platform - Linux (Python 3.9, Version 3.9.19)
  • Storage Account: SKU - General Purpose V2
  • File Share: No backup plan

Focus of This Tutorial 

This tutorial walks you through the following stages:

  1. Setting up Azure resources
  2. Running the project locally
  3. Publishing the project to Azure
  4. Running the application on Azure
  5. Troubleshooting common issues

Each of these stages can be handled with a number of different tools. The options used in this tutorial are marked with (V) below.

  • Local OS: Windows (V), Linux, Mac
  • How to set up Azure resources: Portal, i.e., REST API (V), ARM, Bicep, Terraform
  • How to deploy the project to Azure: VSCode (V), CLI, Azure DevOps, GitHub Actions

3. Setup Azure Resources

Web App

We need to create the following resources or services:

Name               Manual Creation Required   Resource/Service
App Service Plan   No                         Resource
App Service        Yes                        Resource
Storage Account    Yes                        Resource
File Share         Yes                        Service

 

Go to the Azure Portal and create an App Service.

Important configuration:

  • OS: Select Linux (default if Python stack is chosen).
  • Stack: Select Python 3.9 to avoid dependency issues.
  • SKU: Choose at least a Premium plan to ensure enough memory for your AI workloads.

Storage

Create a Storage Account in the Azure Portal.

Create a file share named data-and-model in the Storage Account.

Mount the File Share to the App Service: Use the name data-and-model for consistency with tutorial paths.

At this point, all Azure resources and services have been successfully created. Let’s take a slight detour and mount the recently created File Share to your Windows development environment.

Navigate to the File Share you just created, and refer to the diagram below to copy the required command. Before copying, please ensure that the drive letter remains set to the default "Z" as the sample code in this tutorial will rely on it.

Return to your development environment. Open a PowerShell terminal (do not run it as Administrator) and input the command copied in the previous step, as shown in the diagram.

After executing the command, the network drive will be successfully mounted. You can open File Explorer to verify, as illustrated in the diagram.
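Since the same File Share is reached through different roots on each side, a small helper can keep the paths in one place. The sketch below is only an illustration: the Windows root follows the "Z" drive letter used here, while the Linux root /data-and-model is an assumption about the App Service mount path and should be checked against your own mount configuration. The train, model, and test folder names are the ones created later in section 4.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Sub-directories that the tutorial's create-folder script sets up in the share.
SHARE_SUBDIRS = ("train", "model", "test")


def share_root(platform_name):
    """Return the File Share root for a platform string such as sys.platform."""
    if platform_name.startswith("win"):
        # Drive letter "Z" is the default the tutorial asks you to keep.
        return PureWindowsPath("Z:/")
    # ASSUMPTION: the share is mounted at /data-and-model on the App Service.
    return PurePosixPath("/data-and-model")


def share_dirs(platform_name):
    """Map each sub-directory name to its full path under the share root."""
    root = share_root(platform_name)
    return {name: root / name for name in SHARE_SUBDIRS}


for platform_name in ("win32", "linux"):
    print(platform_name, {k: str(v) for k, v in share_dirs(platform_name).items()})
```

Pure path classes are used so the helper can be exercised on any OS without touching the file system.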

 

4. Running Locally

File and Directory Structure

Please use VSCode to open a PowerShell terminal and enter the following commands:

git clone https://github.com/theringe/azure-appservice-ai.git
cd azure-appservice-ai
.\scikit-learn\tools\add-venv.cmd

If you are using a Linux or Mac platform, use the following alternative commands instead:

git clone https://github.com/theringe/azure-appservice-ai.git
cd azure-appservice-ai
bash ./scikit-learn/tools/add-venv.sh

After completing the execution, you should see the following directory structure:

 

File and Path: Purpose

  • scikit-learn/tools/add-venv.*: The script executed in the previous step (cmd for Windows, sh for Linux/Mac) to create all Python virtual environments required for this tutorial.
  • .venv/scikit-learn-webjob/: A virtual environment specifically used for training models.
  • scikit-learn/webjob/requirements.txt: The list of packages (with exact versions) required for the scikit-learn-webjob virtual environment.
  • .venv/scikit-learn/: A virtual environment specifically used for the Flask application, enabling API endpoint access for querying predictions.
  • scikit-learn/requirements.txt: The list of packages (with exact versions) required for the scikit-learn virtual environment.
  • scikit-learn/: The main folder for this tutorial.
  • scikit-learn/tools/create-folder.*: A script to create all directories required for this tutorial in the File Share, including train, model, and test.
  • scikit-learn/tools/download-sample-training-set.*: A script to download a sample training set from the UCI Machine Learning Repository, containing heart disease data, into the train directory of the File Share.
  • scikit-learn/webjob/train_heart_disease_model.py: A script for training the model. It loads the training set, applies a machine learning algorithm (Logistic Regression), and saves the trained model in the model directory of the File Share.
  • scikit-learn/webjob/train_heart_disease_model.sh: A shell script for Azure App Service web jobs. It activates the scikit-learn-webjob virtual environment and starts the train_heart_disease_model.py script.
  • scikit-learn/webjob/train_heart_disease_model.zip: A ZIP file containing the shell script for Azure web jobs. It must be recreated manually whenever train_heart_disease_model.sh is modified. Ensure it does not include any directory structure.
  • scikit-learn/api/app.py: Code for the Flask application, including routes, port configuration, input parsing, model loading, predictions, and output generation.
  • scikit-learn/.deployment: A configuration file for deploying the project to Azure using VSCode. It disables the default Oryx build process in favor of custom scripts.
  • scikit-learn/start.sh: A script executed after deployment (as specified in the Portal's startup command). It sets up the virtual environment and starts the Flask application to handle web requests.
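The requirement that the web job ZIP contain no directory structure is easy to get wrong when zipping by hand. One possible way to recreate it is sketched below with Python's zipfile module; the helper name build_webjob_zip is made up for this example, and the demo runs against a placeholder script in a temporary directory rather than the real repository file.

```python
import tempfile
import zipfile
from pathlib import Path


def build_webjob_zip(script_path, zip_path):
    """Write script_path into zip_path at the archive root (no folders)."""
    script_path = Path(script_path)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # arcname drops the parent directories from the stored entry name.
        archive.write(script_path, arcname=script_path.name)
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()


# Demo with a placeholder script so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "webjob" / "train_heart_disease_model.sh"
    script.parent.mkdir()
    script.write_text("#!/bin/bash\necho placeholder\n")
    names = build_webjob_zip(script, Path(tmp) / "train_heart_disease_model.zip")
    print(names)  # ['train_heart_disease_model.sh']
```

In the project itself, the two arguments would be scikit-learn/webjob/train_heart_disease_model.sh and scikit-learn/webjob/train_heart_disease_model.zip.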

Training Models and Training Data

Return to VSCode and execute the following commands (their purpose has been described earlier).

.\.venv\scikit-learn-webjob\Scripts\Activate.ps1
.\scikit-learn\tools\create-folder.cmd
.\scikit-learn\tools\download-sample-training-set.cmd
python .\scikit-learn\webjob\train_heart_disease_model.py

If you are using a Linux or Mac platform, use the following alternative commands instead:

source .venv/scikit-learn-webjob/bin/activate
bash ./scikit-learn/tools/create-folder.sh
bash ./scikit-learn/tools/download-sample-training-set.sh
python ./scikit-learn/webjob/train_heart_disease_model.py

After execution, the File Share will now include the following directories and files.

Let’s take a brief detour to examine the structure of the training data downloaded from the public dataset website.

The right side of the figure describes the meaning of each column in the dataset, while the left side shows the actual training data (after preprocessing).

The goal is a predictive model that uses an individual’s physiological characteristics to determine the likelihood of having heart disease. Columns 1-13 represent various physiological features and background information of the patients, while Column 14 (originally Column 58) is the label indicating whether the individual has heart disease.

The supervised learning process involves using a large dataset containing both features and labels. Machine learning algorithms (such as neural networks, SVMs, or in this case, logistic regression) identify the key features and their ranges that differentiate between labels. The trained model is then saved and can be used in services to predict outcomes in real time by simply providing the necessary features.
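The train, save, and reload cycle described above can be sketched in a few lines. This is not the repository's train_heart_disease_model.py: it uses synthetic data with 13 features in place of the heart disease training set, writes to a temporary directory instead of the File Share, and the file name heart_disease_model.joblib is hypothetical.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split


def train_and_save(model_path):
    """Fit a logistic regression model, persist it, and return a text report."""
    # Synthetic stand-in: 13 numeric features and a binary label.
    X, y = make_classification(n_samples=300, n_features=13, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)
    report = classification_report(y_test, model.predict(X_test))
    joblib.dump(model, model_path)  # serialize the fitted estimator
    return report


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "heart_disease_model.joblib"  # hypothetical name
    print(train_and_save(model_path))
    # A serving process only needs to load the file and call predict.
    reloaded = joblib.load(model_path)
    prediction = reloaded.predict([[0.0] * 13])
    print(prediction)
```

Separating training from serving in this way is what lets the web app stay lightweight: it loads a saved model and never has to see the training data.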

Predicting with the Model

Return to VSCode and execute the following commands. First, deactivate the virtual environment used for training the model, then activate the virtual environment for the Flask application, and finally, start the Flask app.

Commands for Windows:

deactivate
.\.venv\scikit-learn\Scripts\Activate.ps1
python .\scikit-learn\api\app.py

Commands for Linux or Mac:

deactivate
source .venv/scikit-learn/bin/activate
python ./scikit-learn/api/app.py

When you see a screen similar to the following, it means the server has started successfully. Press Ctrl+C to stop the server if needed.

Before conducting the actual test, let’s construct some sample human feature data:

[63, 1, 3, 145, 233, 1, 0, 150, 0, 2.3, 0, 0, 1]
[63, 1, 3, 305, 233, 1, 0, 150, 0, 2.3, 0, 0, 1]

Referring to the feature description table from earlier, we can see that the only modified field is Column 4 ("Resting Blood Pressure"), with the second sample having an abnormally high value. (Note: Normal resting blood pressure ranges are typically 90–139 mmHg.)
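To double-check which field differs between the two samples, they can be compared by name. The feature names below are the 13 attributes commonly used with the UCI heart disease data; that the tutorial's inputs follow exactly this order is an assumption, although it is consistent with Column 4 being resting blood pressure.

```python
# ASSUMPTION: inputs follow the usual attribute order of the UCI heart disease data.
FEATURE_NAMES = [
    "age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
    "thalach", "exang", "oldpeak", "slope", "ca", "thal",
]

sample_a = [63, 1, 3, 145, 233, 1, 0, 150, 0, 2.3, 0, 0, 1]
sample_b = [63, 1, 3, 305, 233, 1, 0, 150, 0, 2.3, 0, 0, 1]


def changed_features(first, second):
    """Return {feature name: (first value, second value)} for differing fields."""
    return {
        name: (a, b)
        for name, a, b in zip(FEATURE_NAMES, first, second)
        if a != b
    }


print(changed_features(sample_a, sample_b))  # {'trestbps': (145, 305)}
```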

Next, open a PowerShell terminal and use the following curl commands to send requests to the app:

curl -X GET http://127.0.0.1:8000/api/detect -H "Content-Type: application/json" -d '{"info": [63, 1, 3, 145, 233, 1, 0, 150, 0, 2.3, 0, 0, 1]}'
curl -X GET http://127.0.0.1:8000/api/detect -H "Content-Type: application/json" -d '{"info": [63, 1, 3, 305, 233, 1, 0, 150, 0, 2.3, 0, 0, 1]}'

You should see the prediction results, confirming that the trained model is working as expected.
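If curl behaves unexpectedly (in Windows PowerShell, for instance, curl can resolve to an alias for Invoke-WebRequest rather than the curl binary), the same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl call: a GET to /api/detect with a JSON body containing "info". The helper names are made up for this example, and the response is returned as raw text because its exact format is not specified here.

```python
import json
import urllib.request


def build_detect_request(base_url, features):
    """Build a GET request to /api/detect carrying the features as JSON."""
    body = json.dumps({"info": features}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/detect",
        data=body,
        headers={"Content-Type": "application/json"},
        method="GET",  # matches the curl -X GET call, body included
    )


def detect(base_url, features, timeout=10):
    """Send the request and return the response body as text."""
    request = build_detect_request(base_url, features)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


request = build_detect_request(
    "http://127.0.0.1:8000",
    [63, 1, 3, 145, 233, 1, 0, 150, 0, 2.3, 0, 0, 1],
)
print(request.get_method(), request.full_url)
```

With the Flask app running locally, calling detect("http://127.0.0.1:8000", [...]) would perform the actual request.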

5. Publishing the Project to Azure

Deployment

In the VSCode interface, right-click on the target App Service where you plan to deploy your project. Manually select the local project folder named scikit-learn as the deployment source, as shown in the image below.

Configuration

After deployment, the App Service will not be functional yet and will still display the default welcome page. This is because the App Service has not been configured to build the virtual environment and start the Flask application.

To complete the setup, go to the Azure Portal and navigate to the App Service. The following steps are critical, and their execution order must be correct. To avoid delays, it’s recommended to open two browser tabs beforehand, complete the settings in each, and apply them in sequence.

Refer to the following two images for guidance. You need to do the following:

  • Set the Startup Command: Specify the path to the script you deployed:
    bash /home/site/wwwroot/start.sh
  • Set Two App Settings:
    • WEBSITES_CONTAINER_START_TIME_LIMIT=600 The value is in seconds, ensuring the Startup Command can continue execution beyond the default timeout of 230 seconds. This tutorial’s Startup Command typically takes around 300 seconds, so setting it to 600 seconds provides a safety margin and accommodates future project expansion (e.g., adding more packages).
    • WEBSITES_ENABLE_APP_SERVICE_STORAGE=1 This setting is required to enable the App Service storage feature, which is necessary for using web jobs (e.g., for model training).
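For readers who prefer scripting over the Portal, the same two changes could in principle be applied with the Azure CLI. The sketch below only builds and prints the commands; it does not run them. The resource group name my-resource-group is a placeholder, and the startup command is issued before the app settings.

```python
import shlex

APP_SETTINGS = {
    "WEBSITES_CONTAINER_START_TIME_LIMIT": "600",
    "WEBSITES_ENABLE_APP_SERVICE_STORAGE": "1",
}
STARTUP_COMMAND = "bash /home/site/wwwroot/start.sh"


def build_cli_commands(resource_group, app_name):
    """Return the az argument lists: startup command first, then app settings."""
    startup = [
        "az", "webapp", "config", "set",
        "--resource-group", resource_group,
        "--name", app_name,
        "--startup-file", STARTUP_COMMAND,
    ]
    settings = [
        "az", "webapp", "config", "appsettings", "set",
        "--resource-group", resource_group,
        "--name", app_name,
        "--settings",
        *[f"{key}={value}" for key, value in APP_SETTINGS.items()],
    ]
    return [startup, settings]


# "my-resource-group" is a placeholder; use your own resource group and app name.
commands = build_cli_commands("my-resource-group", "scikit-learn-portal-app")
for argv in commands:
    print(shlex.join(argv))
```

Each printed line can be pasted into a shell where the Azure CLI is installed and logged in, or the argument lists can be passed to subprocess.run.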

Step-by-Step Process:

  1. In the first tab, stop before clicking Continue. Switch to the second tab and set up all the app settings.

  2. In the second tab, apply all app settings, then switch back to the first tab.

  3. Click Continue in the first tab and wait several seconds for the operation to complete.

  4. Once it completes, switch to the second tab and click Continue within 5 seconds. Clicking promptly after the previous step is what finishes all the settings.

After completing the configuration, wait for about 10 minutes for the settings to take effect. Then, navigate to the WebJobs section in the Azure Portal and upload the ZIP file mentioned in the earlier sections. Set its trigger type to Manual.

At this point, the entire deployment process is complete. For future code updates, you only need to redeploy from VSCode; there is no need to reconfigure settings in the Azure Portal.

6. Running on Azure Web App

Training the Model

Go to the Azure Portal, locate your App Service, and navigate to the WebJobs section. Click on Start to initiate the job and wait for the results. During this process, you may need to manually refresh the page to check the status of the job execution. Refer to the image below for guidance.

Once you see the model report in the Logs, it indicates that the model training is complete, and the Flask app is ready for predictions.

You can also find the newly trained model in the File Share mounted in your local environment.

Using the Model for Prediction

Just like in local testing, open a PowerShell terminal and use the following curl commands to send requests to the app:

# Note: Replace both instances of scikit-learn-portal-app with the name of your web app.
curl -X GET https://scikit-learn-portal-app.azurewebsites.net/api/detect -H "Content-Type: application/json" -d '{"info": [63, 1, 3, 145, 233, 1, 0, 150, 0, 2.3, 0, 0, 1]}'
curl -X GET https://scikit-learn-portal-app.azurewebsites.net/api/detect -H "Content-Type: application/json" -d '{"info": [63, 1, 3, 305, 233, 1, 0, 150, 0, 2.3, 0, 0, 1]}'

As with the local environment, you should see the expected results.

7. Troubleshooting

Missing Environment Variables After Deployment

  • Symptom: Even after setting values in App Settings (e.g., WEBSITES_CONTAINER_START_TIME_LIMIT), they do not take effect.
  • Cause: App Settings (e.g., WEBSITES_CONTAINER_START_TIME_LIMIT, WEBSITES_ENABLE_APP_SERVICE_STORAGE) are reset after updating the startup command.
  • Resolution: Use Azure CLI or the Azure Portal to reapply the App Settings after deployment. Alternatively, set the startup command first, and then apply app settings.
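One way to confirm whether the settings survived is to inspect them from inside the app. The sketch below assumes that App Settings are visible to the application process as environment variables, and the expected values are the ones used in this tutorial; the function name is made up for this example.

```python
import os

EXPECTED_SETTINGS = {
    "WEBSITES_CONTAINER_START_TIME_LIMIT": "600",
    "WEBSITES_ENABLE_APP_SERVICE_STORAGE": "1",
}


def missing_or_changed(environ=None):
    """Return {name: actual value} for settings that differ from the expected ones."""
    environ = os.environ if environ is None else environ
    problems = {}
    for name, expected in EXPECTED_SETTINGS.items():
        actual = environ.get(name)  # None when the variable is absent
        if actual != expected:
            problems[name] = actual
    return problems


# An empty dict means every expected setting is present with the right value.
print(missing_or_changed())
```

Logging this result at startup makes a silent reset visible in the application logs instead of surfacing later as a timeout.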

Virtual Environment Resource Lock Issues

  • Symptom: The app fails to redeploy, even though no configuration or code changes were made.
  • Cause: The virtual environment folder cannot be deleted due to active resource locks from the previous process. Files or processes from the previous virtual environment session remain locked.
  • Resolution: Deactivate processes before deletion and use unique epoch-based folder names to avoid conflicts. Refer to scikit-learn/start.sh in this tutorial for implementation.
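The epoch-based naming idea can be illustrated in Python, although the tutorial's own implementation is the shell script start.sh and may differ in detail. In this sketch the venv- prefix and both function names are hypothetical: each start gets a fresh folder named after the current epoch, and older folders are removed on a best-effort basis so that a locked folder never blocks the new one.

```python
import shutil
import tempfile
import time
from pathlib import Path

VENV_PREFIX = "venv-"  # hypothetical naming scheme for this sketch


def new_venv_path(base_dir, now=None):
    """Return a folder path that embeds the current epoch, unique per start."""
    epoch = int(time.time() if now is None else now)
    return Path(base_dir) / f"{VENV_PREFIX}{epoch}"


def remove_stale_venvs(base_dir, keep):
    """Try to delete older venv folders; return the names actually removed."""
    removed = []
    for candidate in sorted(Path(base_dir).glob(f"{VENV_PREFIX}*")):
        if candidate.is_dir() and candidate != keep:
            # Best effort: a folder that is still locked is simply left behind.
            shutil.rmtree(candidate, ignore_errors=True)
            if not candidate.exists():
                removed.append(candidate.name)
    return removed


with tempfile.TemporaryDirectory() as tmp:
    old = new_venv_path(tmp, now=1700000000)
    old.mkdir()
    current = new_venv_path(tmp, now=1700000300)
    current.mkdir()
    removed = remove_stale_venvs(tmp, keep=current)
    remaining = sorted(path.name for path in Path(tmp).iterdir())
    print(removed, remaining)
```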

Package Version Dependency Issues

  • Symptom: Conflicts occur between package versions specified in requirements.txt and the versions required by the Python environment. This results in errors during installation or runtime.
  • Cause: Azure deployment environments enforce specific versions of Python and pre-installed packages, leading to mismatches when older or newer versions are explicitly defined. Additionally, the read-only file system in Azure App Service prevents modifying global packages like typing-extensions.
  • Resolution: Pin compatible dependency versions. For example, follow the instructions for installing scikit-learn from the scikit-learn 1.5.2 documentation. Refer to scikit-learn/requirements.txt in this tutorial.
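When diagnosing such conflicts, it can help to compare the exact pins in requirements.txt with what is actually installed. The sketch below is a simplified check that handles only name==version lines and needs Python 3.8 or later for importlib.metadata. The requirement lines in the example are illustrative and are not copied from the tutorial's requirements.txt.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Extract {package: version} from 'name==version' lines, ignoring the rest."""
    pins = {}
    for raw in requirements_text.splitlines():
        # Drop comments and environment markers before looking for a pin.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {package: (pinned, installed)} where the two differ."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


# Illustrative requirement lines only.
example = "scikit-learn==1.5.2  # illustrative pin\nnot-a-real-package-xyz==0.0.1\n"
pins = parse_pins(example)
print(pins)
print(find_mismatches(pins))
```

Running this inside the activated virtual environment on the App Service shows at a glance which pins were not honored.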

Default Binding

  • Symptom: Despite setting the WEBSITES_PORT parameter in App Settings to match the port Flask listens on (e.g., Flask's default 5000), the deployment still fails.
  • Cause: The Flask framework's default settings are not overridden to bind to 0.0.0.0 or the required port.
  • Resolution: Explicitly bind Flask to 0.0.0.0:8000 in app.py. To avoid additional issues, it’s recommended to use the Azure Python Linux Web App's default port (8000), as this minimizes the need for extra configuration.
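A minimal sketch of such a binding is shown below. It is not the repository's app.py: the predict_stub function stands in for a loaded model, and the JSON response shape is invented for this example. Only the route /api/detect, the "info" input field, and the 0.0.0.0:8000 binding come from this tutorial.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

HOST = "0.0.0.0"  # listen on all interfaces, not just loopback
PORT = 8000       # default port for Azure Python Linux Web Apps


def predict_stub(features):
    """Placeholder standing in for a loaded model's predict call."""
    return 0


@app.route("/api/detect", methods=["GET"])
def detect():
    payload = request.get_json(force=True, silent=True) or {}
    features = payload.get("info")
    if not isinstance(features, list) or len(features) != 13:
        return jsonify({"error": "expected 13 values in 'info'"}), 400
    # Hypothetical response shape for this sketch.
    return jsonify({"prediction": predict_stub(features)})


def main():
    """Start the development server with the explicit host and port."""
    app.run(host=HOST, port=PORT)
```

In a real app.py, main() would be called under an if __name__ == "__main__": guard. Without the explicit host, Flask's development server listens only on 127.0.0.1, which is not reachable from outside the container.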

Missing System Commands in Restricted Environments

  • Symptom: In the WebJobs log, an error is logged stating that the ls command is missing.
  • Cause: This typically occurs in minimal environments, such as Azure App Services, containers, or highly restricted shells.
  • Resolution: Use predefined paths or variables in the script instead of relying on system commands. Refer to scikit-learn/webjob/train_heart_disease_model.sh in this tutorial for an example of handling such cases.
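The same principle applies to Python code that runs in a web job: check for an external command before relying on it, or avoid the command altogether. The sketch below is a generic illustration with made-up helper names, not an excerpt from the tutorial's shell script.

```python
import os
import shutil
import tempfile


def missing_commands(names):
    """Return the commands from names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def list_directory(path):
    """List a directory without shelling out to an external 'ls' binary."""
    return sorted(os.listdir(path))


print(missing_commands(["ls", "definitely-not-a-real-command-xyz"]))

with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "model.txt"), "w").close()
    listing = list_directory(tmp)
    print(listing)  # ['model.txt']
```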

8. Conclusion

Azure App Service, while being a PaaS product with less flexibility compared to a VM, still offers several powerful features that allow us to fully leverage the benefits of AI frameworks. For example, the resource-intensive model training phase can be offloaded to a high-performance local machine. This approach enables the App Service to focus solely on loading models and serving predictions. Additionally, if the training dataset is frequently updated, we can configure WebJobs with scheduled triggers to retrain the model periodically, ensuring the prediction service always uses the latest version. These capabilities make Azure App Service well-suited for most business scenarios.

9. References

 

Updated Nov 22, 2024
Version 7.0