Azure PaaS Blog

Install Python on a Windows node using a start task with Azure Batch

carlosbermudezlopez
May 16, 2021

Customers often ask the Azure Batch team for instructions on how to install Python using the start task feature. This post walks through the steps in case you need to do the same.


Required steps:

  • Get the Python release that you want to install.
  • Add the installer to a storage container.
  • Create the Windows pool and define the start task and the resource files needed to run the installer.

Get the Python release that you want to install.
 

First, download the Python installer from the official site (the example below uses version 3.8.10 for Windows, matching the installer file name in the start task command line):
Download Python | Python.org

Add the installer to a storage container.

The next step is to upload the installer to a Storage blob container. Azure Batch will download this installer to the compute node using the resource files feature in a later step.
 

Select your storage account, create a new container or select an existing one, and then upload the installer.
 

 

Create a Windows Pool, and define the required start task and resource files.
 

The next step is to create the Windows Batch pool. Fill in the required pool fields and enable the Start task section.
 

 

A start task runs on each compute node as it joins the pool, so the task executes whenever a node is added to the pool or restarted.

Once the start task configuration is enabled, define the command line for the installation. This example uses the following command line: cmd /c "python-3.8.10-amd64.exe /quiet InstallAllUsers=1 PrependPath=1 Include_test=0"

The full list of installer options that you can pass on the command line is described in the Python documentation linked below.


3. Using Python on Windows — Python 3.9.5 documentation

It is important to set the User Identity to "Pool Autouser, Admin" so that the start task runs with administrative privileges.
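If you script pool creation rather than using the portal, the same two settings (the command line and the elevated pool auto-user) can be written down as data. The following is only a sketch: the field names reflect my understanding of the start task JSON shape in the Batch REST API and should be checked against the current API reference. `build_install_command` is a hypothetical helper, and `waitForSuccess` is an assumption added for illustration rather than a setting from the steps above.

```python
import json


def build_install_command(installer: str, options: dict) -> str:
    """Compose the silent-install command line for the start task.

    Hypothetical helper: it only joins the installer name, /quiet,
    and the name=value options into one cmd /c invocation.
    """
    flags = " ".join(f"{name}={value}" for name, value in options.items())
    return f'cmd /c "{installer} /quiet {flags}"'


command_line = build_install_command(
    "python-3.8.10-amd64.exe",
    {"InstallAllUsers": 1, "PrependPath": 1, "Include_test": 0},
)

# Start task fragment of a pool definition (assumed REST-style field names).
start_task = {
    "commandLine": command_line,
    # Equivalent of "Pool Autouser, Admin" in the portal.
    "userIdentity": {"autoUser": {"scope": "pool", "elevationLevel": "admin"}},
    # Assumption: hold the node back from running tasks until the install succeeds.
    "waitForSuccess": True,
}

print(json.dumps(start_task, indent=2))
```

The installer file name in the command line has to match the name of the resource file that is downloaded to the node, so it is worth keeping both in one place when scripting.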

 

 

Additionally, we need to define the resource files that Azure Batch downloads to the compute node before running the command line. Click the resource files option and then click “pick storage blob”. An interactive window opens where you can browse your storage account and select the installer.
 

Important: Check the public access level of the blob container. If you leave it as Private (no anonymous access), as in the example above, you must select “Include SAS key” when you select the resource file; otherwise, the download will fail with authentication errors. If the access level is set to Blob or Container, the blob is publicly readable.

If you are using Private access, you must check Include SAS and set an expiration date before adding the resource files.
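To make the private versus public distinction concrete, here is a small sketch of how a resource file entry might look in each case when defined as data. The storage account name, container name, and SAS token are placeholders, `with_sas` and `resource_file` are hypothetical helpers, and the `httpUrl` and `filePath` field names are my assumption of the REST-style shape, so verify them against the API reference before relying on them.

```python
from urllib.parse import urlsplit, urlunsplit


def with_sas(blob_url: str, sas_token: str) -> str:
    """Append a SAS token (a query string) to a blob URL."""
    parts = urlsplit(blob_url)
    token = sas_token.lstrip("?")
    query = f"{parts.query}&{token}" if parts.query else token
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


def resource_file(blob_url: str, file_path: str, sas_token: str = "") -> dict:
    """Build one resource file entry; a SAS token is needed only for private containers."""
    url = with_sas(blob_url, sas_token) if sas_token else blob_url
    return {"httpUrl": url, "filePath": file_path}


# Placeholder values, not real resources.
blob_url = "https://examplestorage.blob.core.windows.net/installers/python-3.8.10-amd64.exe"
sas_token = "?sv=PLACEHOLDER&sig=PLACEHOLDER"

private_entry = resource_file(blob_url, "python-3.8.10-amd64.exe", sas_token)
public_entry = resource_file(blob_url, "python-3.8.10-amd64.exe")

print(private_entry["httpUrl"])
print(public_entry["httpUrl"])
```

A SAS token carries its own expiration date, so a pool that lives longer than the token will start failing the download when nodes are added or restarted later.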
 

 

Finally, once the VM node is ready and the start task has finished, you can connect to the node using RDP and confirm that Python is now installed.
 

 

You can execute the following command to confirm that the correct Python version is installed.
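One common way to check is to run `python --version` in a command prompt on the node. The sketch below wraps the same check in a script, assuming `python` resolves through the PATH that the installer updated. `reported_version` and `is_expected` are hypothetical helpers, and the expected version string is whatever release you uploaded.

```python
import subprocess
import sys


def reported_version(executable: str = "python") -> str:
    """Return the line printed by '<executable> --version'."""
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True, check=True
    )
    # Python 3 prints the version to stdout; fall back to stderr just in case.
    return (result.stdout or result.stderr).strip()


def is_expected(version_line: str, expected: str) -> bool:
    """Compare the reported line with the release that was uploaded."""
    return version_line == f"Python {expected}"


# sys.executable is used here so the sketch runs anywhere; on the node,
# call reported_version() with no argument to resolve python through PATH.
line = reported_version(sys.executable)
print(line)
print(is_expected(line, "3.8.10"))
```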
 

Updated May 12, 2021
Version 1.0
  • Gunanidhi
    Copper Contributor

    Hey Guys,

    Is this article applicable to the current version of Azure Batch? I have followed all the steps, but my node is not coming up. It waits for the task to complete, and after 10 minutes it moves to the idle state. Any help will be appreciated.

  • unexpectederror
    Copper Contributor

    Alternative: put all commands in an install.bat file and call cmd /c "install.bat".

  • Kev74
    Copper Contributor

    minalghule Hello, you need to write only one command line for a task.

     

    Only one cmd /c per row. Don't write two or more cmd /c in command line.

     

    Example:

     

    cmd /c "python-3.12.0-amd64.exe /quiet InstallAllUsers=1 PrependPath=1 Include_test=0"
    cmd /c "curl -fSsL https://bootstrap.pypa.io/get-pip.py | python"
    cmd /c "pip install azure-storage-blob"
    cmd /c "pip install pandas" 

     

    This task will only complete the first row: cmd /c "python-3.12.0-amd64.exe /quiet InstallAllUsers=1 PrependPath=1 Include_test=0". The other rows are ignored.

     

    So if you want to install libraries afterwards, I advise you to create one task that installs Python: cmd /c "python-3.12.0-amd64.exe /quiet InstallAllUsers=1 PrependPath=1 Include_test=0"

    And another task that installs the libraries: cmd /c "pip install pandas flask holidays"

  • shivam50
    Copper Contributor

    Hey guys, I am currently facing an issue. I have built an ETL pipeline in Azure Data Factory where I use a blob trigger to activate the custom task on the Batch account. I have set up a start task for the nodes where I install Python, pip, and other libraries. Python and pip get installed, but the packages don't, and the code throws an error saying a package is missing. I have tried using requirements.txt and also whl files for the packages, but had no luck. Can anyone help me with this issue? Thank you in advance.

     
     

     


    The image above is a screenshot of the start task command.
    I have checked that pip and Python get installed by logging in to the VD (virtual desktop) and trying commands in cmd; however, the Python packages don't get installed. Even after installing them in the VD, I can't run the code because it shows the same error. As far as I know, a new instance is created every time the job runs, so I am guessing it uses a new instance every time.