Forum Discussion

mohitfoundazure
Copper Contributor
Aug 10, 2023

Build pipeline failure: Directory error

Hi,

I am facing an error while saving a trained model in the build pipeline. My model training script uses Azure ML v1.

Here is the error from the build pipeline. The path shown below is not a directory, yet opening it raises an IsADirectoryError, which makes the actual issue hard to understand.

Model Score :  0.9333333333333333

Saving file: /mnt/azureml/cr/j/36bcb34581c24a669f6c9ca86580cd76/exe/wd/iris_model.pkl
Cleaning up all outstanding Run operations, waiting 300.0 seconds
1 items cleaning up...
Cleanup took 0.12509465217590332 seconds
Traceback (most recent call last):
  File "training.py", line 226, in <module>
    iris_classifier.create_pipeline()
  File "training.py", line 115, in create_pipeline
    with open(filePath, 'wb') as file:
IsADirectoryError: [Errno 21] Is a directory: '/mnt/azureml/cr/j/36bcb34581c24a669f6c9ca86580cd76/exe/wd/iris_model.pkl'

 

 

The code that saves the trained model runs on Python 3.9.

 

Thanks

1 Reply

  • Consider this:

     

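    1. Check if the Path Already Exists as a Directory

    The error means something already created a directory at the exact path you are trying to open as a file. Before saving, add a check:

    import os
    import pickle
    import shutil

    file_path = '/mnt/azureml/.../iris_model.pkl'

    # If a directory exists with the same name, remove it and its contents.
    # os.rmdir() only removes empty directories; shutil.rmtree() handles both.
    if os.path.isdir(file_path):
        shutil.rmtree(file_path)

    # Now save the model (`model` is your trained estimator)
    with open(file_path, 'wb') as file:
        pickle.dump(model, file)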

     

    2. Use a Dedicated Output Directory

    Instead of saving directly into the working directory, create a subfolder:

    import os
    import pickle

    output_dir = os.path.join(os.getcwd(), 'outputs')
    os.makedirs(output_dir, exist_ok=True)

    file_path = os.path.join(output_dir, 'iris_model.pkl')
    with open(file_path, 'wb') as file:
        pickle.dump(model, file)

    This avoids naming collisions and aligns with Azure ML’s best practices: files written to ./outputs are uploaded to the run record automatically.

    3. Avoid Hardcoding Paths

    Use environment variables or Run.get_context() to get the correct output path at run time instead of writing it into the script:

    from azureml.core.run import Run
    run = Run.get_context()
    # Only works if the job defines an output named 'model_output'
    output_dir = run.output_datasets['model_output'].path
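
The three suggestions above can be combined into one small helper. This is a minimal sketch using only the standard library, not the original poster's code. `resolve_output_dir`, `save_model`, and the `MODEL_OUTPUT_DIR` environment variable are hypothetical names chosen for illustration, and a temporary directory stands in for the job's working directory.

```python
import os
import pickle
import tempfile


def resolve_output_dir(base_dir):
    """Use MODEL_OUTPUT_DIR (hypothetical variable) if set, else base_dir/outputs."""
    return os.environ.get("MODEL_OUTPUT_DIR") or os.path.join(base_dir, "outputs")


def save_model(model, output_dir, file_name="iris_model.pkl"):
    """Pickle `model` to output_dir/file_name and return the path written."""
    os.makedirs(output_dir, exist_ok=True)
    file_path = os.path.join(output_dir, file_name)
    # Opening a directory for writing fails with errno 21, so check first
    # and report what the directory contains to make the cause visible.
    if os.path.isdir(file_path):
        raise IsADirectoryError(
            f"{file_path} is a directory containing {os.listdir(file_path)}"
        )
    with open(file_path, "wb") as file:
        pickle.dump(model, file)
    return file_path


# Reproduce the collision: a directory named like the model file already
# exists in the (stand-in) working directory.
work_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(work_dir, "iris_model.pkl"))
try:
    save_model({"demo": 1}, work_dir)
    collided = False
except IsADirectoryError:
    collided = True

# Saving under a dedicated output folder sidesteps the collision.
saved_path = save_model({"demo": 1}, resolve_output_dir(work_dir))
```

If the explicit check fires in your pipeline, the listed contents usually show what created the directory, for example an earlier step that treated the model path as a folder.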

     
