
Azure Infrastructure Blog

Building an End-to-End MLOps Pipeline: From Training to Managed Endpoints on Azure

Gapandey
Microsoft
Apr 09, 2026

Introduction

Machine learning models are only as valuable as the infrastructure that supports them. A model trained in a Jupyter notebook and saved to a shared folder creates a chain of problems: no versioning, no reproducibility, no clear ownership, and no automated path to production. When the data scientist who trained it goes on vacation, nobody knows how to retrain it or where the latest version lives.

A well-designed MLOps pipeline solves all of this. It makes training repeatable, artifacts versioned, and deployment automated — so that the path from code change to live endpoint is a single merge to main.

This post provides a generic, end-to-end pattern covering the full lifecycle:

  1. Train a scikit-learn model against data in Azure Blob Storage
  2. Serialize the model as a self-contained pickle bundle
  3. Register it in an Azure ML Registry for cross-team discovery
  4. Deploy it to an Azure ML Managed Online Endpoint for real-time scoring

You can adapt this template for any scikit-learn model — classification, regression, clustering, or anomaly detection — by swapping in your own training and scoring scripts.

When to Use This Pattern

This pipeline template is a good fit when:

  • Your training data lives in Azure Blob Storage (Parquet, CSV, or similar)
  • You use scikit-learn (or any Python ML framework) for model training
  • You need versioned model artifacts in a central registry
  • You want an automated deployment path to a live scoring endpoint
  • Downstream consumers (scoring pipelines, APIs, dashboards) need a reliable handoff mechanism
  • You want to eliminate ad-hoc notebook-based training with no versioning or reproducibility

It is not the right fit if you need distributed training (use Azure ML pipelines instead), or if your model requires GPU inference (managed endpoints support GPU, but the config differs from what's shown here).

Architecture Overview

The pipeline follows a four-stage flow:

DevOps Gate → Train & Publish Artifact → Register in ML Registry → Deploy to Managed Endpoint

  1. DevOps Stage — A required gate that logs the build number and validates the pipeline is running.
  2. Train Stage — Installs Python dependencies, runs the training script against data in Azure Blob Storage, and publishes the pickle bundle as a pipeline artifact.
  3. Register Stage — Downloads the artifact and registers it in an Azure ML Registry with automatic versioning.
  4. Deploy Stage — Creates (or updates) a Managed Online Endpoint and deploys the newly registered model version to it for real-time scoring.

The first three stages run on every push to main. The Deploy stage can be gated with a manual approval if you want human review before going live.

The Training Script

The training script is the core of this pipeline — everything else is orchestration around it. It's a standalone Python CLI that you should be able to run locally before it ever touches a pipeline.

The general shape is:

  1. Load data from Azure Blob Storage (Parquet, CSV, etc.) using libraries like adlfs and pyarrow.
  2. Validate the schema — check that expected columns exist, types are correct, and there are enough rows to train on. Fail fast with a clear error message if not.
  3. Engineer features — compute derived columns, handle missing values, encode categorical variables. This is where most of the domain-specific logic lives.
  4. Train the model using scikit-learn (or your framework of choice).
  5. Apply preprocessing (e.g., StandardScaler) and save the preprocessor alongside the model so that scoring uses the exact same transformations.
  6. Serialize a bundle containing the model, preprocessor, feature column order, and training metadata into a single pickle file.

The script reads storage credentials from environment variables, keeping secrets out of the codebase entirely. It accepts an --output-path argument and writes the serialized bundle to that location — which the pipeline later publishes as an artifact.
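The six steps above can be sketched as a minimal training script. Everything specific here is illustrative rather than from a real pipeline: the feature names, the IsolationForest choice, the 100-row threshold, and the blob path placeholder are all assumptions to swap for your own.

```python
# train_model.py: a minimal sketch of the training-script shape described above.
import argparse
import os
import pickle
from datetime import datetime, timezone

import pandas as pd
import sklearn
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

FEATURES = ["amount", "quantity"]  # hypothetical feature set

def load_data() -> pd.DataFrame:
    """Step 1: load Parquet from Blob Storage; credentials come from env vars."""
    account = os.environ["AZURE_STORAGE_ACCOUNT_NAME"]  # fail fast if unset
    key = os.environ["AZURE_STORAGE_ACCOUNT_KEY"]
    return pd.read_parquet(
        "abfs://<container>/<path>/data.parquet",  # placeholder path
        storage_options={"account_name": account, "account_key": key},
    )

def validate(df: pd.DataFrame) -> pd.DataFrame:
    """Step 2: schema check with a clear error message."""
    missing = [c for c in FEATURES if c not in df.columns]
    if missing:
        raise ValueError(f"Missing expected columns: {missing}")
    clean = df.dropna(subset=FEATURES)
    if len(clean) < 100:  # illustrative minimum
        raise ValueError(f"Not enough rows to train on: {len(clean)}")
    return clean

def train(df: pd.DataFrame) -> dict:
    """Steps 3-6: preprocess, fit, and assemble the bundle."""
    clean = validate(df)
    scaler = StandardScaler().fit(clean[FEATURES])
    model = IsolationForest(random_state=42).fit(scaler.transform(clean[FEATURES]))
    return {
        "model": model,
        "scaler": scaler,
        "feature_order": FEATURES,
        "metadata": {
            "trained_at": datetime.now(timezone.utc).isoformat(),
            "source_rows": len(df),
            "clean_rows": len(clean),
            "scikit_learn_version": sklearn.__version__,
        },
    }

def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--output-path", required=True)
    args = parser.parse_args(argv)
    with open(args.output_path, "wb") as f:
        pickle.dump(train(load_data()), f)

# Entry point (add under `if __name__ == "__main__":` in the real script):
#   python train_model.py --output-path "$(Build.ArtifactStagingDirectory)/model_bundle.pkl"
```

Because load, validate, and train are separate functions, you can unit-test the model logic on a synthetic DataFrame without touching storage at all.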

What Goes in the Bundle

The pickle file isn't just the model — it's a self-contained scoring contract. Here's what's inside and why:

  • model (scikit-learn estimator): the trained model (e.g., IsolationForest, RandomForestClassifier)
  • scaler (StandardScaler or similar): the exact preprocessor fitted on training data; scoring must use the same transform
  • feature_order (list[str]): column names in the exact order the model expects; prevents silent column-reordering bugs
  • metadata.trained_at (ISO timestamp): when the model was trained; useful for debugging stale predictions
  • metadata.source_rows (int): how many rows were in the raw data; helps detect data pipeline issues
  • metadata.clean_rows (int): how many rows survived cleaning; a sudden drop signals a data quality problem
  • metadata.scikit_learn_version (str): the scikit-learn version used; pickle compatibility can break across major versions

This structure means any consumer can load the bundle, inspect what's in it, and score new data without knowing anything about how the model was trained.

Choosing a Serialization Format

This template uses pickle, but you should choose based on your needs:

  • pickle: best for bundles with metadata (model + scaler + feature order + config). Built-in, no extra dependencies, but not safe to load from untrusted sources.
  • joblib: best for large NumPy-array-heavy models. Faster for large arrays, but adds a dependency.
  • ONNX: best for cross-framework interop (PyTorch ↔ scikit-learn). Portable, but not all model types are supported.

Pickle works well when your artifact is a self-contained bundle — model, preprocessor, feature column order, and training metadata in one file. Any consumer who loads it gets everything needed to score new data correctly.

Security note: Never load pickle files from untrusted sources — deserialization can execute arbitrary code. This is safe when the pickle is produced by your own pipeline and stored in an access-controlled registry, but always validate provenance.
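A defensive loader makes both checks concrete: it rejects files that don't have the bundle's expected keys, and it warns when the installed scikit-learn differs from the version recorded at training time. This is a sketch; the key set matches the bundle table above, and the major.minor comparison is one reasonable policy, not the only one.

```python
# Load a bundle produced by our own pipeline, with sanity checks.
import pickle
import warnings

import sklearn

REQUIRED_KEYS = {"model", "scaler", "feature_order", "metadata"}

def load_bundle(path: str) -> dict:
    with open(path, "rb") as f:
        bundle = pickle.load(f)  # only for files your own pipeline produced
    missing = REQUIRED_KEYS - set(bundle)
    if missing:
        raise ValueError(f"Bundle is missing keys: {sorted(missing)}")
    trained = bundle["metadata"].get("scikit_learn_version", "unknown")
    # Compare major.minor; pickle compatibility is not guaranteed across versions
    if trained.split(".")[:2] != sklearn.__version__.split(".")[:2]:
        warnings.warn(
            f"Bundle trained with scikit-learn {trained}, "
            f"but {sklearn.__version__} is installed; retrain or pin versions."
        )
    return bundle
```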

The Pipeline YAML

Here's the full pipeline template. Replace <your-...> placeholders with your values:

trigger:
  branches:
    include:
      - main
  paths:
    include:
      - <your-model-source-path>/*       # e.g., src/models/anomaly-detection/*

stages:
  - stage: DevOps
    displayName: Required DevOps Stage
    jobs:
      - job: Echo
        steps:
          - script: echo build initiated - $(Build.BuildNumber)

  - stage: Train
    dependsOn: DevOps
    displayName: 'Train Model & Publish Artifact'
    jobs:
      - job: TrainModel
        steps:
          - checkout: self

          - task: UsePythonVersion@0
            inputs:
              versionSpec: '3.12'         # Use a supported Python version

          - script: |
              python -m pip install --upgrade pip
              pip install -r requirements.txt
            displayName: 'Install Python dependencies'

          - script: |
              python <your-training-script>.py \
                --output-path "$(Build.ArtifactStagingDirectory)/model_bundle.pkl"
            displayName: 'Train model'
            env:
              AZURE_STORAGE_ACCOUNT_NAME: $(AZURE_STORAGE_ACCOUNT_NAME)
              AZURE_STORAGE_ACCOUNT_KEY: $(AZURE_STORAGE_ACCOUNT_KEY)   # See note on Managed Identity below

          - task: PublishPipelineArtifact@1                              # Use the modern task
            inputs:
              artifactName: 'model-pkl'
              targetPath: '$(Build.ArtifactStagingDirectory)/model_bundle.pkl'

  - stage: Register
    dependsOn: Train
    displayName: 'Register Model in ML Registry'
    jobs:
      - job: RegisterModel
        steps:
          - task: DownloadPipelineArtifact@2                            # Use the modern task
            inputs:
              artifactName: 'model-pkl'
              targetPath: '$(System.ArtifactsDirectory)/model-pkl'

          - task: AzureCLI@2
            displayName: 'Register model in ML Registry'
            inputs:
              azureSubscription: '<your-service-connection>'
              scriptType: 'ps'
              scriptLocation: 'inlineScript'
              inlineScript: |
                az extension add -n ml --yes
                az ml model create `
                  --name <your-model-name> `
                  --path "$(System.ArtifactsDirectory)/model-pkl/model_bundle.pkl" `
                  --type custom_model `
                  --registry-name <your-ml-registry> `
                  --resource-group <your-resource-group>

Placeholder Reference

  • <your-model-source-path>: path to your model code in the repo (e.g., src/models/anomaly-detection)
  • <your-training-script>: your Python training script (e.g., train_model.py)
  • <your-service-connection>: Azure DevOps service connection name (e.g., prod-ml-connection)
  • <your-model-name>: name for the model in the registry (e.g., sales-anomaly-detector)
  • <your-ml-registry>: Azure ML Registry name (e.g., contoso-ml-registry)
  • <your-resource-group>: resource group containing the registry (e.g., rg-ml-prod)

Key Design Decisions

Credentials as environment variables — Storage credentials are stored in an Azure DevOps variable group and injected via the env: block. They never appear on the command line or in logs.

Prefer Managed Identity over keys. The template above shows AZURE_STORAGE_ACCOUNT_KEY for simplicity, but the recommended approach is to authenticate using a User Managed Identity (UMI) with the Storage Blob Data Reader role. This eliminates key rotation and reduces the credential surface. If your agent supports Managed Identity (e.g., self-hosted on an Azure VM), use DefaultAzureCredential in your training script instead of account keys.
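One way to keep the training script working under both auth modes is to build the pandas/adlfs storage_options from whatever the environment provides. This is a sketch: the function name is illustrative, and it relies on adlfs's documented behavior of falling back to DefaultAzureCredential when no key is supplied and anon is False.

```python
# Choose blob auth at runtime: account key if present, Managed Identity otherwise.
import os

def blob_storage_options(account_name: str) -> dict:
    """Build storage_options for pd.read_parquet("abfs://...")."""
    key = os.environ.get("AZURE_STORAGE_ACCOUNT_KEY")
    if key:
        # Key-based auth: simple, but the key must be rotated
        return {"account_name": account_name, "account_key": key}
    # No key present: adlfs resolves credentials via DefaultAzureCredential
    return {"account_name": account_name, "anon": False}
```

Usage in the training script: `pd.read_parquet("abfs://<container>/<path>", storage_options=blob_storage_options(account))`.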

Separate Train and Register stages — The training artifact is published as a pipeline artifact between stages. This means if registration fails, you don't have to retrain. It also gives you a downloadable artifact in Azure DevOps for debugging.

az ml model create with --registry-name — This registers the model in an Azure ML Registry (not a workspace). Registries are shared across workspaces and teams, making the model accessible to anyone with the right permissions.

Auto-versioning — Each az ml model create call with the same --name automatically increments the version number in the registry. No manual version management needed.
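Consumers often want "the newest version" rather than a hard-coded number. A small helper can resolve it; here `client` is assumed to be an azure-ai-ml MLClient built against the registry (e.g., `MLClient(DefaultAzureCredential(), registry_name="<your-ml-registry>")`), and only the `client.models.list(name=...)` shape is assumed, so any object with that interface works.

```python
# Resolve the highest auto-assigned version of a registered model.
def latest_model_version(client, name: str) -> int:
    """Return the highest version number registered under `name`."""
    versions = [int(m.version) for m in client.models.list(name=name)]
    if not versions:
        raise LookupError(f"No versions registered under '{name}'")
    return max(versions)
```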

Permissions

The pipeline authenticates using a User Managed Identity (UMI) linked to an Azure DevOps service connection via workload identity federation. The UMI needs:

  • Storage Blob Data Reader (scope: storage account or container): read training data
  • AzureML Registry User (scope: ML Registry): register model artifacts
  • AzureML Data Scientist (scope: ML Workspace): create/update managed endpoints and deployments

No Contributor or Owner access at the subscription or resource group level is required. Least-privilege access keeps the blast radius small.

Workload Identity Federation vs. secrets: If your Azure DevOps service connection uses workload identity federation (recommended), the UMI authenticates without any stored secrets. If using a service principal with client secret instead, store the secret in an Azure DevOps variable group marked as secret, and rotate it regularly.

Common Pitfalls

These are issues you'll likely hit when adapting this template:

Column name mismatches. Parquet files may have column names like periodid while your script expects Period ID. Add a case-insensitive column rename mapping in your training script and validate the data schema before training starts.
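A sketch of that rename-and-validate step; the expected column names here are hypothetical stand-ins for whatever your model requires:

```python
# Normalize column names case/spacing-insensitively, then validate the schema.
import pandas as pd

EXPECTED = {"periodid": "Period ID", "storeid": "Store ID"}  # hypothetical mapping

def normalize_columns(df: pd.DataFrame) -> pd.DataFrame:
    def key(c: str) -> str:
        return c.lower().replace(" ", "").replace("_", "")
    renames = {c: EXPECTED[key(c)] for c in df.columns if key(c) in EXPECTED}
    out = df.rename(columns=renames)
    missing = sorted(set(EXPECTED.values()) - set(out.columns))
    if missing:
        raise ValueError(f"Schema validation failed; missing columns: {missing}")
    return out
```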

Windows agents use cmd.exe, not bash. If your pipeline runs on self-hosted Windows agents, backslash line continuations and bash-style commands won't work. Use single-line commands or PowerShell syntax, and use Windows-style path separators.

checkout: self vs named repositories. When your pipeline YAML lives in the same repo as your training code, always use checkout: self. A named repository checkout pulls the default branch, not the feature branch you're testing — leading to stale code running in your pipeline.

Start with the training script, not the pipeline. Get your training script working locally first. The pipeline is just orchestration — if the script doesn't work on your machine, it won't work in the pipeline either.

Pin your dependencies. Use a requirements.txt with pinned versions rather than inline pip install with unpinned packages. A scikit-learn minor version bump can change model behavior silently.

Deploying to a Managed Online Endpoint

Registering the model in the Azure ML Registry makes it discoverable. But for real-time scoring — where an API, dashboard, or another service sends data and gets predictions back — you need to deploy the model to a Managed Online Endpoint.

Azure ML Managed Online Endpoints handle the infrastructure: provisioning compute, load balancing, scaling, health probes, and rolling deployments. You provide the model and a scoring script.

HTTP Request (JSON) → Managed Online Endpoint → Deployment (blue) → score.py [init() / run()] + model.pkl → JSON Response (predictions)

Key concepts:

  • An endpoint is the HTTPS URL that clients call. It has auth (key or AAD token) and a DNS name.
  • A deployment sits behind the endpoint and runs your scoring code + model on provisioned compute.
  • You can have multiple deployments (e.g., blue and green) behind one endpoint for A/B testing or canary rollouts, controlled by traffic splitting.
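From the client's side, the endpoint is just an HTTPS API. A minimal sketch of building the request with key auth; the `{"input_data": [...]}` payload shape matches what the scoring script in the next section parses, and the field names in the usage note are hypothetical.

```python
# Build a scoring request for a managed online endpoint (key auth).
import json
import urllib.request

def build_scoring_request(scoring_uri: str, api_key: str, rows: list) -> urllib.request.Request:
    body = json.dumps({"input_data": rows}).encode("utf-8")
    return urllib.request.Request(
        scoring_uri,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # endpoint key or AAD token
        },
    )

# Usage (requires a live endpoint and its key):
# req = build_scoring_request(
#     "https://<your-endpoint-name>.<region>.inference.ml.azure.com/score",
#     api_key, [{"amount": 12.5, "quantity": 3}])
# print(urllib.request.urlopen(req).read().decode())
```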

The Scoring Script

The scoring script is the glue between the endpoint and your pickle bundle. Azure ML calls init() once when the container starts, and run() on every incoming request.

# score.py — deployed alongside the model
import json
import pickle
import os
import pandas as pd

def init():
    """Called once when the endpoint container starts."""
    global model_bundle
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "model_bundle.pkl")
    with open(model_path, "rb") as f:
        model_bundle = pickle.load(f)
    print(f"Model loaded. Trained at: {model_bundle['metadata']['trained_at']}")
    print(f"Expected features: {model_bundle['feature_order']}")

def run(raw_data):
    """Called on every scoring request."""
    try:
        data = json.loads(raw_data)
        df = pd.DataFrame(data["input_data"])

        # Enforce feature order from the bundle
        df = df[model_bundle["feature_order"]]

        # Apply the same scaler used during training
        scaled = model_bundle["scaler"].transform(df)

        # Predict
        predictions = model_bundle["model"].predict(scaled)

        return json.dumps({
            "predictions": predictions.tolist(),
            "sklearn_version": model_bundle["metadata"].get("scikit_learn_version", "unknown"),
        })
    except KeyError as e:
        return json.dumps({"error": f"Missing expected column: {e}"})
    except Exception as e:
        return json.dumps({"error": str(e)})

Key things to notice:

  • AZUREML_MODEL_DIR — Azure ML automatically downloads the model artifact from the registry and sets this environment variable to the local path. You never deal with storage URLs in scoring code.
  • Feature order enforcement — df[model_bundle["feature_order"]] ensures columns are in the exact order the model was trained on, even if the caller sends them in a different order.
  • Same scaler — The StandardScaler from the bundle is reused, so the numerical scaling matches training exactly. This is why we bundle the scaler with the model.
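Before deploying, it's worth exercising the run() contract locally. This toy sketch (synthetic data and hypothetical feature names, standing in for the real bundle) replays the body of run(): parse the JSON, reorder columns, apply the scaler, predict.

```python
# Local check of the scoring contract, no Azure required.
import json

import pandas as pd
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

# Fit a toy bundle (stands in for the pickle produced by the Train stage)
train_df = pd.DataFrame({"amount": [10.0, 12.0, 11.0, 500.0],
                         "quantity": [1.0, 2.0, 1.0, 40.0]})
scaler = StandardScaler().fit(train_df)
bundle = {
    "model": IsolationForest(random_state=0).fit(scaler.transform(train_df)),
    "scaler": scaler,
    "feature_order": ["amount", "quantity"],
}

# Replay what run() does per request; keys are deliberately out of order
raw = json.dumps({"input_data": [{"quantity": 3.0, "amount": 12.5}]})
df = pd.DataFrame(json.loads(raw)["input_data"])[bundle["feature_order"]]
preds = bundle["model"].predict(bundle["scaler"].transform(df))
print(preds.tolist())  # each entry is 1 (inlier) or -1 (anomaly)
```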

The Deploy Stage in the Pipeline

Add this stage after the Register stage. The endpoint itself is created with plain CLI parameters. az ml online-deployment create, however, reads its configuration from a YAML spec passed via --file, so the deployment step writes a small spec inline (a PowerShell here-string) instead of checking in a separate config file:

  - stage: Deploy
    dependsOn: Register
    displayName: 'Deploy to Managed Endpoint'
    jobs:
      - job: DeployModel
        steps:
          - checkout: self                   # to access score.py

          - task: AzureCLI@2
            displayName: 'Create or update endpoint'
            inputs:
              azureSubscription: '<your-service-connection>'
              scriptType: 'ps'
              scriptLocation: 'inlineScript'
              inlineScript: |
                az extension add -n ml --yes

                # Create the endpoint if it doesn't exist (idempotent)
                $exists = az ml online-endpoint show `
                  --name <your-endpoint-name> `
                  --resource-group <your-resource-group> `
                  --workspace-name <your-workspace> 2>$null

                if (-not $exists) {
                  az ml online-endpoint create `
                    --name <your-endpoint-name> `
                    --auth-mode key `
                    --resource-group <your-resource-group> `
                    --workspace-name <your-workspace>
                }

          - task: AzureCLI@2
            displayName: 'Deploy model to endpoint'
            inputs:
              azureSubscription: '<your-service-connection>'
              scriptType: 'ps'
              scriptLocation: 'inlineScript'
              inlineScript: |
                az extension add -n ml --yes

                # Write the deployment spec inline (single-quoted here-string,
                # so $schema is not expanded)
                @'
                $schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
                name: blue
                endpoint_name: <your-endpoint-name>
                model: azureml://registries/<your-ml-registry>/models/<your-model-name>/versions/<version-number>
                code_configuration:
                  code: ./scoring
                  scoring_script: score.py
                environment:
                  image: mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu22.04:latest
                instance_type: Standard_DS3_v2
                instance_count: 1
                '@ | Set-Content deployment.yml

                az ml online-deployment create `
                  --file deployment.yml `
                  --resource-group <your-resource-group> `
                  --workspace-name <your-workspace> `
                  --all-traffic

          - task: AzureCLI@2
            displayName: 'Smoke test the endpoint'
            inputs:
              azureSubscription: '<your-service-connection>'
              scriptType: 'ps'
              scriptLocation: 'inlineScript'
              inlineScript: |
                az extension add -n ml --yes

                # Send a test request to verify the deployment is healthy
                az ml online-endpoint invoke `
                  --name <your-endpoint-name> `
                  --resource-group <your-resource-group> `
                  --workspace-name <your-workspace> `
                  --request-file scoring/sample-request.json

Version pinning is critical. The scikit-learn version in your scoring environment must match the version used during training. Pickle deserialization can fail or produce wrong results if the versions differ.

Deploy Stage Placeholder Reference

  • <your-endpoint-name>: unique, DNS-safe endpoint name (e.g., anomaly-scoring-endpoint)
  • <your-workspace>: Azure ML Workspace name (e.g., ml-workspace-prod)

Complete Pipeline — All Four Stages

Here's the full pipeline structure showing how Train, Register, and Deploy connect:

stages:
  - stage: DevOps          # Gate
  - stage: Train            # Train model → publish pickle artifact
    dependsOn: DevOps
  - stage: Register         # Register pickle in Azure ML Registry
    dependsOn: Train
  - stage: Deploy           # Deploy to Managed Online Endpoint
    dependsOn: Register
    # Optional: add a manual approval gate here
    # condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))

Each stage is independently retriable. If Deploy fails, you don't retrain or re-register — you just redeploy.

Extending This Template

Once the base pipeline is working, consider these additions:

  • Model validation stage — Add a stage between Register and Deploy that runs the model against a holdout set and gates deployment on a minimum performance threshold.
  • Batch scoring pipeline — A separate pipeline or Azure Function loads the model from the registry and scores large datasets on a schedule using Azure ML Batch Endpoints.
  • Monitoring — Use Azure ML model monitoring to track data drift and prediction distributions over time. Trigger retraining automatically when drift exceeds a threshold.
  • Multi-environment promotion — Register to a dev registry first, deploy to a staging endpoint, run integration tests, then promote to production.
  • A/B testing — Use traffic splitting to evaluate a new model version against the current one on live traffic before committing.

Conclusion

An end-to-end MLOps pipeline doesn't need to be complex. The core pattern is:

  1. Train — Run the training script, serialize the model bundle
  2. Register — Push to Azure ML Registry with automatic versioning
  3. Deploy — Create/update a Managed Online Endpoint with the new version
  4. Score — Clients call a standard HTTPS API, the endpoint handles scaling

The value comes from making this repeatable and removing manual steps. Every push to main trains a fresh model, registers it, and deploys it to a live endpoint — with a rollback path through blue-green deployments if anything goes wrong.

Copy this template, replace the <your-...> placeholders, write your training script and scoring script, and you have a production-grade MLOps pipeline. The structure stays the same regardless of whether you're deploying an anomaly detector, a classifier, or a regression model.

Updated Apr 09, 2026
Version 1.0