Reference architectures provide a consistent approach and best practices for a given solution. Each architecture includes recommended practices, along with considerations for scalability, availability, manageability, and security. This architecture includes a deployable solution as well. The full array of reference architectures is available on the
Azure Architecture Center
This reference architecture shows how to deploy Python models as web services to make real-time predictions. Two scenarios are covered: deploying regular Python models, and the specific requirements of deploying deep learning models. Both scenarios use the architecture shown.
This architecture consists of the following components:
(VM) is shown as an example of a device—local or in the cloud—that can send an HTTP request.
Azure Kubernetes Service
(AKS) is used to deploy the application on a Kubernetes cluster. AKS simplifies the deployment and operations of Kubernetes. The cluster can be configured using CPU-only VMs for regular Python models or GPU-enabled VMs for deep learning models.
, provisioned by AKS, is used to expose the service externally. Traffic from the load balancer is directed to the back-end pods.
is used to store the Docker image that is deployed on Kubernetes cluster. Docker Hub was chosen for this architecture because it's easy to use and is the default image repository for Docker users.
Azure Container Registry
can also be used for this architecture.