At the release of Windows Server 2019 last year, we announced support for a set of hardware devices in Windows containers. One popular type of device was missing from that set: GPUs. We’ve heard frequent feedback that you want hardware acceleration for your Windows container workloads, so today, we’re pleased to announce the first step on that journey: starting in Windows Server 2019, we now support GPU acceleration for DirectX-based apps and frameworks in Windows containers.
The best part is, you can use the Windows Server 2019 build you have today—no new OS patches or configuration is necessary. All you need is a new build of Docker and the latest display drivers. Read on for detailed requirements and to learn how you can get started with GPU accelerated DirectX in Windows containers today.
Background: Why GPU acceleration?
Containers are an excellent tool for packaging and deploying many kinds of workloads. For many of these, traditional CPU compute resources are sufficient. However, for a certain class of workload, the massively parallel compute power offered by GPUs (graphics processing units) can speed up operations by orders of magnitude, bringing down cost and improving throughput immensely.
GPUs are already a common tool for many popular workloads, from traditional rendering and simulation to machine learning training and inference. With today’s announcement, we’re unlocking new app scenarios for Windows containers and enabling more applications to be successfully shifted into Windows containers.
GPU-accelerated DirectX, Windows ML, and more
For some users, DirectX conjures associations with gaming. But DirectX is about more than games—it also powers a large ecosystem of multimedia, design, computation, and simulation frameworks and applications.
As we looked at adding GPU support to Windows containers, it was clear that starting with the DirectX APIs—the foundation of accelerated graphics, compute, and AI on Windows—was a natural first step.
By enabling GPU acceleration for DirectX, we’ve also enabled GPU acceleration for the frameworks built on top of it. One such framework is Windows ML, a set of APIs providing fast and efficient AI inferencing capabilities. With GPU acceleration in Windows containers, developers now have access to a first-class inferencing runtime that can be accelerated across a broad set of capable GPU acceleration hardware.
On a system meeting the requirements (see below), start a container with hardware-accelerated DirectX support by specifying the --device option at container runtime, as follows:
docker run --isolation process --device class/5B45201D-F2F2-4F3B-85BB-30FF1F953599 <your Docker image>
Note that this does not assign GPU resources exclusively to the container, nor does it prevent GPU access on the host. Rather, GPU resources are scheduled dynamically across the host and containers in much the same way as they are scheduled among apps running on your personal device today. You can have several Windows containers running on a host, each with hardware-accelerated DirectX capabilities.
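Because the GPU is shared rather than assigned exclusively, you can, for example, start two accelerated containers side by side. A sketch (the image names here are placeholders, not real images):

```shell
:: Both containers share the host GPU; neither gets exclusive access.
:: "render-app:1809" and "winml-infer:1809" are placeholder image names.
docker run -d --isolation process --device class/5B45201D-F2F2-4F3B-85BB-30FF1F953599 render-app:1809
docker run -d --isolation process --device class/5B45201D-F2F2-4F3B-85BB-30FF1F953599 winml-infer:1809
```

The same --device class GUID is passed to each container; the host's GPU scheduler arbitrates between them, just as it does for apps on the host.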
For this feature to work, your environment must meet the following requirements:
- The container host must be running Windows Server 2019 or Windows 10, version 1809 or newer.
- The container base image must be mcr.microsoft.com/windows:1809 or newer. Windows Server Core and Nano Server container images are not currently supported.
- The container must be run in process isolation mode. Hyper-V isolation mode is not currently supported.
- The container host must be running Docker Engine 19.03 or newer.
- The container host must have a GPU running display drivers version WDDM 2.5 or newer.
To check the WDDM version of your display drivers, run the DirectX Diagnostic Tool (dxdiag.exe) on your container host. In the tool’s “Display” tab, the “Drivers” section reports the driver model (for example, “Driver Model: WDDM 2.5”).
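If you prefer the command line, you can also export the dxdiag report to a text file and search it. A sketch (note that dxdiag returns before it finishes writing the file, so the report may take a few seconds to appear):

```shell
:: Save the DirectX diagnostics to a text file, then search for the driver model.
dxdiag /t %TEMP%\dxdiag.txt
:: Wait for the report to finish writing, then:
findstr /C:"Driver Model" %TEMP%\dxdiag.txt
```

Each display adapter in the report should show a "Driver Model" line; you need WDDM 2.5 or newer on the GPU you intend to use.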
Operating system support for this feature is already complete and broadly available as part of Windows Server 2019 and Windows 10, version 1809. Formal Docker support is scheduled for the upcoming Docker EE Engine 19.03 release. Until then, if you’re eager to try out the feature early, you can check out our sample on GitHub and follow the README instructions to get started. We’ll show you how to acquire a nightly build of Docker and use it to run a containerized Windows ML inferencing app with GPU acceleration.
We look forward to getting your feedback on this experience. Please leave a comment below or tweet us with your thoughts. What are the next things you’d like to be able to do with GPU acceleration in containers on Windows?