You may want to run lots of computations, such as a parameter study or Monte-Carlo run. Running a high-performance computing simulation that needs high-bandwidth, low-latency (under 3 microseconds) supercomputer networking to scale to hundreds of cores is uniquely possible on Microsoft Azure. So you can have whatever cluster you want: lots of basic machines, or even true HPC with InfiniBand and GPUs. The big advantage is not having to wait in any queues, as you so often do for science and engineering simulations. It's easy to deploy using our Azure Resource Manager templates, and you can find our detailed walkthrough for this on GitHub. Read about how Simon O'Hanlon at Imperial College London was able to speed up his genomics research with Azure.
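As a rough illustration of what such a deployment looks like, here is a small Python driver around the Azure CLI. The resource group, region, and template file names are placeholders, not the ones from the GitHub walkthrough, and a recent `az` CLI is assumed.

```python
import subprocess

# Hypothetical names -- substitute your own resource group, region, and
# the template/parameter files from the walkthrough on GitHub.
RESOURCE_GROUP = "hpc-cluster-rg"
LOCATION = "westeurope"
TEMPLATE = "azuredeploy.json"
PARAMETERS = "azuredeploy.parameters.json"

def run(cmd):
    """Run an Azure CLI command and raise if it fails."""
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Create the resource group that will hold the cluster.
run(["az", "group", "create", "--name", RESOURCE_GROUP, "--location", LOCATION])

# Deploy the Resource Manager template that describes the cluster.
run([
    "az", "deployment", "group", "create",
    "--resource-group", RESOURCE_GROUP,
    "--template-file", TEMPLATE,
    "--parameters", f"@{PARAMETERS}",
])
```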
Azure Batch is even more powerful, delivering a true HPC-as-a-Service model: you wrap your application with a simple template and run your HPC job without worrying about cluster management at all. You simply specify the job (e.g. input, output, number of cores), hit go, and get the results back. This is ideal if you run the same application over and over again, because it gives you immediate access to a personal cluster without any of the cluster management.
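A minimal sketch of that "specify the job and hit go" workflow with the azure-batch Python SDK might look like the following. The account name, key, URL, pool id, and solver command line are all placeholders, and an existing pool of compute nodes is assumed.

```python
from azure.batch import BatchServiceClient
from azure.batch.batch_auth import SharedKeyCredentials
import azure.batch.models as batchmodels

# Hypothetical account details -- an RDMA-capable pool named "hpc-a9-pool"
# is assumed to exist already.
credentials = SharedKeyCredentials("mybatchaccount", "<account-key>")
client = BatchServiceClient(
    credentials, "https://mybatchaccount.westeurope.batch.azure.com"
)

# A job is just a container for tasks, bound to a pool of compute nodes.
client.job.add(batchmodels.JobAddParameter(
    id="cfd-sweep",
    pool_info=batchmodels.PoolInformation(pool_id="hpc-a9-pool"),
))

# Each task carries its own command line; input, output, and core count are
# encoded in the arguments of this hypothetical solver wrapper script.
client.task.add(
    job_id="cfd-sweep",
    task=batchmodels.TaskAddParameter(
        id="run-001",
        command_line="bash run_solver.sh case001.input case001.output 128",
    ),
)
```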
ANSYS CFD and Microsoft Azure deliver excellent HPC scalability in the cloud
"ANSYS and Microsoft Azure have been working closely on a Proof of Concept with a large customer to run ANSYS CFD workload on Azure,”
said Ray Milhem, Vice President of enterprise solutions at ANSYS.
“The POC proved very successful and the data showed excellent scalability running ANSYS CFD up to 1024 cores.”
Over the past several months we have worked closely with the CFD solver team at ANSYS to make sure the ANSYS CFD code runs successfully on the Azure Linux RDMA stack and scales up to thousands of cores. Azure delivers a high level of scalability and performance with ANSYS CFD because of its dedicated high-speed, low-latency network fabric, which uses remote direct memory access (RDMA) over InfiniBand. RDMA provides very low latency, close to three microseconds, and 32 Gbps of bandwidth, which translates into strong scaling for large CFD jobs in the Azure cloud.
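For context, launching an MPI solver over that fabric typically comes down to pointing Intel MPI at the RDMA (DAPL) provider before calling mpirun. The sketch below shows the idea; the fabric settings reflect the values commonly documented for Intel MPI on Azure Linux RDMA at the time, and the host file, core count, and solver binary are placeholders.

```python
import os
import subprocess

# Fabric settings commonly documented for Intel MPI on Azure Linux RDMA
# (A8/A9-era guidance); treat them as a starting point, not gospel.
env = os.environ.copy()
env.update({
    "I_MPI_FABRICS": "shm:dapl",          # shared memory within a node, DAPL/RDMA across nodes
    "I_MPI_DAPL_PROVIDER": "ofa-v2-ib0",  # the InfiniBand DAPL provider exposed in the VM
    "I_MPI_DYNAMIC_CONNECTION": "0",
})

# Placeholder host file and solver command line.
subprocess.run(
    ["mpirun", "-n", "256", "-hostfile", "hosts.txt", "./cfd_solver", "case.def"],
    env=env,
    check=True,
)
```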
We ran multiple simulation models from ANSYS CFD version 16.2 on CentOS 6.5 with Intel MPI on Azure A9 instances (Intel E5-2670 at 2.6 GHz, 112 GB of 1600 MHz DDR3 memory, QDR InfiniBand). The three models were Combustor (71 million cells), F1 Race Car (140 million cells), and Open Race Car (280 million cells). You can clearly see the scaling numbers for the F1 Race Car model in the diagram above.
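If you want to turn wall-clock timings like the ones behind that diagram into a comparable scaling figure, the usual metric is speedup and parallel efficiency relative to the smallest run. A short sketch with made-up timings (not the measured ANSYS CFD results):

```python
# Hypothetical wall-clock times (seconds) per core count -- illustrative only,
# not the measured numbers from the chart above.
timings = {128: 1000.0, 256: 520.0, 512: 275.0, 1024: 150.0}

base_cores = min(timings)
base_time = timings[base_cores]

for cores, t in sorted(timings.items()):
    speedup = base_time / t                    # relative to the smallest (128-core) run
    efficiency = speedup * base_cores / cores  # 1.0 means perfect linear scaling
    print(f"{cores:5d} cores  speedup {speedup:5.2f}x  efficiency {efficiency:5.1%}")
```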
We have been working closely with CD-adapco over the past several months and have seen excellent performance and scalability of STAR-CCM+ on the Microsoft Azure Linux RDMA platform (A9 virtual machines). Azure delivers a high level of scalability and performance with STAR-CCM+ because of its dedicated high-speed, low-latency network fabric, which uses RDMA and InfiniBand technology and is available only on A8 and A9 instances. These sizes provide RDMA virtualized through Hyper-V, with latency close to three microseconds and 32 Gbps of bandwidth, which delivers great scaling performance. You can see in the chart above how STAR-CCM+ scaled well up to 1,024 cores.
"We are happy to see the results of STAR-CCM+ performance testing on Microsoft Azure platform using A9 virtual machines and Linux RDMA technology. The results show good scaling up to 1,024 cores for STAR-CCM+®. This performance accelerates the pace of product design cycles using simulation and helps engineers discover better designs, faster."
- Keith Foston, product manager, STAR-CCM+
An UberCloud Experiment: OpenFOAM Modelling on Azure
The OpenFOAM Modelling and Product Optimization of Dry-Type Transformers in the Cloud case study optimizes dry-type transformers, using OpenFOAM to simulate the heat transfer of a dry-type transformer unit with different dimensions. In this way, the temperature rises can be evaluated and compared, pointing the way to a transformer design optimized for thermal performance.
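A dimensional sweep like that is usually driven by a small script that clones a template case, substitutes the geometry parameter, rebuilds the mesh, and runs the solver. The sketch below is only an illustration of that pattern: the template case directory, parameter name, swept values, and solver are placeholders, not the actual setup used in the UberCloud experiment.

```python
import shutil
import subprocess
from pathlib import Path

# Hypothetical template case and cooling-duct widths (metres) to sweep; the
# real UberCloud case setup and parameter values are not reproduced here.
TEMPLATE_CASE = Path("transformerCase.template")
DUCT_WIDTHS = [0.008, 0.010, 0.012]

for width in DUCT_WIDTHS:
    case = Path(f"run_width_{width:.3f}")
    shutil.copytree(TEMPLATE_CASE, case)

    # Substitute the swept dimension into a parameter file included by
    # blockMeshDict (a common way to parameterise OpenFOAM geometry).
    params = case / "system" / "geometryParameters"
    params.write_text(f"ductWidth {width};\n")

    # Mesh and solve; the solver name stands in for whichever conjugate
    # heat transfer solver the study actually used.
    subprocess.run(["blockMesh", "-case", str(case)], check=True)
    subprocess.run(["buoyantSimpleFoam", "-case", str(case)], check=True)
```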
The computations were performed on a 10-node cluster in the Azure cloud, where eight compute nodes were each equipped with dual-socket Intel® Xeon® E5-2670 CPUs, giving a total of 128 cores and 1 TB of RAM. The nodes were connected by the 32 Gbps remote direct memory access (RDMA) network.