Azure High Performance Computing (HPC) Blog

Scaling physics-based digital twins: Neural Concept on Azure delivers a New Record in Industrial AI

lmiroslaw
Jan 12, 2026

Neural Concept, the AI-first engineering platform, achieved state-of-the-art accuracy on MIT’s DrivAerNet++ aerodynamic benchmark. Using Microsoft Azure HPC & AI infrastructure, the platform turned 39 TB of CFD data into an end-to-end, production-ready workflow in one week, outperforming all previously published methods.

Automotive Design and the DrivAerNet++ Benchmark

In automotive design, external aerodynamics have a direct impact on performance, energy efficiency, and development cost. Even small reductions in drag can translate into significant fuel savings or extended EV range. As development timelines accelerate, engineering teams increasingly rely on data-driven methods to augment or replace traditional CFD workflows.

MIT’s DrivAerNet++ dataset is the largest open multimodal dataset for automotive aerodynamics, offering a large-scale benchmark for evaluating learning-based approaches that capture the physical signals required by engineers. It includes 8,000 vehicle geometries across 3 variants (fastback, notchback and estate-back) and aggregates 39 TB of high-fidelity CFD outputs such as surface pressure, wall shear stress, volumetric flow fields, and drag coefficients.

Benchmark Highlights

Neural Concept trained its geometry-native Geometric Regressor, designed to handle any type of engineering data. The benchmark was executed on Azure HPC infrastructure to evaluate the capabilities of the geometry-native platform under transparent, scalable, and fully reproducible conditions.

  • Surface pressure: Lowest prediction error recorded on the benchmark, revealing where high- and low-pressure zones form.
  • Wall shear stress: Outperforming all competing methods to detect flow attachment and separation for drag and stability control.
  • Volumetric velocity field: More than 50% lower error than previous best, capturing full flow structure for wake stability analysis.
  • Drag coefficient Cd: R² of 0.978 on the test set, accurate enough for early design screening without full CFD runs. 
  • Dataset Scale and Ingestion: 39 TB of data was ingested into Neural Concept’s platform through a parallel conversion task (128 workers, 5 GB of RAM each) that finished in about one hour and produced a compact 3 TB dataset in the platform’s native format.
  • Data Pre-Processing: Pre-processing the dataset required both large-scale parallelization and the application of our domain-specific best practices for handling external aerodynamics workflows.
  • Model Training and Deployment: Training completed in 24 hours on 4 A100 GPUs, with the best model obtained after 16 hours. The final model is compact and real-time predictions can be served on a single 16 GB GPU for industrial use.

Neural Concept outperformed all competing methods, achieving state-of-the-art prediction accuracy on every metric and physical quantity within a week:

“Neural Concept’s breakthrough demonstrates the power of combining advanced AI with the scalability of Microsoft Azure,” said Jack Kabat, Partner, Azure HPC and AI Infrastructure Products, Microsoft. “By running training and deployment on Azure’s high-performance infrastructure — specifically the NC A100 Virtual Machine — Neural Concept was able to transform 39 terabytes of data into a production-ready workflow in just one week. This shows how Azure accelerates innovation and helps automotive manufacturers bring better products to market faster.”

For additional benchmark metrics and comparisons, please refer to the Detailed Quantitative Results section at the end of the article.

From State-Of-The-Art Benchmark Accuracy to Proven Industrial Impact

Model accuracy alone is necessary, but not sufficient, for industrial impact. Transformative gains at scale and over time only appear once high-performing models are deployed into maintainable, repeatable workflows across organizations.

Customers using Neural Concept’s platform have achieved:

  • 30% shorter design cycles
  • $20M in savings on a 100,000-unit vehicle program

These outcomes fundamentally result from a transformed, systematic approach to design, unlocking better and faster data-driven decisions. The Design Lab interface, described in the next section, is at the core of this transformation. 

Within Neural Concept’s ecosystem, validated geometry and physics models can be deployed directly into the Design Lab, a collaborative environment where aerodynamicists and designers evaluate concepts in real time. AI copilots provide instant performance feedback, geometry-aware improvement suggestions, and live KPI updates, effectively reconnecting aerodynamic analysis with the pace of modern vehicle design.

CES 2026: See how OEMs are transforming product development with Engineering Intelligence

Neural Concept and Microsoft will showcase how AI-native aerodynamic workflows can reshape vehicle development — from real-time design exploration to enterprise-scale deployment. Visit the Microsoft booth to see DrivAerNet++ running on Azure HPC and meet the teams shaping the future of automotive engineering.


Neural Concept’s executive team will also be at CES to share flagship results achieved by leading OEMs and Tier-1 suppliers already using the platform in production. Learn more at https://www.neuralconcept.com/ces-2026


Credits
Microsoft: Hugo Meiland (Principal Program Manager), Guy Bursell (Director Business Strategy, Manufacturing), Fernando Aznar Cornejo (Product Marketing Manager) and Dr. Lukasz Miroslaw (Sr. Industry Advisor)

Neural Concept: Theophile Allard (CTO), Benoit Guillard (Senior ML Research Scientist), Alexander Gorgin (Product Marketing Engineer), Konstantinos Samaras-Tsakiris (Software Engineer)

Detailed Quantitative Results

In the sections that follow, we share the results obtained by applying Neural Concept’s aerodynamics predictive model training template to DrivAerNet++.

We evaluated our model’s prediction errors using the official train/test split and the standard evaluation strategy. For comparison, metrics from other methods were taken from the public leaderboard. We reported both Mean Squared Error (MSE) and Mean Absolute Error (MAE) to quantify prediction accuracy. Lower values for either metric indicate closer agreement with the ground truth simulations, meaning better predictions.
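For intuition on how the two metrics behave, the sketch below computes MSE and MAE for a predicted field with NumPy. It is a minimal stand-in for the benchmark's evaluation code, not the official script, and the toy values are invented for illustration.

```python
import numpy as np

def field_errors(pred: np.ndarray, truth: np.ndarray) -> tuple[float, float]:
    """Return (MSE, MAE) between a predicted field and the CFD ground truth.

    Both arrays hold one value per surface node (or per volume sample);
    lower values mean closer agreement with the simulation.
    """
    diff = pred - truth
    mse = float(np.mean(diff ** 2))     # squared error: penalizes outliers quadratically
    mae = float(np.mean(np.abs(diff)))  # average absolute deviation
    return mse, mae

# Toy example: pressure values at four surface nodes.
truth = np.array([0.20, -0.50, 0.10, 0.40])
pred = np.array([0.25, -0.45, 0.05, 0.40])
mse, mae = field_errors(pred, truth)  # mse ≈ 0.0019, mae ≈ 0.0375
```

MSE amplifies large local errors (e.g. a badly predicted separation zone), while MAE reflects the typical pointwise deviation, which is why the leaderboard reports both.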

1. Surface Field Predictions: Pressure and Wall Shear Stress   

We began by evaluating predictions for the two physical quantities defined on the vehicle surface.

Surface Pressure

The Geometric Regressor achieved substantially better performance than all existing methods in predicting surface pressure distribution.

 

Rank  Deep Learning Model         MSE (×10⁻², lower = better)  MAE (×10⁻¹, lower = better)
#1    Neural Concept              3.98                         1.08
#2    GAOT (May 2025)             4.94                         1.10
#3    FIGConvNet (February 2025)  4.99                         1.22
#4    TripNet (March 2025)        5.14                         1.25
#5    RegDGCNN (June 2024)        8.29                         1.61

Table 1: Neural Concept’s Geometric Regressor predicts surface pressure more accurately than previously published state-of-the-art methods. The dates indicate when the competing model architectures were published.

 

Figure 1: Side-by-side comparison of the ground truth pressure field (left), Neural Concept model’s prediction (middle), and the corresponding error for a representative test sample (right).

 

 

Wall Shear Stress

Similarly, the model delivered top-tier results, outperforming all competing methods.

Rank  Deep Learning Model         MSE (×10⁻², lower = better)  MAE (×10⁻¹, lower = better)
#1    Neural Concept              7.80                         1.44
#2    GAOT (May 2025)             8.74                         1.57
#3    TripNet (March 2025)        9.52                         2.15
#4    FIGConvNet (February 2025)  9.86                         2.22
#5    RegDGCNN (June 2024)        13.82                        3.64

Table 2: Neural Concept’s Geometric Regressor predicts wall shear stress more accurately than previously published state-of-the-art methods.

 

Figure 2: Side-by-side comparison of the ground truth magnitude of the wall shear stress, Neural Concept model’s prediction, and the corresponding error for a representative test sample.

Across both surface fields (pressure and wall shear stress), the Geometric Regressor achieved the lowest MSE and MAE by a clear margin. The baselines represent several high-quality, recent academic works (the earliest from June 2024), yet our architecture established a new state of the art in predictive performance.

2. Volumetric Predictions: Velocity 

Beyond surface quantities, DrivAerNet++ provides 3D velocity fields in the flow volume surrounding the vehicle, which we also predicted using the Geometric Regressor.

 

Rank  Deep Learning Model   MSE (lower = better)  MAE (×10⁻¹, lower = better)
#1    Neural Concept        3.11                  9.22
#2    TripNet (March 2025)  6.71                  15.2

Table 3: Neural Concept’s Geometric Regressor predicts velocity more accurately than the previously published state-of-the-art method.

The illustration below shows the velocity magnitude for two test samples. Note that only a single 2D slice of the 3D volumetric domain is shown here, focusing on the wake region behind the car. In practice, the network predicts velocity at any location within the full 3D domain, not just on this slice.  

 

 

Figure 3: Velocity magnitude for two test samples, arranged in two columns (left and right). For each sample, the top row displays the simulated velocity field, the middle row shows the prediction from the network, and the bottom row presents the error between the two.

 

3. Scalar Predictions: Drag Coefficient

The drag coefficient (Cd) is the most critical parameter in automotive aerodynamics, as reducing it directly translates to lower fuel consumption in combustion vehicles and increased range in electric vehicles. Using the same underlying architecture, our model achieved state-of-the-art performance in Cd prediction.

In addition to MSE and MAE, we reported the Maximum Absolute Error (Max AE) to reflect worst-case accuracy. We also included the Coefficient of Determination (R² score), which measures the proportion of variance explained by the model. An R² value of 1 indicates a perfect fit to the target data.
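A minimal NumPy sketch of these two scalar metrics follows, assuming per-design Cd predictions and ground-truth values as plain arrays; it is illustrative, not the official evaluation script, and the sample numbers are invented.

```python
import numpy as np

def cd_metrics(pred: np.ndarray, truth: np.ndarray) -> dict[str, float]:
    """Worst-case and goodness-of-fit metrics for drag-coefficient prediction."""
    err = pred - truth
    max_ae = float(np.max(np.abs(err)))                  # worst single-design error
    ss_res = float(np.sum(err ** 2))                     # residual sum of squares
    ss_tot = float(np.sum((truth - truth.mean()) ** 2))  # total variance in the data
    r2 = 1.0 - ss_res / ss_tot                           # 1.0 = perfect fit
    return {"max_ae": max_ae, "r2": r2}

# Invented example: predicted vs. simulated Cd for four test designs.
truth = np.array([0.30, 0.32, 0.28, 0.35])
pred = np.array([0.30, 0.31, 0.29, 0.34])
metrics = cd_metrics(pred, truth)
```

Max AE captures the single worst design in the test set, which matters when a model is used to screen candidates, while R² summarizes how much of the Cd variation across designs the model explains.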

 

Rank  Deep Learning Model  MSE (×10⁻⁵)  MAE (×10⁻³)  Max AE (×10⁻²)  R² (higher = better)
#1    Neural Concept       0.8          2.22         1.13            0.978
#2    TripNet              9.1          7.19         7.70            0.957
#3    PointNet             14.9         9.60         12.45           0.643
#4    RegDGCNN             14.2         9.31         12.79           0.641
#5    GCNN                 17.1         10.43        15.03           0.596

Table 4: Neural Concept’s Geometric Regressor predicts the drag coefficient more accurately than previously published methods. For MSE, MAE, and Max AE, lower is better.

On the official split, the model shows tight agreement with CFD (R² of 0.978) across the test set, which is sufficient for early design screening where engineers need to rank variants confidently and spot meaningful gains without running full simulations for every change.
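To make that screening use case concrete, the sketch below ranks candidate designs by predicted drag gain over a baseline and keeps only gains larger than the model's benchmark MAE on Cd (~2.2 × 10⁻³), used here as a rough noise floor. The variant names and numbers are invented for illustration; this is not part of Neural Concept's platform.

```python
def screen_designs(cd_pred: dict[str, float], baseline: str,
                   noise_floor: float = 2.2e-3) -> list[str]:
    """Rank design variants by predicted Cd gain over a baseline and keep
    only those whose gain exceeds the model's typical error (noise floor).

    cd_pred maps a design name to its predicted drag coefficient; the
    default noise floor is the benchmark MAE of ~2.2e-3 on Cd.
    """
    base = cd_pred[baseline]
    gains = {name: base - cd for name, cd in cd_pred.items() if name != baseline}
    ranked = sorted(gains.items(), key=lambda item: item[1], reverse=True)
    return [name for name, gain in ranked if gain > noise_floor]

# Invented example: three variants screened against a baseline car.
predictions = {"baseline": 0.300, "spoiler_v2": 0.290,
               "diffuser_v1": 0.299, "mirror_v3": 0.305}
shortlist = screen_designs(predictions, "baseline")
```

Here only the variant whose predicted improvement clearly exceeds the model's error survives the screen; marginal or negative changes are set aside rather than sent to full CFD.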

4. Compute Efficiency and Azure HPC & AI Collaboration

Executing the full DrivAerNet++ benchmark at industrial scale required Neural Concept’s full software and infrastructure stack combined with seamless cloud integration on Microsoft Azure to dynamically scale computing resources on demand. The entire pipeline runs natively on Microsoft Azure and can scale within minutes, allowing us to process new industrial datasets that contain thousands of geometries without complex capacity planning.    

Dataset Scale and Ingestion

The DrivAerNet++ dataset contains 8,000 car designs along with their corresponding CFD simulations. The raw dataset occupies approximately 39 TB of storage. Generating the simulations required about 3 million CPU hours at MIT’s DeCoDE Lab.

Ingestion into Neural Concept’s platform is the first step of the pipeline.

  • To ingest the raw data, we use a Conversion task that transforms raw files into the platform’s optimized native format.
  • This task was parallelized across 128 workers, each allocated 5 GB of RAM.

As a result, the entire conversion completed in approximately one hour. After converting the relevant data (car geometry, wall shear stress, pressure, and velocity), the full dataset occupies approximately 3 TB in Neural Concept’s native format.
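The fan-out pattern described above can be sketched with Python's standard library. `convert_one` is a hypothetical stand-in for the platform's actual Conversion task, and the default worker count mirrors the 128 used in the benchmark.

```python
from concurrent.futures import ProcessPoolExecutor

def convert_one(raw_path: str) -> str:
    """Hypothetical per-file conversion: parse one raw CFD result and
    write it out in a compact native format (stand-in for the real task)."""
    native_path = raw_path + ".native"
    # ... read raw mesh/field files, compress, write native_path ...
    return native_path

def convert_dataset(raw_files: list[str], workers: int = 128) -> list[str]:
    """Each file converts independently of the others, so the job
    parallelizes trivially across a pool of worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(convert_one, raw_files))
```

Because every simulation result is self-contained, throughput scales almost linearly with the worker count until storage bandwidth becomes the bottleneck.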

Data Pre-Processing

Pre-processing the dataset required both large-scale parallelization and the application of our domain-specific best practices. During this phase, workloads were distributed across multiple compute nodes with peak memory usage reaching approximately 1.5 TB of RAM.

The pre-processing pipeline consists of two main stages. In the first stage, we repaired the car meshes and pre-computed geometric features needed for training. The second stage involved filtering the volumetric domain and re-sampling points to follow a spatial distribution that is more efficient for training our deep learning model.

We scaled the compute resources so that each of the two stages in the pipeline completes in 1 to 3 hours when processing the full dataset. The first stage is the most computationally intensive. To handle it efficiently, we parallelized the task across 256 independent workers, each allocated 6 GB of RAM.
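As an illustration of the second stage's re-sampling idea, the sketch below draws volume points with probability decreasing with distance from the vehicle, so near-body and wake regions (where flow gradients are steepest) are sampled more densely. The 1/(1 + d²) weighting and the function itself are assumptions for illustration, not Neural Concept's actual scheme.

```python
import numpy as np

def resample_volume(points: np.ndarray, n_keep: int,
                    center: np.ndarray, seed: int = 0) -> np.ndarray:
    """Sub-sample volumetric sample points, biased toward the region
    near `center` (e.g. the vehicle body and its wake).

    The 1/(1 + d^2) weight is an illustrative spatial distribution,
    not the platform's actual one.
    """
    rng = np.random.default_rng(seed)
    d = np.linalg.norm(points - center, axis=1)   # distance to the vehicle
    weights = 1.0 / (1.0 + d ** 2)                # denser sampling close by
    weights /= weights.sum()
    idx = rng.choice(len(points), size=n_keep, replace=False, p=weights)
    return points[idx]

# Invented example: keep 200 of 2,000 uniformly scattered volume points.
pts = np.random.default_rng(1).uniform(-5.0, 5.0, size=(2000, 3))
sub = resample_volume(pts, 200, np.zeros(3))
```

A distribution like this lets the training set spend its point budget where the flow field varies most, instead of wasting samples in the far field.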

Model Training and Deployment

While we use state-of-the-art hardware for training, our performance gains come primarily from model design. Once trained, the model remains lightweight and cost-effective to run.

  • Training was performed on an Azure Standard_NC96ads_A100_v4 node, which provided access to four A100 GPUs, each with 80 GB of memory.

  • The model was trained for approximately 24 hours.

Neural Concept’s Geometric Regressor achieved the best reported performance on the official benchmark for surface pressure, wall shear stress, volumetric velocity and drag prediction.  

Updated Jan 09, 2026
Version 1.0