Formula 1 is one of the most fascinating data-driven sports, so competitive that even a tenth of a second of advantage can change the outcome of a race. F1 teams strive to find that advantage by using best-in-class analytics tools & ML platforms capable of analysing thousands of data points per second.
Every F1 car contains 300 sensors, which generate 1.1 million telemetry data points per second that are transmitted from the car to the pits. With 20 cars on the circuit at any given time, a race weekend generates roughly 160 terabytes of data.
This telemetry data is crucial for teams, particularly in pre-season testing, practice & qualifying, for 3 main reasons:
Visualize the feedback: It gives the race engineers a graphical representation of what drivers describe in their feedback. For example, a driver might complain about understeer or oversteer, which might be related to the tyre wear observed by the race engineer.
Compare performance: It allows drivers to compare their performances against each other to understand what they are doing better or worse than their peers.
Reliability check: It helps teams monitor the car, check that it is running smoothly, and take the most appropriate decisions based on the data collected. For example, if the team notices the brake discs overheating, they can inspect the ducts at the next pit stop to check whether a piece of debris is stuck in them. If there is a sudden drop in water pressure, the engineers will tell the driver to shut down the engine to avoid damaging it beyond repair.
In a nutshell, telemetry data is the best way for teams to understand exactly how their cars and drivers are performing, and to analyse past races for continuous improvement. The catch is that they cannot see how other teams are performing in comparison, as that data is highly confidential and would be nothing less than gold dust if they could get hold of it.
But hey, wouldn’t it be amazing if we could step into the shoes of an F1 team manager or race engineer and experience all of this first hand? If your answer is yes, then please continue reading: the rest of this blog describes how to select & set up a technology stack to monitor Formula 1 telemetry data, all the way from ingestion to an engineer’s dashboard.
I have broken down the flow into 4 logical steps. The following diagram shows an end-to-end view.
1. Accessing the telemetry data
While we do not have access to actual F1 cars or their data, we have access to the next best thing: gaming devices (Xbox, PS5, PC) that can emit a car’s telemetry data points (speed, rpm, engine temperature, damage, etc.) in real time during live gameplay.
The F1 game by Codemasters can publish all the available telemetry data during a race via UDP (out of the Microsoft Xbox, for example). The specification of all packets is freely available online, so it is possible to decode them. The game provides a continuous UDP stream of roughly 20k to 150k telemetry data points per second, depending on your selected configuration.
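To make the UDP part concrete, here is a minimal sketch of a listener that collects raw datagrams from the game. It is not the code from my repository. The function names, the port 20777 (use whatever port you configured in the game's telemetry settings) and the 2048-byte buffer size are assumptions made for illustration.

```python
import socket
from typing import List

# Assumption: the port configured in the game's UDP telemetry settings.
DEFAULT_PORT = 20777
# Assumption: a buffer comfortably larger than any single telemetry packet.
MAX_PACKET_SIZE = 2048


def open_listener(host: str = "0.0.0.0", port: int = DEFAULT_PORT) -> socket.socket:
    """Bind a UDP socket that the console or PC can send telemetry to."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def read_packets(sock: socket.socket, count: int, timeout: float = 5.0) -> List[bytes]:
    """Read up to `count` datagrams; stop early if nothing arrives within `timeout` seconds."""
    sock.settimeout(timeout)
    packets: List[bytes] = []
    for _ in range(count):
        try:
            data, _sender = sock.recvfrom(MAX_PACKET_SIZE)
        except socket.timeout:
            break
        packets.append(data)
    return packets
```

A long-running service would loop over `recvfrom` indefinitely instead of stopping after `count` packets, and would hand each datagram to the decoder described in the next step.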
2. Parsing the telemetry data
The next part is to decode the UDP stream, which I have done in Python in this f1_telemetry code. It contains the logic for decoding the raw bytes of each UDP packet & exposes all the information about drivers, car telemetry, car status, lap status, session and so on.
The Python app runs locally on the same network as the console that emits the telemetry from the F1 game. You can run it on a lightweight Raspberry Pi or even on your laptop.
Note: I have used the F1 2019 specification, but the F1 games are backward compatible, so it works seamlessly with F1 2020 & 2021 as long as you select 2019 as the telemetry format.
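As an illustration of what decoding the raw bytes involves, the sketch below unpacks only the packet header with Python's struct module. The field layout (little-endian, packed, 23 bytes) reflects my reading of the published F1 2019 specification and should be checked against it. The names PacketHeader and parse_header are made up for this example and are not taken from the f1_telemetry code.

```python
import struct
from typing import NamedTuple

# Assumed F1 2019 header layout: uint16, 4 x uint8, uint64, float, uint32, uint8.
# "<" means little-endian with no padding between fields.
HEADER_FORMAT = "<HBBBBQfIB"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 23 bytes for this layout


class PacketHeader(NamedTuple):
    packet_format: int       # telemetry format year, e.g. 2019
    game_major_version: int
    game_minor_version: int
    packet_version: int
    packet_id: int           # identifies the packet type (session, lap data, car telemetry, ...)
    session_uid: int
    session_time: float      # session timestamp in seconds
    frame_identifier: int
    player_car_index: int    # index of the player's car in the per-car arrays


def parse_header(packet: bytes) -> PacketHeader:
    """Decode the fixed-size header found at the start of every telemetry packet."""
    if len(packet) < HEADER_SIZE:
        raise ValueError(f"packet too short: {len(packet)} bytes, need {HEADER_SIZE}")
    return PacketHeader(*struct.unpack_from(HEADER_FORMAT, packet))
```

Once the header is known, packet_id tells the decoder which body layout to apply to the remaining bytes, and player_car_index picks out the player's entry from the per-car arrays.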
3. Ingesting the telemetry data
At this point, we need to select a data store for persisting the data & performing further analysis.
Let us start by listing out all our requirements -
Continuous ingestion of the car’s telemetry data with minimal latency (ideally less than a second)
High ingestion throughput (thousands of data points per second)
Ability to correlate different stats viz. car speed, brake, temperature, pressure against time (microsecond granularity)
Ability to perform real-time monitoring & analysis on a large volume of incoming data
Native support for creation, manipulation & analysis of multiple time series for historic analysis (driver performance comparison)
Native integration with popular dashboarding tools
Flexibility to change schema (add, remove columns with introduction of new data points)
Based on the above requirements, Azure Data Explorer (ADX) is an ideal choice. ADX is a fast, fully managed data analytics service for real-time analysis of large volumes of data streaming from applications, websites, IoT devices, and more.
ADX supports streaming ingestion, providing low latency between data ingestion and query while supporting high ingestion volumes. ADX is also well suited to time series analysis, helping you quickly identify patterns, anomalies, and trends in your data. Most of the analysis & monitoring needs of our F1 use case call for a time series data store.
To achieve minimal ingestion latency, we will stream the data directly to the ADX cluster via the client libraries. I have created an ingest microservice that uses the Python SDK to stream the data to ADX.
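One practical detail when streaming thousands of data points per second is to group rows into small batches instead of making one ingestion call per row. The sketch below shows that idea only and is not the ingest microservice itself. TelemetryBatcher and its send hook are hypothetical names; in a real service, send would wrap the streaming ingestion call of the ADX Python SDK, which is deliberately left out here. The batch size and maximum age are illustrative values.

```python
import io
import json
import time
from typing import Callable, Dict, List, Optional


class TelemetryBatcher:
    """Buffers telemetry rows and flushes them as one newline-delimited JSON payload."""

    def __init__(
        self,
        send: Callable[[io.BytesIO], None],  # hypothetical hook that performs the actual ingestion
        max_rows: int = 500,                 # illustrative: flush after this many rows
        max_age_s: float = 0.5,              # illustrative: or once the oldest row is this old
    ) -> None:
        self._send = send
        self._max_rows = max_rows
        self._max_age_s = max_age_s
        self._rows: List[Dict] = []
        self._oldest: Optional[float] = None

    def add(self, row: Dict) -> None:
        """Queue one row; flush when the batch is full or has waited too long."""
        now = time.monotonic()
        if not self._rows:
            self._oldest = now
        self._rows.append(row)
        if len(self._rows) >= self._max_rows or now - self._oldest >= self._max_age_s:
            self.flush()

    def flush(self) -> None:
        """Serialise the buffered rows and hand them to the send hook."""
        if not self._rows:
            return
        payload = "\n".join(json.dumps(row) for row in self._rows).encode("utf-8")
        # The buffer is cleared before sending, so a failing send drops this batch.
        self._rows = []
        self._send(io.BytesIO(payload))
```

Keeping max_age_s well below a second preserves the sub-second latency target from the requirements list, while max_rows caps the payload size during busy moments. Note that the age check only runs when a new row arrives, so a service should also call flush() on shutdown.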
4. Monitoring the telemetry data
The last part is the easiest. At this point, all the telemetry data is available in ADX for querying, and we need to display it in the format that makes the most sense for a race engineer. We have multiple dashboarding options to choose from: ADX provides built-in web dashboards, and most popular dashboarding tools viz. Power BI, Grafana, Kibana support ADX as a data source.
I decided to go with Grafana for this use case because:
Grafana provides built-in dashboard components viz. speed gauge & bar gauge, which are closely aligned with how a race engineer would visualize car stats.
Grafana lets you configure the refresh rate to less than a second, giving a near-real-time experience.
Grafana is deployed on Azure (available via Azure Marketplace) in the same region as ADX for minimal latency. After installing the ADX data source plugin, it is simple to connect to your ADX cluster via a service principal. The dashboard can be uploaded from here.
The dashboard is primarily built to showcase the telemetry of a single driver, but you can easily add other drivers to the mix to compare performances (all drivers’ data is available in ADX).
The dashboard has multiple panels for showing various categories of telemetry data –
Car panel: This section shows the correlation between the car’s data points viz. gears & clutch, throttle & brake, speed & steering, engine temperature. The extent of throttle & brake application can be used to understand the driver’s braking patterns, while correlating speed & steering movement helps to understand how the driver handles corners & turns.
The dashboard provides important insights to the telemetry engineer to understand where the car can be improved, and at which point of the circuit the driver is losing time.
Here is a simple Kusto Query Language (KQL) query to visualise the Speed time series you see in the above chart.
| where $__timeFilter(Timestamp) and VehicleIndex == PlayerCarIndex
| order by Timestamp , SessionTime
| project Speed , Timestamp
$__timeFilter() restricts the query to the past X seconds, minutes or hours of data, based on the time range the user selects in Grafana.
The speed & engine gauges add a fun representational element, visualizing the data as it would appear on a driver’s dashboard. I say representational because they might not update as fast as your in-game speedometer, owing to ingestion & network latency.
Driver panel: This section highlights the driver details, lap time, position, distance etc.
Tyre panel: This section shows tyre stats viz. tyre pressure, tyre temperature, tyre wear & damage. If a tyre is losing pressure or taking too much damage, the team can decide to pit.
Similarly, monitoring engine, gearbox and front/rear wing damage would provide additional insights.
Keeping an eye on the total fuel in tank is vital to ensure a successful finish.
Session panel: This is where you can visualize total time left, active cars on track & air/track temperature.
Race panel: Displays live driver standings with current lap time, lap number, grid position etc.
We managed to set up an end-to-end telemetry analytics pipeline for F1, all the way from ingestion into ADX to visualizing the data. You can find the source code & setup instructions here: https://github.com/anshulsharmas/F1_ADX
You can add additional features to enhance it further. Some ideas below –
Add predictive analytics capabilities viz. when to pit, predictions on race outcomes etc.
Add a historical analysis option to replay or analyse an entire race.