Applying Process Mining to Formula 1

If you are looking for a non-traditional application of process mining, you’ve come to the right place! In this article, we dive deep into the 2021 Formula 1 season using Celonis, comparing racing legends Max Verstappen and Lewis Hamilton against each other.

This shows not only the versatility of process mining but also its power. We always say “You have the data; we have the value”, and this article shows that we mean it! Wherever there is data, we can extract insights using process mining.


06 Jul 2022

At Apolix, we think beyond the usual business processes now and then. Not only does this widen our view of which processes can be analyzed, but it is also an interesting thought exercise that forces us to look at things from another perspective. This article will therefore focus on Formula 1: what does a Formula 1 season look like through the eyes of Celonis?

Process Mining Is Versatile

Process mining can be applied to almost anything, including Formula 1. We always tell our customers: “You have the data; we have the value”. And in fact, it’s as simple as that: where there’s data, we can extract insights through process mining. Process mining is all about collecting so-called digital breadcrumbs, which record when certain activities occur within a process, and those breadcrumbs can be collected for any process.

The Data

To analyze a Formula 1 season with process mining, we leveraged the fantastic capabilities of the Ergast API. The Ergast API has been a reliable source of Formula 1 data for many years and captures all the results of any Formula 1 season in history. The dataset we used consists of driver data (name, nationality, number), qualification data (year, circuit, Q1-Q3 times, qualification position), and race result data (year, circuit, finishing positions, session times).

To visualize the entire course of a Formula 1 season, we defined a case (= a single run through a process) as a single driver’s season. So, one case could, for example, be Fernando Alonso in 2016.
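To make the case definition concrete, here is a minimal sketch of how race results could be turned into an event log in which one case is one driver's season. The field names (`driver`, `year`, `round`, `position`, `finished`), the activity labels, and the sample rows are all assumptions made up for illustration; they are neither the Ergast API's schema nor the exact activities used in our Celonis model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    case_id: str    # one driver's season, e.g. "driver_a-2021"
    activity: str   # what happened, e.g. "Race won" or "DNF"
    sequence: int   # order within the case: 0 for the season start, else the race round


def classify_result(position: Optional[int], finished: bool) -> str:
    """Map one race result to an activity name (labels are illustrative)."""
    if not finished:
        return "DNF"
    if position == 1:
        return "Race won"
    if position <= 3:
        return "Podium finish"
    if position <= 10:
        return "Top 10 finish"
    return "Finish outside top 10"


def build_event_log(results):
    """Turn race-result rows into events, adding one 'Season start' per case."""
    events = []
    seen_cases = set()
    ordered = sorted(results, key=lambda r: (r["driver"], r["year"], r["round"]))
    for r in ordered:
        case_id = f'{r["driver"]}-{r["year"]}'
        if case_id not in seen_cases:
            seen_cases.add(case_id)
            events.append(Event(case_id, "Season start", 0))
        activity = classify_result(r["position"], r["finished"])
        events.append(Event(case_id, activity, r["round"]))
    return events


# Made-up sample rows, not real results.
sample_results = [
    {"driver": "driver_a", "year": 2021, "round": 1, "position": 1, "finished": True},
    {"driver": "driver_a", "year": 2021, "round": 2, "position": None, "finished": False},
    {"driver": "driver_a", "year": 2021, "round": 3, "position": 4, "finished": True},
    {"driver": "driver_b", "year": 2021, "round": 1, "position": 12, "finished": True},
]

event_log = build_event_log(sample_results)
for event in event_log:
    print(event)
```

Using the driver and year together as the case identifier is what lets the same driver appear as separate cases in different seasons.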


In 2021, Formula 1 returned with a full calendar in the middle of the pandemic. It turned out to be a thrilling season with tense rivalries, huge controversies and an amazing season finale. But before we dive deep into any of these specific occurrences, let’s look at the entire season through the eyes of Celonis process mining. The entire 2021 season looks like this:

All images in this article have been created in Celonis by Apolix using the Ergast API

But what do we see in this process visualization? First, the numbers shown next to the process activities are the case counts. In other words: the number of cases (drivers) that went through the activity at least once.

This season, 21 drivers (Kubica participated as a reserve driver) took part in races. Hence, the “Season start” activity appears 21 times. Furthermore, 19 drivers suffered a DNF, only seven drivers managed to qualify in the top 3, and six different drivers won a race.
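The case count can be illustrated with a short sketch: for each activity, count the distinct cases that contain it, so a driver with several DNFs still counts only once. The event log below is a made-up list of `(case_id, activity)` pairs, and `case_counts` is a hypothetical helper, not a Celonis function.

```python
from collections import defaultdict


def case_counts(event_log):
    """Return, per activity, the number of distinct cases that went through it."""
    cases_per_activity = defaultdict(set)
    for case_id, activity in event_log:
        cases_per_activity[activity].add(case_id)
    return {activity: len(cases) for activity, cases in cases_per_activity.items()}


# Made-up events, not real results.
sample_log = [
    ("driver_a-2021", "Season start"),
    ("driver_a-2021", "DNF"),
    ("driver_a-2021", "DNF"),
    ("driver_b-2021", "Season start"),
    ("driver_b-2021", "Race won"),
    ("driver_c-2021", "Season start"),
    ("driver_c-2021", "DNF"),
]

counts = case_counts(sample_log)
print(counts)
```

In this sample, "DNF" gets a case count of 2 even though three DNF events occur, because two of them belong to the same case.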


However, the power of process mining lies not just with collecting statistics like these. The true power of process mining comes with diving deeper into the data. For example, what did the path of the 2021 Formula 1 World Champion, Max Verstappen, look like, and how did this compare to his main rival, Lewis Hamilton?

Max Verstappen (left) and Lewis Hamilton (right)

On the left, we see Max Verstappen’s season visualization; on the right, Lewis Hamilton’s. The first thing to look at is the number of races won: Verstappen took ten wins and Hamilton eight, while both had a similar number of pole positions (8). Verstappen had more DNFs than Hamilton (3 vs 1), but in the races he did finish, he ended up outside the top 3 only once. Hamilton, on the other hand, finished outside the top 3 four times (once outside the top 10, and three times in the top 10 but off the podium).


Moving away from this championship battle, we can do many other fun things with process mining. For example, we can analyze something we often try to reduce in business processes: rework. Rework is the (unnecessary) repetition of certain activities within a process. Doing the same job multiple times usually results in a loss of efficiency.

With Formula 1, we can also run a rework analysis. One of the most undesired activities in Formula 1 is a DNF (Did Not Finish). A DNF can be caused by crashes, engine failures, mechanical failures, et cetera, but whatever the cause, it is something every team wants to avoid. So, if we analyze, per case, how much ‘rework’ there was in terms of DNFs, we see the following distribution:

The most common number of DNFs per case is three (5 cases). But, as you can see, there are also 2 cases in which DNFs happened six times. So let’s dive into the data: who were those drivers, and why did the DNFs happen?
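A rework distribution like this boils down to two counting steps: first count how often the activity occurs in each case, then count how many cases share each repetition count. The sketch below assumes a made-up event log of `(case_id, activity)` pairs; `rework_distribution` is a hypothetical helper name and the data is not the real 2021 season.

```python
from collections import Counter


def rework_distribution(event_log, activity="DNF"):
    """Map 'number of repetitions of the activity' to 'number of cases'."""
    # Step 1: how many times does the activity occur in each case?
    repetitions_per_case = Counter(
        case_id for case_id, act in event_log if act == activity
    )
    # Step 2: how many cases share each repetition count?
    return dict(Counter(repetitions_per_case.values()))


# Made-up events, not real results.
sample_log = [
    ("driver_a-2021", "DNF"),
    ("driver_a-2021", "DNF"),
    ("driver_a-2021", "DNF"),
    ("driver_b-2021", "DNF"),
    ("driver_b-2021", "Race won"),
    ("driver_b-2021", "DNF"),
    ("driver_b-2021", "DNF"),
    ("driver_c-2021", "DNF"),
    ("driver_d-2021", "Race won"),
]

distribution = rework_distribution(sample_log)
print(distribution)
```

Cases in which the activity never occurs (here `driver_d-2021`) do not show up in the distribution, since there is no repetition to count for them.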

The two drivers with the most DNFs were Mazepin and Russell. If we break down their DNF reasons, we get the following insights:

For both drivers, 33% of their DNFs were due to collisions. Another 17% of Mazepin’s DNFs were caused by an accident, meaning that half of his retirements came from accidents or collisions. Russell, meanwhile, had an unreliable gearbox: it caused 50% of his DNFs.
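The percentages follow from simple shares: with six DNFs, two collisions make up 2/6 ≈ 33% and one accident 1/6 ≈ 17%. A small sketch of that arithmetic is shown below. The reason list is an assumption reconstructed from those percentages, with "Other" as a placeholder for the causes not named above; it is not the actual retirement data.

```python
from collections import Counter


def reason_shares(reasons):
    """Return each DNF reason's share of the total, as a rounded percentage."""
    counts = Counter(reasons)
    total = sum(counts.values())
    return {reason: round(100 * n / total) for reason, n in counts.items()}


# Illustrative list for a driver with six DNFs ("Other" is a placeholder).
example_reasons = ["Collision", "Collision", "Accident", "Other", "Other", "Other"]

shares = reason_shares(example_reasons)
print(shares)
```

Adding the "Collision" and "Accident" shares gives the 50% mentioned above for accidents and collisions combined.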


Process mining allows us to look at existing processes from a new perspective. As we have just shown: if there’s data, we have value. Not only can we map out a process by collecting digital breadcrumbs, but we can also dive deep into specific cases. For this article, we looked at the 2021 Formula 1 World Championship rivalry between Verstappen and Hamilton, and we performed a rework analysis on one of the most undesirable activities for any driver: DNFs.

If process mining can be applied to Formula 1 based on (relatively scarce) public data, imagine what it can do for you and your business processes. If you’re interested in a demo, please reach out to us.