2011 Silverstone Grand Prix: A visual summary

By: Srisai Sivakumar

F1 and Data:

F1 is a fast-paced, technologically advanced sport and industry. The cars are developed at breakneck speed using every available technology, be it advanced aerodynamics and CFD (my other passion), carbon-composite bodies or advanced control systems. With numerous Grands Prix in a season (19 in 2015), there is an ever-increasing demand to improve the car’s performance by any means possible.

A recent addition to the F1 team’s toolbox is data. Vast amounts of telemetry are streamed in real time from the cars to the pits and back to ‘mission control’ during practice, qualifying and, of course, the race. Such data play a key role in developing race strategies.

It would be interesting to develop a model to predict the winner of a particular race. That would require significant insight into the sport, the circuits and much else. As a newcomer to F1, I take my first steps towards that goal with a simpler analysis.

The focus of this study is a retrospective statistical and graphical analysis of the 2011 British Grand Prix, held at the Silverstone Circuit near the village of Silverstone in Northamptonshire, England.

 

The data

The most important aspect of any data analysis study like this is the quality of the data: understanding its uncertainties and error margins, and knowing how uncertain the measurements were when the data was collected.

Most of my prior work has used data from ‘trusted’ sources, such as the UCI Machine Learning Repository, or data that I have been personally involved with.

The data used in this study was obtained from the F1 Data Junkie website. Its author credits an external site that is a well-known source of F1 data. For the rest of the study, we therefore assume the data is of good quality and take it at face value.

 

Approach to the study

The first step is to convert or transform the data into a form that is convenient for subsequent statistical analysis and visualization. This process is called data munging or data wrangling. Of course, getting the data into the ‘correct’ format requires some intuition about the kind of analysis one expects to do with it. Having done a reasonable number of such studies, I have some intuition about what needs to be done.

Once the data is in the correct format, we dive into the heart of the study. We start by exploring the practice runs, producing statistical and visual summaries of each. We then do the same for the qualifying round.

Once we reach the race, we go a bit further and explore it in more detail. For brevity, not all the findings are included in this discussion.
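
To give a flavour of what such munging can look like, here is a minimal Python sketch. It assumes (this is my assumption, not something the source data guarantees) that raw lap times arrive as text in minutes:seconds form, such as '1:51.710'. The rows and the name lap_time_to_seconds are purely illustrative.

```python
def lap_time_to_seconds(text: str) -> float:
    """Convert a lap time such as '1:51.710' (or plain '51.710') to seconds."""
    minutes, _, seconds = text.strip().rpartition(":")
    return (int(minutes) if minutes else 0) * 60 + float(seconds)


# Hypothetical raw rows: (driver, lap number, lap time as text).
raw_rows = [
    ("DRIVER A", 1, "1:51.710"),
    ("DRIVER A", 2, "1:49.302"),
    ("DRIVER B", 1, "1:52.004"),
]

# Long ("tidy") layout: one record per driver and lap, with the time in seconds.
tidy = [
    {"driver": driver, "lap": lap, "seconds": lap_time_to_seconds(text)}
    for driver, lap, text in raw_rows
]

for record in tidy:
    print(record)
```

With one record per driver and lap, grouping by driver, stint or session later on becomes a simple filter or sort.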

Analysis

Practice rounds

We begin with the first practice round, plotting each driver’s lap times and how they changed over successive laps.

[Figure 1]

 

In the next plot, we break down the previous one, grouping each driver’s laps by stint.

 

[Figure 2]

 

We now look at a statistical summary of each driver’s lap times during practice 1, first as a box plot and then as a table.
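
For readers who want to reproduce this kind of summary, the sketch below computes the numbers a box plot is built from, plus the mean, using only Python's standard library. The lap times are made up for illustration, and the quartile convention (method="inclusive") is my choice; plotting libraries may use a slightly different one.

```python
import statistics


def summarize(lap_times):
    """Return the five-number summary and the mean of a list of lap times."""
    q1, median, q3 = statistics.quantiles(lap_times, n=4, method="inclusive")
    return {
        "min": min(lap_times),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(lap_times),
        "mean": statistics.mean(lap_times),
    }


# Hypothetical lap times (seconds) for one driver in one session.
laps = [112.4, 111.2, 110.9, 111.7, 118.3, 111.0]
print(summarize(laps))
```

The box in a box plot spans q1 to q3 with a line at the median, which is why a single slow lap moves the mean much more than it moves the box.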

[Figure 3]

 

Summary of average lap times (in seconds) for each driver during practice 1:

 

Name Time
M. WEBBER 111.7099
L. HAMILTON 111.7664
M. SCHUMACHER 111.9195
P. DI RESTA 111.9267
S. BUEMI 112.3735
J. ALGUERSUARI 112.5379
N. HULKENBERG 112.843
P. MALDONADO 112.9529
S. VETTEL 113.6637
R. BARRICHELLO 114.3434
K. KOBAYASHI 115.2425
V. PETROV 115.4703
S. PEREZ 115.8768
F. ALONSO 116.109
T. GLOCK 116.3789
N. ROSBERG 116.7239
J. BUTTON 116.8418
F. MASSA 117.6136
D. RICCIARDO 117.8241
J. D’AMBROSIO 118.1781
N. HEIDFELD 118.3045
K. CHANDHOK 118.5977
V. LIUZZI 118.6985
J. TRULLI 119.8219

 

We perform a similar analysis on practice 2 and 3. For brevity, the results are presented in consolidated form for all three practice rounds rather than individually.

After the three practice rounds, it is interesting to see how the average lap times evolved for the various drivers. Let’s begin with that, and then look at some summary statistics for the three rounds.

Table of average lap times (in seconds) after the three practice rounds:

Name Practice-1 Practice-2 Practice-3
A. SUTIL NA 120.053 99.240
D. RICCIARDO 117.824 120.272 103.939
F. ALONSO 116.109 127.082 98.155
F. MASSA 117.614 115.765 99.648
H. KOVALAINEN NA 117.427 100.835
J. ALGUERSUARI 112.538 120.190 101.390
J. BUTTON 116.842 121.723 101.142
J. D’AMBROSIO 118.178 121.400 105.693
J. TRULLI 119.822 122.324 107.408
K. CHANDHOK 118.598 NA NA
K. KOBAYASHI 115.243 118.832 98.237
L. HAMILTON 111.766 115.280 100.031
M. SCHUMACHER 111.920 116.054 100.645
M. WEBBER 111.710 115.029 98.126
N. HEIDFELD 118.305 124.562 101.117
N. HULKENBERG 112.843 NA NA
N. ROSBERG 116.724 120.579 101.438
P. DI RESTA 111.927 115.699 99.208
P. MALDONADO 112.953 128.083 101.797
R. BARRICHELLO 114.343 117.856 101.360
S. BUEMI 112.374 116.875 102.677
S. PEREZ 115.877 121.905 102.082
S. VETTEL 113.664 117.687 97.288
T. GLOCK 116.379 121.789 104.348
V. LIUZZI 118.699 121.918 103.853
V. PETROV 115.470 116.686 103.688

 

[Figure 4]

 

[Figure 5]

From the table and plots, we can clearly see that the mean and median lap times decreased by roughly 10% in practice 3 compared with practice 1 and 2. Interestingly, practice 2 has the highest average lap time for every driver who took part in all three sessions except Massa, whose practice 1 average was slower. It is also interesting to note that Alonso and Maldonado showed the greatest improvement from practice 2 to practice 3.
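
The improvement from practice 2 to practice 3 can be checked directly from the table. The sketch below copies the practice 2 and practice 3 averages for every driver who has both values and ranks the drivers by seconds gained. The names practice_averages and improvement are mine.

```python
# Average lap times in seconds as (practice 2, practice 3), copied from the
# table above. Drivers with an NA in either session are left out.
practice_averages = {
    "A. SUTIL": (120.053, 99.240),
    "D. RICCIARDO": (120.272, 103.939),
    "F. ALONSO": (127.082, 98.155),
    "F. MASSA": (115.765, 99.648),
    "H. KOVALAINEN": (117.427, 100.835),
    "J. ALGUERSUARI": (120.190, 101.390),
    "J. BUTTON": (121.723, 101.142),
    "J. D'AMBROSIO": (121.400, 105.693),
    "J. TRULLI": (122.324, 107.408),
    "K. KOBAYASHI": (118.832, 98.237),
    "L. HAMILTON": (115.280, 100.031),
    "M. SCHUMACHER": (116.054, 100.645),
    "M. WEBBER": (115.029, 98.126),
    "N. HEIDFELD": (124.562, 101.117),
    "N. ROSBERG": (120.579, 101.438),
    "P. DI RESTA": (115.699, 99.208),
    "P. MALDONADO": (128.083, 101.797),
    "R. BARRICHELLO": (117.856, 101.360),
    "S. BUEMI": (116.875, 102.677),
    "S. PEREZ": (121.905, 102.082),
    "S. VETTEL": (117.687, 97.288),
    "T. GLOCK": (121.789, 104.348),
    "V. LIUZZI": (121.918, 103.853),
    "V. PETROV": (116.686, 103.688),
}


def improvement(averages):
    """Return (driver, seconds gained) pairs, largest gain first."""
    gains = {d: round(p2 - p3, 3) for d, (p2, p3) in averages.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


ranked = improvement(practice_averages)
for driver, gain in ranked[:3]:
    print(f"{driver:<15} {gain:7.3f} s faster in practice 3")
```

Alonso (28.927 s) and Maldonado (26.286 s) come out on top, with Heidfeld (23.445 s) a fairly distant third.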

Qualifying round

Explaining the rules of qualifying is beyond the scope of this study; the reader can find them elsewhere with a quick search. We look at the (now familiar) box plot of the lap times of all the drivers.

Let’s also familiarize ourselves with a new term, the elapsed time, by looking at a plot for each driver. The elapsed time at a given lap is the cumulative time taken over all laps up to and including that one.
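
In code, elapsed time is just a running sum of lap times. A minimal sketch with made-up lap times:

```python
from itertools import accumulate

# Hypothetical lap times in seconds for one driver.
lap_times = [100.0, 98.5, 99.0]

# Elapsed time after each lap is the running total of the lap times so far.
elapsed = list(accumulate(lap_times))
print(elapsed)  # [100.0, 198.5, 297.5]
```

Because it only ever increases, elapsed time gives much smoother curves than raw lap times, which makes it easier to compare drivers on one plot.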

 

[Figure 6]

[Figure 7]

It should be quite interesting to compare the results of the qualifying round with those from the practice.

 

[Figure 8]

[Figure 9]

 

The Race

On to the big one. There may be an overdose of plots in this section, so let’s begin with an easy one.

Let’s start by looking at when the drivers chose to make their pit stops.

 

[Figure 10]

 

The plot presents a mixed picture. While most drivers made their first stop around lap 10, the timing of subsequent stops has a much larger spread.

Let’s look at the lap times of the drivers.

 

[Figure 11]

 

This gives an overall sense of the drivers’ and teams’ abilities. Clearly, Alonso, Massa and Webber have more sub-100-second laps than the rest. But this doesn’t necessarily mean they would take the podium.

 

[Figure 12]

 

This reveals a rough overall pattern. There seems to be a relatively ‘flat’ initial stage of the race, lasting around 10 laps, in which lap times are fairly constant for all the drivers. Then comes a period of falling lap times, from about lap 10 to lap 25 or 30, beyond which the lap times remain fairly constant again. One might also notice a few points on each curve that lie well above the typical lap time. These are the pit stops.

Let’s now look at some more esoteric plots. Let’s see which driver set the best lap time on each lap, and how that changed over the course of the 52 laps.
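
Those unusually slow laps can also be picked out programmatically. The sketch below flags any lap that is more than a fixed margin slower than the driver's median lap time. Both the 10-second margin and the lap times are assumptions for illustration, and a spin or a safety-car lap would be flagged just like a pit stop, so this is a rough filter rather than a definitive detector.

```python
import statistics


def flag_slow_laps(lap_times, margin=10.0):
    """Return 1-based lap numbers that are more than `margin` seconds above the median."""
    typical = statistics.median(lap_times)
    return [lap for lap, t in enumerate(lap_times, start=1) if t > typical + margin]


# Hypothetical lap times (seconds) with one stop on lap 4.
example = [104.0, 103.5, 103.8, 125.2, 101.9, 101.2, 100.8]
print(flag_slow_laps(example))  # [4]
```

Using the median rather than the mean keeps the slow laps themselves from dragging the baseline upwards.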

 

[Figure 13]

 

The plot is too crowded to reveal anything meaningful. So let’s look only at the drivers who completed all 52 laps.

 

[Figure 14]

 

Let’s see which drivers set the best lap time most often.

 

Driver Best Lap Times
F. ALONSO 18
S. VETTEL 8
F. MASSA 6
L. HAMILTON 5
M. WEBBER 5
M. SCHUMACHER 3
A. SUTIL 2
N. HEIDFELD 1
D. RICCIARDO 0
H. KOVALAINEN 0
J. ALGUERSUARI 0
J. BUTTON 0
J. D’AMBROSIO 0
J. TRULLI 0
K. KOBAYASHI 0
N. ROSBERG 0
P. DI RESTA 0
P. MALDONADO 0
R. BARRICHELLO 0
S. BUEMI 0
S. PEREZ 0
T. GLOCK 0
V. LIUZZI 0
V. PETROV 0

 

It can be seen that Alonso was clearly the best performer, setting the best lap time on as many as 18 of the 52 laps, followed by Vettel, who set it on 8. This gives us the first clue as to the possible outcome.

Let’s look at a similar plot, but with elapsed time.
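
A tally like the one above takes only a few lines to produce. The sketch below assumes the lap times are held as one list per driver (shorter lists for drivers who retired); the data and the function name count_best_laps are illustrative only. Ties go to whichever driver appears first in the dictionary.

```python
from collections import Counter


def count_best_laps(lap_times_by_driver):
    """Count, for each driver, the laps on which they set the fastest time."""
    n_laps = max(len(times) for times in lap_times_by_driver.values())
    tally = Counter({driver: 0 for driver in lap_times_by_driver})
    for lap in range(n_laps):
        # Only drivers still running on this lap are considered.
        running = {
            driver: times[lap]
            for driver, times in lap_times_by_driver.items()
            if lap < len(times)
        }
        tally[min(running, key=running.get)] += 1
    return tally


# Hypothetical three-lap race; driver C retires after lap 1.
sample = {
    "A": [100.0, 99.0, 98.0],
    "B": [101.0, 98.0, 99.0],
    "C": [102.0],
}
print(count_best_laps(sample).most_common())
```

Starting every driver at zero keeps those who never set a best lap in the table, as in the listing above.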

[Figure 15]

 

This plot is a lot less noisy and thus easier to read. It shows that Vettel dominated the first half of the race. At lap 27, he loses the lead to Alonso. Let’s investigate the possible causes. A look at the race data shows that both of them pitted on the same lap, and Alonso held the lead from then on. Let’s take a look at the pit stop times.

Pit stop times for Vettel = [24.818, 31.558, 23.137]

Pit stop times for Alonso = [26.566, 23.974, 23.474]

The differences in the pit stop times (Alonso minus Vettel) = [1.748, -7.584, 0.337]

The pit stop times are fairly consistent, with one aberration: Vettel’s second pit stop took much longer than his other two. It is possibly this that cost Vettel the lead, which he never recovered.
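
The subtraction can be reproduced in a few lines of Python. The two lists are the pit stop times quoted above; the sign convention (Alonso minus Vettel) is inferred from the quoted differences.

```python
# Pit stop times in seconds, as quoted above.
vettel = [24.818, 31.558, 23.137]
alonso = [26.566, 23.974, 23.474]

# Per-stop difference, Alonso minus Vettel (negative means Alonso was quicker).
differences = [round(a - v, 3) for a, v in zip(alonso, vettel)]
print(differences)  # [1.748, -7.584, 0.337]

# Net time Vettel spent in the pits beyond Alonso over all three stops.
net_loss = round(sum(vettel) - sum(alonso), 3)
print(net_loss)  # 5.499
```

Over the three stops Vettel spent about 5.5 seconds longer in the pits than Alonso, and the second stop alone accounts for more than that.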

Let’s see if we can come to the same conclusion by looking at the calculated time-to-leader metric for the drivers. To make the plot less crowded, let’s consider only the top 10 drivers.
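
One plausible way to calculate such a metric is sketched below. I am assuming that the time to leader on a given lap is a driver's elapsed time minus the smallest elapsed time of any driver on that lap. The lap times and the function name time_to_leader are illustrative.

```python
from itertools import accumulate


def time_to_leader(lap_times_by_driver):
    """Gap in seconds to the race leader after each lap, per driver."""
    elapsed = {d: list(accumulate(t)) for d, t in lap_times_by_driver.items()}
    # Compare only laps that every driver in the input has completed.
    n_laps = min(len(e) for e in elapsed.values())
    leader = [min(e[lap] for e in elapsed.values()) for lap in range(n_laps)]
    return {
        d: [e[lap] - leader[lap] for lap in range(n_laps)]
        for d, e in elapsed.items()
    }


# Hypothetical two-lap example in which the lead changes hands on lap 2.
sample = {"A": [100.0, 100.0], "B": [99.0, 102.0]}
print(time_to_leader(sample))
```

The leader's gap is zero by construction, so a change of lead shows up as one curve rising off zero while another drops onto it.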

 

[Figure 16]

 

This reinforces our hypothesis that it was most likely the extra time taken during Vettel’s second pit stop that cost him the race, given that he dominated the first half.

Result

Alonso wins the race despite not leading for the first half, while Vettel, despite having led the race for more than half the laps, ends up second. This could well be due to the extra time taken during Vettel’s second pit stop on lap 27.

 

[Figure 17]

 

The final result is tabulated below (elapsed time in seconds):

Driver Elapsed Time Rank
F. ALONSO 5321.196 1
S. VETTEL 5337.707 2
M. WEBBER 5338.143 3
L. HAMILTON 5350.182 4
F. MASSA 5350.206 5
N. ROSBERG 5381.861 6
S. PEREZ 5386.786 7
N. HEIDFELD 5396.738 8
M. SCHUMACHER 5399.108 9
J. ALGUERSUARI 5400.304 10
A. SUTIL 5400.908 11
V. PETROV 5401.877 12
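
The ranks, and the gaps to the winner, follow directly from the elapsed times. The sketch below copies the values from the table; the variable names are mine.

```python
# Final elapsed times in seconds, copied from the table above.
final_elapsed = {
    "F. ALONSO": 5321.196,
    "S. VETTEL": 5337.707,
    "M. WEBBER": 5338.143,
    "L. HAMILTON": 5350.182,
    "F. MASSA": 5350.206,
    "N. ROSBERG": 5381.861,
    "S. PEREZ": 5386.786,
    "N. HEIDFELD": 5396.738,
    "M. SCHUMACHER": 5399.108,
    "J. ALGUERSUARI": 5400.304,
    "A. SUTIL": 5400.908,
    "V. PETROV": 5401.877,
}

# Rank drivers by elapsed time and compute each driver's gap to the winner.
ranking = sorted(final_elapsed, key=final_elapsed.get)
winner_time = final_elapsed[ranking[0]]
gaps = {d: round(final_elapsed[d] - winner_time, 3) for d in ranking}

for rank, driver in enumerate(ranking, start=1):
    print(f"{rank:>2}  {driver:<15} +{gaps[driver]:.3f} s")
```

Vettel finishes 16.511 seconds behind Alonso, while Hamilton and Massa are separated by only 0.024 seconds.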

 

Conclusion

Data visualization offers a convenient medium for communicating complex, high-dimensional data in the form of pictures or graphs. Visual representations help us communicate and understand information more easily and quickly. In a fast-paced, data-heavy sport like Formula 1, visualization enables strategists and decision makers to:

– ‘see’ analytical results and find relevance among the millions of variables,

– communicate concepts and hypotheses to non-technical colleagues and upper management,

– even predict the outcomes of a future event to an extent.

With access to massive amounts of data from simulations, practice sessions and races, along with design data, we can build models that predict the outcomes of future events more reliably than the flip of an unbiased coin.

 
