Formula One - Races, Drivers, Countries and Teams. An exploratory analysis.
Formula One is the most prestigious auto-racing competition in the world. Apart from testing the pure racing skills of the driver, what sets F1 apart is that it also tests the automotive engineering skills of each team, as every F1 car is different and constantly evolving. Since 1950, Formula One Grands Prix have attracted millions of spectators and sponsorships/advertisements from the largest global brands.
As an F1 enthusiast, I found this dataset the most exciting to analyse. This exploratory analysis largely covers the following sub-topics:
- Formula One Races throughout history.
- Formula One host nations.
- Formula One Drivers.
- Formula One Teams.
The exploratory analysis seeks to address the following questions:
- How have races per season increased from 1950 to 2017? (Line Plot)
- How many countries have hosted F1 races?
- Which countries have hosted the most F1 races in history? (Bar plot of all the host nations and the number of races hosted)
- How are the final positions of the top two modern drivers (Lewis Hamilton and Sebastian Vettel) distributed? (Histogram)
- How have the season-wide points scored by top F1 drivers varied over the years? This shows when each driver was performing well relative to the other drivers.
- Who are the drivers with most points in F1 history? How do they compare? Which driver tops the list? (Bar-cum-line plot)
- How have the points scored by top constructors (teams) varied over the years? (Heatmap)
- Which nationality has secured most F1 points in history? (Bar-cum-line plot)
- How does the grid position influence the final race position? (Reg-Plot)
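As an illustration of how the first question could be approached, here is a minimal sketch of counting races per season with pandas. It uses a small toy table rather than the real data, and the column names `raceId` and `year` are assumptions about how the races file is laid out.

```python
import pandas as pd

# Toy stand-in for the races table; the real file is not used here.
# Column names "raceId" and "year" are assumed, not confirmed by this notebook.
races = pd.DataFrame({
    "raceId": [1, 2, 3, 4, 5, 6],
    "year": [1950, 1950, 1951, 1951, 1951, 1952],
})

# Count how many races fall in each season, ordered by year.
races_per_season = races.groupby("year")["raceId"].count().sort_index()
print(races_per_season)
```

The resulting Series is indexed by year, so it can be passed directly to a line plot.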
The following libraries/modules were used in the analysis:
i. NumPy
ii. Pandas
iii. Matplotlib.pyplot
iv. Matplotlib
v. Seaborn
In total, 5 questions have been answered and 8 charts have been plotted.
The charts plotted include bar graphs, a histogram, line plots, a heatmap and a regplot.
Finally, I would like to thank Jovian and Aakash for curating this free course. I must admit that, having no prior experience with Python, I was a bit skeptical about whether I would be able to keep up with the course content. However, this course proved my doubts completely wrong. It is highly recommended for beginners like me who are looking to learn data analytics. I am looking forward to my next course from Jovian.
How to run the code
This is an executable Jupyter notebook hosted on Jovian.ml, a platform for sharing data science projects. You can run and experiment with the code in a couple of ways: using free online resources (recommended) or on your own computer.
Option 1: Running using free online resources (1-click, recommended)
The easiest way to start executing this notebook is to click the "Run" button at the top of this page, and select "Run on Binder". This will run the notebook on mybinder.org, a free online service for running Jupyter notebooks. You can also select "Run on Colab" or "Run on Kaggle".
Option 2: Running on your computer locally
- Install Conda by following these instructions. Add Conda binaries to your system PATH, so you can use the conda command on your terminal.
- Create a Conda environment and install the required libraries by running these commands on the terminal:
conda create -n zerotopandas -y python=3.8
conda activate zerotopandas
pip install jovian jupyter numpy pandas matplotlib seaborn opendatasets --upgrade
- Press the "Clone" button above to copy the command for downloading the notebook, and run it on the terminal. This will create a new directory and download the notebook. The command will look something like this:
jovian clone notebook-owner/notebook-id
- Enter the newly created directory using cd directory-name and start the Jupyter notebook.
jupyter notebook
You can now access Jupyter's web interface by clicking the link that shows up on the terminal or by visiting http://localhost:8888 on your browser. Click on the notebook file (it has a .ipynb extension) to open it.
Downloading the Dataset
I downloaded the formula-1-race-data-1950201 dataset from the recommended datasets offered in the course. Most CSV files could be imported from the newly created directory, but some files, such as drivers.csv, were not being read from it, so that data was loaded directly from its Kaggle data URL instead.
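The fallback described above could be sketched roughly as follows. This is only an illustration: `load_csv` is a hypothetical helper, not part of any library, and the choice of which errors trigger the fallback is an assumption.

```python
import os

import pandas as pd


def load_csv(local_path, fallback_url):
    """Hypothetical helper: read local_path if possible, else read fallback_url."""
    if os.path.exists(local_path):
        try:
            return pd.read_csv(local_path)
        except (UnicodeDecodeError, pd.errors.ParserError):
            # Assumed failure modes; fall through to the URL below.
            pass
    # pandas.read_csv also accepts a URL in place of a file path.
    return pd.read_csv(fallback_url)
```

The helper would be called with a local path such as the downloaded drivers.csv and the data URL of the same file as the second argument.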
!pip install jovian opendatasets --upgrade --quiet
Let's begin by downloading the data, and listing the files within the dataset.
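A minimal sketch of this step, assuming the `opendatasets` package installed above is used for the download. The URL and directory below are placeholders to be replaced with the dataset's actual Kaggle URL and the folder it creates, and both function names are hypothetical.

```python
import os

# Placeholders, not real values: substitute the dataset's Kaggle URL
# and the directory that the download creates.
dataset_url = "https://www.kaggle.com/<owner>/<dataset-name>"
data_dir = "./<dataset-name>"


def download_dataset(url):
    """Download a Kaggle dataset; prompts for Kaggle credentials when needed."""
    import opendatasets as od  # imported here so listing works without it
    od.download(url)


def list_csv_files(directory):
    """Return the CSV file names found in the given directory, sorted by name."""
    return sorted(name for name in os.listdir(directory) if name.endswith(".csv"))


# Typical use once the placeholders are filled in:
# download_dataset(dataset_url)
# print(list_csv_files(data_dir))
```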