EDA Project on Worldwide Covid Vaccination
Covid Vaccination Data Analysis
Hello and welcome to my project notebook!
The COVID-19 outbreak has shaken the global health system and economy to their roots. The epidemic continues to spread and shows no sign of slowing down. Vaccination could be the only effective and economical means to control or stop this pandemic. Many research institutions and pharmaceutical companies worldwide are currently involved in developing a suitable coronavirus vaccine. Work on a coronavirus vaccine began in China as soon as the outbreak erupted, and then spread worldwide once the WHO declared the disease a pandemic. Eventually, each country joined the race to develop a vaccine first, both to safeguard its population and to gain an advantage over other countries. On 2 December 2020, the United Kingdom's Medicines and Healthcare products Regulatory Agency (MHRA) gave temporary regulatory approval for the Pfizer–BioNTech vaccine, making the UK the first country to approve this vaccine.
For this notebook I picked a dataset containing details of the day-wise Covid-19 vaccinations in different countries. So far, around 2223 countries have started vaccinating to protect their people.
I found this dataset on Kaggle. If you want to see the dataset on Kaggle, click here.
Let's first talk a little bit about the dataset:
- The dataset contains more than 71,815 rows and 15 columns (row count as of 24 January 2022).
- The columns in the dataset include country, date, total_vaccinations, people_vaccinated, people_fully_vaccinated, daily_vaccinations_raw, daily_vaccinations, vaccines, source, etc. (We are interested in only a few of them, so I will not discuss the remaining columns.)
- Let's talk about some specific columns:
  - total_vaccinations: This is the absolute number of total immunizations in the country.
  - people_vaccinated: A person, depending on the immunization scheme, will receive one or more (typically 2) vaccines; at a certain moment, the number of vaccinations might be larger than the number of people.
  - people_fully_vaccinated: This is the number of people that received the entire set of immunizations according to the immunization scheme (typically 2); at a certain moment in time, there might be a certain number of people that received one vaccine and another (smaller) number of people that received all vaccines in the scheme.
  - daily_vaccinations_raw: For a certain data entry, the number of vaccinations for that date/country.
  - daily_vaccinations: For a certain data entry, the number of vaccinations for that date/country.
Note: For details on the rest of the columns, please visit the dataset page linked above.
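To make the relationship between the three cumulative columns concrete, here is a minimal sketch on a tiny hand-made table. The rows and values below are invented for illustration and are not taken from the Kaggle dataset; only the column names come from the description above.

```python
import pandas as pd

# Hypothetical sample rows; the values are made up for illustration only.
sample = pd.DataFrame(
    {
        "country": ["A", "A", "B"],
        "date": ["2021-01-01", "2021-01-02", "2021-01-01"],
        "total_vaccinations": [100, 250, 40],
        "people_vaccinated": [90, 200, 40],
        "people_fully_vaccinated": [10, 50, 0],
    }
)
sample["date"] = pd.to_datetime(sample["date"])

# Doses given can exceed people vaccinated, since one person may get several doses.
doses_exceed_people = sample["total_vaccinations"] >= sample["people_vaccinated"]

# Fully vaccinated people are a subset of people with at least one dose.
fully_within_vaccinated = (
    sample["people_fully_vaccinated"] <= sample["people_vaccinated"]
)

print(sample.shape)             # (rows, columns) of the toy table
print(sample.columns.tolist())  # column names
print(doses_exceed_people.all(), fully_within_vaccinated.all())
```

On the real data these two comparisons may not hold for every row (for example where values are missing), so they are better treated as sanity checks than as guarantees.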
Outline:
We will finish our project in four steps as follows:
1 Data Downloading
We will install all the required libraries and download the dataset.
2 Data Preparation & Cleaning
We will check whether the dataset is clean: whether it has duplicate entries, missing values, or any other misleading data that could lead us to bad results.
3 Visualization
We will start analyzing the dataset with some visualizations of different columns, try to find relationships between columns, and draw inferences.
4 Q & A
We will try to answer some interesting questions based on the data available and what a person can ask in general.
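As a preview of the kind of checks planned for step 2, the sketch below counts duplicate rows and missing values with pandas. The helper name `quality_report` and the toy table are hypothetical, introduced only for this illustration; the actual cleaning steps in the notebook may differ.

```python
import pandas as pd


def quality_report(df: pd.DataFrame) -> dict:
    """Summarize duplicate rows and missing values (hypothetical helper)."""
    return {
        # Rows that exactly repeat an earlier row.
        "duplicate_rows": int(df.duplicated().sum()),
        # Number of missing (NaN/None) entries in each column.
        "missing_per_column": df.isna().sum().to_dict(),
    }


# Toy data with one repeated row and one missing value (made-up numbers).
toy = pd.DataFrame(
    {
        "country": ["A", "A", "B", "B"],
        "date": ["2021-01-01", "2021-01-01", "2021-01-01", "2021-01-02"],
        "daily_vaccinations": [10.0, 10.0, None, 5.0],
    }
)

report = quality_report(toy)
print(report)
```

If duplicates turn up, `df.drop_duplicates()` is one way to remove them; how to treat missing values depends on the column and on the question being asked.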
Step 1: Data Downloading
Let's start by installing and importing the libraries and modules that we are going to use throughout this project notebook.
- We will install the NumPy library for mathematical computations,
- We will install the Pandas library, as we will do our whole analysis of the dataset using pandas dataframes,
- We will install the jovian library to keep a copy of our notebook on the Jovian platform,
- We will install the opendatasets library for downloading the dataset from Kaggle,
- We will install Plotly to create some visualizations.
And finally we will import some useful modules from these libraries.
!pip install pandas==1.1.5 --quiet
#Installing pandas library