Project 2


Air Quality Data Analysis - India

Air pollution is the presence of substances in the atmosphere that are harmful to the health of humans and other living beings, or that cause damage to the climate or to materials. There are many different types of air pollutants, such as gases (including ammonia, carbon monoxide, sulfur dioxide, nitrous oxides, methane, carbon dioxide and chlorofluorocarbons), particulates (both organic and inorganic), and biological molecules. Air pollution may cause diseases, allergies and even death in humans; it may also harm other living organisms such as animals and food crops, and may damage the natural environment (for example, climate change, ozone depletion or habitat degradation) or the built environment (for example, acid rain). Both human activity and natural processes can generate air pollution.

Of the 30 most polluted cities in the world, 21 were in India in 2019. As per a study based on 2016 data, at least 140 million people in India breathe air that is 10 times or more over the WHO safe limit, and 13 of the world's 20 cities with the highest annual levels of air pollution are in India. 51% of the pollution is caused by industry, 27% by vehicles, 17% by crop burning and 5% by other sources. Air pollution contributes to the premature deaths of 2 million Indians every year. Emissions come from vehicles and industry, whereas in rural areas, much of the pollution stems from biomass burning for cooking and keeping warm. In autumn and spring months, large-scale crop residue burning in agricultural fields (a cheaper alternative to mechanical tilling) is a major source of smoke, smog and particulate pollution. India has low per capita emissions of greenhouse gases, but the country as a whole is the third largest greenhouse gas producer after China and the United States. A 2013 study on non-smokers found that Indians have 30% weaker lung function than Europeans.

About the Datasets

The dataset comes from a dataset bundle on Kaggle Datasets. Refer to the Air Quality India (2015-2020) Kaggle Dataset page for more information about the dataset and to download it from Kaggle.

With this dataset I am trying to visualize different trends in air pollution from 2015 to 2020, the period the dataset covers, because air pollution in India is a serious health issue.

The datasets used for this project are city_day.csv and city_hour.csv. The city_day.csv file has 29531 rows, each row containing data about a specific city.
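A quick way to confirm a row count like the one above is to load the file with pandas and inspect its shape. The following is only a minimal sketch: the helper name `load_daily`, the column names (`City`, `Date`, `CO`, `O3`) and the sample values are assumptions made for illustration, not rows from the real city_day.csv. In practice the in-memory sample would be replaced by the path to the downloaded file.

```python
import io

import pandas as pd

# Illustrative stand-in for city_day.csv. The column names and values
# are assumptions for this sketch, not rows from the real file.
SAMPLE_CSV = """City,Date,CO,O3
CityA,2015-01-01,0.92,18.2
CityA,2015-01-02,,20.1
CityB,2015-01-01,1.10,
"""


def load_daily(source):
    """Read a daily air quality CSV and parse the Date column as dates."""
    return pd.read_csv(source, parse_dates=["Date"])


# Pass a file path such as "city_day.csv" instead of the in-memory sample.
daily_df = load_daily(io.StringIO(SAMPLE_CSV))

# Number of rows and columns in the dataframe.
n_rows, n_cols = daily_df.shape

# Count of missing readings in each column.
missing_per_column = daily_df.isna().sum()

print(n_rows, n_cols)
print(missing_per_column)
```

Checking the shape and the missing-value counts right after loading gives an early picture of how much cleaning the data will need.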

I will be using Python 3 for this analysis, and I am doing this project in Jupyter Notebook (Kaggle and Google Colab are also good options for running this notebook and working with it). The libraries/packages I will be using in this project are as follows.

  • jovian (to upload, save and share the contents of my notebook)
  • numpy (imported as np; one of the most widely used packages for working with arrays in Python)
  • pandas (widely used for data analysis and for building dataframes)
  • matplotlib (a visualization library that makes the analysis fun and interactive)
  • seaborn (adds more color and styling on top of matplotlib visualizations)
  • collections (specialized container datatypes providing alternatives to Python's general-purpose built-in containers such as dict and list)
  • opendatasets (a library for fetching data from Kaggle or from its own collection)

If you want to run this notebook on your own machine, the steps to do so are given at the end of the project.
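As a small illustration of what the collections package in the list above offers, a `Counter` can tally how many rows each city contributes. This is only a sketch: the city labels are hypothetical placeholders, not values taken from the dataset.

```python
from collections import Counter

# Hypothetical city labels, one per row, standing in for a city column.
cities = ["CityA", "CityB", "CityA", "CityC", "CityA", "CityB"]

# Counter maps each distinct label to the number of times it appears.
rows_per_city = Counter(cities)

# The two most frequent labels, as (label, count) pairs.
top_two = rows_per_city.most_common(2)
print(top_two)
```

A plain dict with manual increments would work too; `Counter` just removes the bookkeeping and adds helpers such as `most_common`.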

Project Outline

  • Importing Packages

  • Fetching the dataset

  • Data Preparation and Cleaning

    • Data Preparation in the city_day dataframe
    • Data Preparation for city_hour dataframe
    • Creating Functions for further usage
  • Exploratory Analysis and Visualization

    • CO
    • O3
    • NOx
    • Benzene
  • Asking and Answering Questions

  • Inferences and Conclusion

  • Reference and Future Works

Ruwini Shashikala, 6 months ago