Spotify Database Daily Charts Over 3 Years



In this project, we analyze the Spotify Huge Database Daily Charts Over 3 Years dataset from Kaggle. The dataset contains every song that appeared in Spotify's Daily Top 200 charts in 35 countries plus the global chart over a period of more than 3 years (2017-2020). The data set can be viewed using this link:

Exploratory Data Analysis (EDA)


Exploratory Data Analysis (EDA) is the process of examining a dataset to understand it and to extract its main characteristics and insights. EDA methods are generally classified into two groups: graphical analysis and non-graphical analysis.
EDA is essential because it is good practice to understand the problem statement and the relationships between the data features before diving into deeper analysis.
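
To make the non-graphical side of EDA concrete, here is a minimal sketch using pandas. The table, its column names (`title`, `country`, `popularity`), and its values are invented for illustration and are not taken from the Spotify dataset.

```python
import pandas as pd

# Hypothetical chart rows; column names and values are invented for illustration.
charts = pd.DataFrame(
    {
        "title": ["Song A", "Song B", "Song C", "Song D"],
        "country": ["Global", "Global", "USA", "USA"],
        "popularity": [10.0, 20.0, 30.0, 40.0],
    }
)

# Non-graphical analysis: summary statistics for a numeric column
# and frequency counts for a categorical column.
popularity_summary = charts["popularity"].describe()
rows_per_country = charts["country"].value_counts()

print(popularity_summary)
print(rows_per_country)
```

The graphical counterpart would plot the same columns, for example as a histogram of `popularity`, provided a plotting library such as matplotlib is installed.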



Kaggle, a subsidiary of Google LLC, is an online community of data scientists and machine learning practitioners. Kaggle allows users to find and publish data sets, explore and build models in a web-based data-science environment, work with other data scientists and machine learning engineers, and enter competitions to solve data science challenges.

Here is an outline of the steps we'll follow:

  • Downloading the dataset from an online source (Kaggle)
  • Data preparation and cleaning
  • Exploratory analysis and visualization
  • Asking and answering questions
  • Summary
  • Inferences and conclusion
  • References

"Final Database" includes many data for each song. It aggregates the populairty for songs into a single score for each. For each song several variables were retrieved by using Spotify's API (such as artist, country, genre, …)

Download and Read the Data

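# Install the opendatasets helper library (notebook shell command)
!pip install opendatasets --upgrade --quiet

import opendatasets as od

# URL of the Kaggle dataset page
download_url = ''
# Path to the CSV file once the dataset has been downloaded
data_filename = './spotify-huge-database-daily-charts-over-3-years/Final database.csv'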
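
Once the file is on disk, a first look at it might go as in the sketch below. It assumes the download step has already placed the CSV at `data_filename`; the helper `load_charts` is a hypothetical name introduced here, and returning `None` for a missing file is a design choice of this sketch rather than part of the original notebook.

```python
from pathlib import Path

import pandas as pd


def load_charts(csv_path):
    """Read the downloaded CSV into a DataFrame, or return None if it is missing."""
    path = Path(csv_path)
    if not path.exists():
        return None
    return pd.read_csv(path)


data_filename = './spotify-huge-database-daily-charts-over-3-years/Final database.csv'
charts_df = load_charts(data_filename)

if charts_df is None:
    print("File not found; run the download step first.")
else:
    # First look: dimensions, column types, and a few sample rows.
    print(charts_df.shape)
    print(charts_df.dtypes)
    print(charts_df.head())
```

Checking for the file before reading keeps the cell from failing with a traceback when the download has not been run yet.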
