🏆120-Years Olympics Dataset: Exploratory Analysis


💡 What is Exploratory Data Analysis?

Exploratory Data Analysis (EDA) is the process of making an initial assessment of a dataset in order to uncover patterns, spot anomalies, test hypotheses, and validate assumptions using summary statistics and graphical visualisations.

🛣️ EDA Project Roadmap

Having a comprehensive roadmap helps us stay on track, since projects like this can take anywhere from a few days to a few weeks to complete.

  • Select a large real-world dataset
  • Perform data preparation and cleaning using Pandas and NumPy
  • Ask questions related to the topic and try to infer the answers from the dataset. (Note that this is an iterative process: keep updating your list whenever a new question comes to mind)
  • Perform a data analysis using Matplotlib, Seaborn, Plotly etc.
  • Ask a few more questions about your data and try to answer them in the Jupyter notebook
  • Summarise your inferences and write a conclusion
  • Document, publish and present the Jupyter notebook online
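
As a rough illustration of the preparation, cleaning, and question-asking steps in the roadmap, here is a minimal sketch using Pandas and NumPy. The rows are hypothetical toy data made up for this example (not real Olympic records), and the question asked is only one possible starting point.

```python
import numpy as np
import pandas as pd

# Hypothetical toy rows that mimic the shape of the data; in a real notebook
# these would be loaded from the downloaded dataset file instead.
raw = pd.DataFrame(
    {
        "ID": [1, 1, 2, 3],
        "Name": ["Athlete A", "Athlete A", "Athlete B", "Athlete C"],
        "Sex": ["M", "M", "F", "F"],
        "Age": [24.0, 24.0, np.nan, 31.0],
        "Medal": [np.nan, np.nan, "Gold", "Bronze"],
    }
)

# Cleaning: drop exact duplicate rows and rebuild the index.
clean = raw.drop_duplicates().reset_index(drop=True)

# Preparation: count missing values per column before deciding how to handle them.
missing_per_column = clean.isna().sum()

# A first question: how many medals did each sex win?
medals_by_sex = clean.dropna(subset=["Medal"]).groupby("Sex")["Medal"].count()

print(clean)
print(missing_per_column)
print(medals_by_sex)
```

Each later question in the roadmap can follow the same pattern: filter or group the cleaned frame, then plot or tabulate the result.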

📊 A Brief Outline Of The Dataset

This is a comprehensive dataset of the modern Olympic Games, with records from the 1896 Athens Olympics through the 2016 Rio Olympics.
The dataset contains 271,116 instances and 15 attributes.

Each instance corresponds to an individual athlete competing in an individual Olympic event (athlete-events).
The attributes include the following (to name a few):

  • ID - Unique number for each athlete
  • Name - Athlete's name
  • Sex - M or F
  • Age - Integer
  • Height - In centimeters
  • Weight - In kilograms
  • Team - Team name
  • Medal - Gold, Silver, Bronze, or NA
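
To make the row and column structure concrete, the sketch below builds a small frame with the attributes listed above and inspects it. The rows are hypothetical sample values, not records from the dataset. In a real notebook the frame would typically come from `pd.read_csv` on the downloaded file, whose name is not given here.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows using the column names listed above.
# One row per athlete per event, so the same ID can appear more than once.
events = pd.DataFrame(
    {
        "ID": [1, 2, 2, 3],
        "Name": ["Athlete A", "Athlete B", "Athlete B", "Athlete C"],
        "Sex": ["M", "F", "F", "M"],
        "Age": [24, 21, 21, 30],
        "Height": [180.0, 165.0, 165.0, np.nan],  # centimeters
        "Weight": [80.0, 55.0, 55.0, np.nan],     # kilograms
        "Team": ["Team X", "Team Y", "Team Y", "Team Z"],
        "Medal": [np.nan, "Gold", "Silver", np.nan],
    }
)

# Number of instances (rows) and attributes (columns).
n_rows, n_cols = events.shape

# Distinct athletes: fewer than rows when an athlete enters several events.
n_athletes = events["ID"].nunique()

# Medal tally; value_counts skips missing values by default.
medal_counts = events["Medal"].value_counts()

print(n_rows, n_cols, n_athletes)
print(medal_counts)
```

Run against the full dataset, the same `shape` check should report the 271,116 instances and 15 attributes described above.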

Vishnu Arun, 6 months ago