
Analysis of Exoplanets

A. Introduction

The aim of this project is to give some insight into exoplanets, that is, planets beyond our Solar System. It answers some fundamental questions about the typical parameters available for both the planets and their host stars. Once this analysis has been performed, it can serve as the starting point for more detailed studies, for instance ones that use machine learning techniques to estimate missing data for the stellar systems that lack it.

The tools used to develop this work are Jupyter Notebook, the Python programming language, and the libraries needed to perform the data analysis (Pandas, NumPy, Matplotlib, etc.). This analysis is the final project of the online course Data Analysis with Python: Zero to Pandas, whose objective is to teach the basics of the work that a data analyst usually does professionally.

B. About the dataset

The dataset chosen can be found at the NASA Exoplanet Archive. This website contains information on hundreds of parameters for thousands of exoplanets. According to the website, the data table is an update of a previous version and is in beta release. The available functionality includes downloading the data (in different formats, choosing the rows and columns), plotting graphs of the selected data (very useful for an introductory analysis), and viewing the documentation. The documentation links to another page that defines all the possible parameters in detail.
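
As a rough sketch of how a downloaded table might be loaded with Pandas: the example below assumes the CSV export begins with `#` comment lines describing the columns, and it uses a tiny made-up sample in place of the real file. The helper `load_exoplanets`, the sample rows, and the selected column names are illustrative assumptions, not taken from the actual download.

```python
import io

import pandas as pd

# Hypothetical miniature stand-in for a CSV downloaded from the archive.
# The rows are invented; the '#' header lines are assumed to mimic the
# column descriptions that precede the data in the export.
sample_csv = """# COLUMN pl_name: Planet Name
# COLUMN hostname: Host Name
# COLUMN pl_orbper: Orbital Period [days]
pl_name,hostname,pl_orbper
Planet A b,Star A,3.5
Planet B c,Star B,
"""


def load_exoplanets(source, columns=None):
    """Read an exoplanet CSV, skipping '#' comment lines.

    `source` can be a file path or a file-like object; `columns`
    optionally restricts the result to a subset of columns.
    """
    df = pd.read_csv(source, comment="#")
    if columns is not None:
        df = df[columns]
    return df


exoplanets_df = load_exoplanets(
    io.StringIO(sample_csv), columns=["pl_name", "pl_orbper"]
)

# Quick look at the size of the table and the missing values per column.
print(exoplanets_df.shape)
print(exoplanets_df.isna().sum())
```

With the real download, the `io.StringIO(sample_csv)` argument would be replaced by the path to the CSV file, and counting missing values per column gives a first idea of how much cleaning is needed.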

Since the website offers a huge amount of data to analyze, the data must be cleaned and prepared so that it fits the objectives of the project. As a first step, let's upload our Jupyter notebook to Jovian.ml.

project_name = "ZeroToPandas-CourseProject-Exoplanets"
!pip install jovian --upgrade -q
import jovian
jovian.commit(project=project_name, files=['exoplanets_filtered.csv', 'columns_description.csv'])
[jovian] Attempting to save notebook..