
Exploratory Data Analysis of Indian Companies Registration


Exploratory Data Analysis (EDA) is the process of exploring, investigating, and gathering insights from data using statistical measures and visualizations. The objective of EDA is to develop an understanding of the data by uncovering trends, relationships, and patterns.

Companies play an important role in a nation's growth.

Exploration is the engine that drives innovation. Innovation drives economic growth. So let's all go exploring. -Edith Widder

We will explore Indian company registrations from 1857 to 2020. The dataset used in this project comes from Kaggle.

The data contains information such as Corporate Identification Number (CIN), Company Name, Company Status, Company Class, Company Category, Authorized Capital in INR, Paid-up Capital in INR, Date of Registration, Registered State, Registrar of Companies, Principal Business Activity, Registered Office Address, and Sub Category.

There are 1.9+ million rows and 17 columns.
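As a quick illustration of how row and column counts like these can be checked, here is a minimal sketch using pandas. It assumes a tiny in-memory sample rather than the real file, and the column names (`cin`, `company_status`, and so on) are hypothetical stand-ins, since the actual header names in the Kaggle file may differ.

```python
import io

import pandas as pd

# Hypothetical three-row sample; the real file's header names and values may differ.
sample_csv = io.StringIO(
    "cin,company_name,company_status,authorized_capital,registered_state\n"
    "X0001,Alpha Traders,Active,100000,Maharashtra\n"
    "X0002,Beta Mills,Strike Off,500000,Gujarat\n"
    "X0003,Gamma Tech,Active,250000,Karnataka\n"
)

df = pd.read_csv(sample_csv)

# df.shape is a (rows, columns) tuple.
n_rows, n_cols = df.shape
print(f"{n_rows} rows, {n_cols} columns")

# Inferred data type of each column.
print(df.dtypes)

# Frequency of each value in a categorical column.
status_counts = df["company_status"].value_counts()
print(status_counts)
```

On the full dataset the same calls apply unchanged; only the source passed to `pd.read_csv` would differ.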

In this project we are going to use the NumPy, pandas, Matplotlib, Seaborn, Plotly, and opendatasets Python libraries.

Here is the outline of the project:

  • Download the Data
  • Data Preparation and Cleaning
  • Exploratory Analysis and Visualization
  • Ask & Answer Questions
  • Summary
  • Conclusion
  • Future Work Ideas
  • References

# Execute this to save new versions of the notebook
import jovian
jovian.commit(project="exploratory-data-analysis-project")

[jovian] Detected Colab notebook...
[jovian] Please enter your API key ( from https://jovian.ai/ ):
API KEY: ··········
[jovian] Uploading colab notebook to Jovian...
Committed successfully! https://jovian.ai/singhalkshama4343/exploratory-data-analysis-project

First, install the libraries that we are going to use: NumPy for numerical calculation, pandas for handling DataFrames, and Matplotlib, Plotly, and Seaborn for visualization.

!pip install numpy pandas==1.1.5 wordcloud jovian opendatasets matplotlib==3.1.3 seaborn plotly folium --upgrade --quiet
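Because the install command above pins some versions (pandas 1.1.5 and matplotlib 3.1.3), it can be useful to confirm what actually ended up in the environment. The following is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+). The helper name `installed_versions` and the `PACKAGES` list are illustrative and not part of the original notebook.

```python
from importlib import metadata

# Distribution names taken from the pip install command above.
PACKAGES = ["numpy", "pandas", "matplotlib", "seaborn", "plotly", "opendatasets"]


def installed_versions(packages):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(PACKAGES).items():
    print(f"{name}: {version or 'not installed'}")
```

In Colab, a runtime restart may be needed before newly installed versions of already-imported packages take effect.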

Importing libraries.