USA Used Cars EDA Project

Exploratory Data Analysis of Used Cars Market in USA

Dataset Used: US Used Cars Dataset (3 million cars)

USA Used Car Market Overview:

The USA Used Car market has grown steadily over the past few years, supported by population growth and a rising trend of upgrading cars. Factors such as fast-growing disposable income, rising demand for premium cars, shorter periods of car ownership, and increasing owner preference are driving the growth of Used Car sales. Growth is further supported by manufacturers’ investments in expanding used car dealer networks, building their brands, and giving customers more choice.

USA Used Car Market Size and Segmentation:

The USA Used Car market grew from 2016 to 2021. COVID-19 was one of the biggest drivers, as it compelled people to own a personal vehicle so they could avoid public transportation as a protective measure. For more than a year now, automakers have been battling a semiconductor chip shortage that has sporadically halted production of new vehicles, causing record-low inventories and higher prices. These circumstances have pushed many buyers into the Used Car market.

What is Exploratory Data Analysis?

Exploratory Data Analysis (EDA) is the process of exploring, investigating and gathering insights from data using statistical measures and visualizations. The objective of EDA is to develop an understanding of data, by uncovering trends, relationships, and patterns.

EDA is both a science and an art. On the one hand, it requires knowledge of statistics, visualization techniques, and data analysis tools like NumPy, Pandas, and Seaborn. On the other hand, it requires asking interesting questions to guide the investigation and interpreting numbers and figures to generate useful insights.
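To make the idea of "statistical measures, relationships, and patterns" concrete, here is a minimal sketch using Pandas. The toy data and the column names (`make`, `year`, `mileage`, `price`) are assumptions made up for illustration; they are not taken from the Kaggle dataset.

```python
import pandas as pd

# Hypothetical toy data; column names are illustrative, not the Kaggle schema.
cars = pd.DataFrame({
    "make": ["Ford", "Toyota", "Ford", "Honda", "Toyota"],
    "year": [2015, 2018, 2012, 2019, 2016],
    "mileage": [60000, 30000, 95000, 20000, 52000],
    "price": [14000, 21000, 9000, 23000, 17500],
})

# Statistical measures: count, mean, spread, and quartiles for numeric columns.
summary = cars[["mileage", "price"]].describe()

# Relationship: Pearson correlation between mileage and price.
mileage_price_corr = cars["mileage"].corr(cars["price"])

# Pattern: median price per manufacturer, highest first.
median_price_by_make = (
    cars.groupby("make")["price"].median().sort_values(ascending=False)
)

print(summary)
print(mileage_price_corr)
print(median_price_by_make)
```

In this toy sample the correlation comes out negative (higher mileage goes with lower price), which is the kind of relationship an EDA would then check on the real data, typically with a plot as well.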

About the Project

In this project, I have selected the US Used Cars Dataset (3 million cars) from Kaggle to explore and analyze the used car market: which manufacturers hold the market, and what the sales trend of used cars is.

Tools and Libraries Used for Exploratory Data Analysis:

  • opendatasets (Jovian library to download a Kaggle dataset)
  • Data Cleaning:
      1. Pandas
      2. NumPy
  • Data Visualisation:
      1. Matplotlib
      2. Seaborn
      3. Plotly
      4. Heatmap

Here's the outline of the steps we'll follow:

  1. Downloading a dataset from an online source
  2. Data preparation and cleaning with Pandas
  3. Open-ended exploratory analysis and visualization
  4. Asking and answering interesting questions
  5. Summarizing inferences and drawing conclusions
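As a rough illustration of steps 2 to 5, the sketch below runs the same flow on a tiny in-memory table. Step 1 (downloading) is skipped so the example needs no Kaggle credentials. The column names `make_name` and `price`, the `prepare` helper, and the cleaning rules are assumptions chosen for this example, not the notebook's actual code.

```python
import numpy as np
import pandas as pd

# Step 1 (downloading) is skipped; a tiny hypothetical frame stands in for the CSV.
raw = pd.DataFrame({
    "make_name": ["Ford", "ford ", "Toyota", None, "Honda", "Toyota"],
    "price": [14000.0, 9000.0, np.nan, 12000.0, 23000.0, 21000.0],
})


def prepare(df: pd.DataFrame) -> pd.DataFrame:
    """Step 2: drop rows missing key fields and normalize manufacturer names."""
    cleaned = df.dropna(subset=["make_name", "price"]).copy()
    cleaned["make_name"] = cleaned["make_name"].str.strip().str.title()
    return cleaned


cars = prepare(raw)

# Steps 3-4: ask "which manufacturers hold the largest share of listings?"
market_share = cars["make_name"].value_counts(normalize=True)

# Step 5: summarize the finding in one line.
top_make = market_share.index[0]
print(f"{top_make} accounts for {market_share.iloc[0]:.0%} of listings in this toy sample")
```

Normalizing the names before counting matters here: without it, "Ford" and "ford " would be counted as two different manufacturers.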

How to run the code

The easiest way to start executing the code is to click the Run button at the top of this page and select Run on Binder. You can also select "Run on Colab" or "Run on Kaggle", but you'll need to create an account on Google Colab or Kaggle to use these platforms. You can make changes and save your own version of the notebook to Jovian by executing the following cells.

Since the selected dataset contains 3 million rows, I chose Google Colab to execute the code for a faster response.
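If memory or speed becomes a problem with a file of this size, one option (not necessarily what this notebook does) is to read only the columns you need and process the file in chunks. The sketch below uses a small in-memory CSV so it runs anywhere; the column names and the chunk size are assumptions for illustration.

```python
import io
from collections import Counter

import pandas as pd

# Hypothetical stand-in for the large CSV; a real run would pass a file path instead.
csv_text = (
    "make_name,price,description\n"
    "Ford,14000,a\n"
    "Toyota,21000,b\n"
    "Ford,9000,c\n"
    "Honda,23000,d\n"
)

listing_counts = Counter()

# usecols skips unneeded columns; chunksize yields DataFrames of at most N rows.
reader = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["make_name", "price"],
    chunksize=2,
)
for chunk in reader:
    # Accumulate per-manufacturer counts one chunk at a time.
    listing_counts.update(chunk["make_name"].value_counts().to_dict())

print(listing_counts.most_common())
```

With the real file, a much larger `chunksize` (for example, a few hundred thousand rows) would usually be a more sensible starting point.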

When you commit the notebook to Jovian for the first time in Colab, it will ask for an API key, which you can find in the Get Started section of your Jovian account.

!pip install jovian --upgrade --quiet
import jovian
# Execute this to save new versions of the notebook
