
Used Car Price Prediction Machine Learning Project

Used Car Price Prediction



This project uses supervised machine learning to predict the price of a used car from inputs such as the manufacturer and the number of cylinders. That is the core idea of supervised machine learning: we take a set of data, clean it into a form a computer can read, and give it to a machine learning algorithm, which produces a model that can make predictions for data it has not seen.

Supervised machine learning problems come in two main forms: regression and classification.

Regression produces one or more numeric outputs from a set of inputs such as date, weather, or price, depending on the dataset.

Classification, on the other hand, assigns each example to a category. It is often a yes-or-no question, such as "does this person have medical condition X?", answered from inputs such as age, weight, and height, again depending on the dataset.

This project is a regression problem because it requires a single number as the output: the predicted price of a used car given certain inputs.
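To make the distinction concrete, here is a minimal sketch in plain Python. The mileage and price figures are invented for illustration and do not come from the project's dataset, and `fit_line`, `predict_price`, and `is_expensive` are hypothetical helper names. The regression function returns a continuous number, while the classification function returns a yes-or-no label.

```python
from statistics import mean

def fit_line(xs, ys):
    """One-variable ordinary least squares: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar

# Hypothetical toy data, not taken from the used car dataset
mileage = [10, 20, 30, 40]          # thousands of miles
price = [18.0, 16.0, 14.0, 12.0]    # thousands of dollars

slope, intercept = fit_line(mileage, price)

def predict_price(miles):
    # Regression: the output is a number on a continuous scale
    return slope * miles + intercept

def is_expensive(miles, threshold=15.0):
    # Classification: the output is a discrete label (True or False)
    return predict_price(miles) > threshold

print(predict_price(25))
print(is_expensive(10), is_expensive(40))
```

Thresholding a predicted price is only a convenient way to show the two output types side by side; a real classifier would normally be trained directly on labeled categories.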

Project Outline

  1. Install and Import Required Libraries
  2. Download the Dataset
  3. Column Description
  4. Cleaning Data
  5. Exploratory Data Analysis
  6. Feature Engineering
  7. Training/Test/Validation
  8. Imputation, Scaling and Encoding
  9. Dumb/Benchmark Model - Mean Value
  10. Machine Learning Models
  11. Comparing Results
  12. Conclusion and Further Improvements
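
Steps 7 and 9 of the outline can be sketched without any libraries. This is only an illustration under stated assumptions: the price list is made up, the 60/20/20 split is an arbitrary choice, and `split_three_ways` and `rmse` are hypothetical helpers that may differ from what the notebook does later with `train_test_split`.

```python
import math
import random
from statistics import mean

def split_three_ways(rows, val_frac=0.2, test_frac=0.2, seed=42):
    """Shuffle a copy of rows and split it into train, validation, and test lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = int(len(rows) * test_frac)
    n_val = int(len(rows) * val_frac)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(mean((a - p) ** 2 for a, p in zip(actual, predicted)))

# Hypothetical prices in dollars, not taken from the used car dataset
prices = [4500, 7200, 12500, 15900, 21000, 8300, 30500, 9900, 17400, 5600]

train, val, test = split_three_ways(prices)

# The "dumb" benchmark: always predict the mean price seen in training
baseline = mean(train)
val_rmse = rmse(val, [baseline] * len(val))
print(f"baseline prediction: {baseline:.2f}, validation RMSE: {val_rmse:.2f}")
```

Any model trained later is only worth keeping if its validation error is lower than this baseline's.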

Libraries Used

1. Install and Import Required Libraries

Section Outline

  • Install required libraries
  • Import libraries
  • Set options for matplotlib and pandas

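# Install the libraries used in the project
# (scikit-learn and xgboost are imported below, so they are listed here as well)
!pip install numpy pandas pandas-profiling matplotlib seaborn scikit-learn xgboost opendatasets --quiet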
# Import all of the libraries to be used in the project
import os
import matplotlib
import opendatasets as od
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import LabelEncoder
from sklearn.linear_model import LinearRegression, Ridge, Lasso, ElasticNet
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeRegressor, plot_tree
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.metrics import mean_absolute_error
from sklearn.metrics import mean_absolute_percentage_error
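# Note: this import rebinds plot_tree to xgboost's version, shadowing the one imported from sklearn.tree above
from xgboost import XGBRegressor, plot_tree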
%matplotlib inline

# Set options for matplotlib and pandas
matplotlib.rcParams['font.size'] = 14
matplotlib.rcParams['figure.figsize'] = (10, 6)
matplotlib.rcParams['figure.facecolor'] = '#00000000'
pd.options.display.max_columns = 100
pd.options.display.max_rows = 50
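
If one of the imports above fails, a quick check of which modules are available can narrow down what still needs installing. The sketch below is an assumption-laden helper, not part of the original notebook: `missing_modules` is a hypothetical name, and the module list simply mirrors the imports used in this section.

```python
from importlib.util import find_spec

def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]

# Import names (not pip package names): scikit-learn is imported as "sklearn"
required = ["numpy", "pandas", "matplotlib", "seaborn",
            "sklearn", "xgboost", "opendatasets"]

missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```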
Srinath Nanduri, 6 months ago