Explore how machine learning can predict "kicked" cars in auto auctions. Learn from the dataset, engineer features, and fine-tune classifiers to provide real value to dealerships. This project is part of Machine Learning with Python: Zero to GBMs.
One of the biggest challenges for an auto dealership purchasing a used car at an auto auction is the risk that the vehicle might have serious issues that prevent it from being sold to customers. The auto community calls these unfortunate purchases 'kicks'.
Kicked cars often result from tampered odometers, mechanical issues the dealer is not able to address, issues with getting the vehicle title from the seller, or some other unforeseen problem. Kicked cars can be very costly to dealers after transportation costs, throw-away repair work, and market losses in reselling the vehicle.
Modelers who can figure out which cars have a higher risk of being kicked can provide real value to dealerships trying to offer the best inventory selection possible to their customers.
What if I said that exploratory data analysis and machine learning can provide much-needed insight into this vexing issue?
In this notebook, I will use several machine learning algorithms to predict which of the cars purchased at auction are bad buys.
The first step is to carry out an exploratory data analysis on the dataset, which was obtained from Kaggle. Then the features will be engineered to train and fine-tune machine learning classifiers.
This project is part of Machine Learning with Python: Zero to GBMs.
Field Name Definition
- RefID = Unique (sequential) number assigned to vehicles
- IsBadBuy = Identifies if the kicked vehicle was an avoidable purchase
- PurchDate = The Date the vehicle was Purchased at Auction
- Auction = Auction provider at which the vehicle was purchased
- VehYear = The manufacturer's year of the vehicle
- VehicleAge = The years elapsed since the manufacturer's year
- Make = Vehicle Manufacturer
- Model = Vehicle Model
- Trim = Vehicle Trim Level
- SubModel = Vehicle Submodel
- Color = Vehicle Color
- Transmission = Vehicle's transmission type (Automatic, Manual)
- WheelTypeID = The type id of the vehicle wheel
- WheelType = The vehicle wheel type description (Alloy, Covers)
- VehOdo = The vehicle's odometer reading
- Nationality = The Manufacturer's country
- Size = The size category of the vehicle (Compact, SUV, etc.)
- TopThreeAmericanName = Identifies if the manufacturer is one of the top three American manufacturers
- MMRAcquisitionAuctionAveragePrice = Acquisition price for this vehicle in average condition at time of purchase
- MMRAcquisitionAuctionCleanPrice = Acquisition price for this vehicle in above average condition at time of purchase
- MMRAcquisitionRetailAveragePrice = Acquisition price for this vehicle in the retail market in average condition at time of purchase
- MMRAcquisitonRetailCleanPrice = Acquisition price for this vehicle in the retail market in above average condition at time of purchase
- MMRCurrentAuctionAveragePrice = Acquisition price for this vehicle in average condition as of current day
- MMRCurrentAuctionCleanPrice = Acquisition price for this vehicle in above average condition as of current day
- MMRCurrentRetailAveragePrice = Acquisition price for this vehicle in the retail market in average condition as of current day
- MMRCurrentRetailCleanPrice = Acquisition price for this vehicle in the retail market in above average condition as of current day
- PRIMEUNIT = Identifies if the vehicle would have a higher demand than a standard purchase
- AcquisitionType = Identifies how the vehicle was acquired (Auction buy, trade in, etc.)
- AUCGUART = The level of guarantee provided by the auction for the vehicle (Green light - guaranteed/arbitrable, Yellow light - caution/issue, Red light - sold as is)
- KickDate = Date the vehicle was kicked back to the auction
- BYRNO = Unique number assigned to the buyer that purchased the vehicle
- VNZIP = Zipcode where the car was purchased
- VNST = State where the car was purchased
- VehBCost = Acquisition cost paid for the vehicle at time of purchase
- IsOnlineSale = Identifies if the vehicle was originally purchased online
- WarrantyCost = Warranty price (term = 36 months and mileage = 36K)
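As a quick illustration of how some of these fields relate, here is a minimal sketch that derives the vehicle age from PurchDate and VehYear and compares it with the recorded VehicleAge. It reads the definition above as "purchase year minus manufacturer's year", which is an assumption. The rows are made up for illustration, the month/day/year date format is also assumed, and the DerivedAge and AgeMatches columns are hypothetical helper names.

```python
import pandas as pd

# Made-up rows for illustration only; they are not taken from the Kaggle files.
sample = pd.DataFrame({
    "PurchDate": ["12/7/2009", "5/14/2010"],
    "VehYear": [2006, 2004],
    "VehicleAge": [3, 6],
})

# Assumption: dates are stored as month/day/year strings.
purch_year = pd.to_datetime(sample["PurchDate"], format="%m/%d/%Y").dt.year

# Assumption: age is the purchase year minus the manufacturer's year.
sample["DerivedAge"] = purch_year - sample["VehYear"]

# Flag rows where the derived value agrees with the recorded VehicleAge.
sample["AgeMatches"] = sample["DerivedAge"] == sample["VehicleAge"]
print(sample)
```

A check like this may help spot inconsistent rows during exploratory analysis, though the real data should be inspected before relying on either assumption.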
1. Download Don'tGetKicked Data
There are several ways to obtain data, such as scraping the web with the Beautiful Soup library or downloading the data directly from the source. For this project, my source of data is Kaggle, which restricts downloads to registered members. For that reason I use the opendatasets library. This library by Jovian is a convenient download tool: it asks the user for their Kaggle username and API token before it fetches the data.
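The download step described above might look like the sketch below. The competition URL is an assumption and should be checked against the Kaggle page, and download_data is a hypothetical helper name. Running it prompts for the Kaggle username and API token, and it needs network access.

```python
# Assumed competition URL; verify it on Kaggle before running.
DATASET_URL = "https://www.kaggle.com/c/DontGetKicked"


def download_data(url=DATASET_URL):
    """Download the competition files into the current working directory."""
    # Imported lazily so this cell can be defined before the package is installed.
    import opendatasets as od

    # Prompts for the Kaggle username and API key unless a kaggle.json file
    # is present in the working directory.
    od.download(url)


# Uncomment to run the download interactively:
# download_data()
```

The call is left commented out so the cell can be executed without triggering the credential prompt.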
2. Reading of Data
In this section, I will use the os module and the pandas library to read the data. The os.listdir() method returns the list of all files and directories in the specified directory, here the one downloaded from Kaggle. From pandas, we will use pd.read_csv(), which reads CSV files.
import os
import opendatasets as od
import pandas as pd
import jovian
import numpy as np

# Show up to 120 columns and rows when displaying DataFrames
pd.set_option('display.max_columns', 120)
pd.set_option('display.max_rows', 120)
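The reading step described above can be sketched as follows. The helper load_csv_files is a hypothetical name, and the directory './DontGetKicked' and file 'training.csv' in the usage comment are assumptions about what the download produces, so check the output of os.listdir() first.

```python
import os

import pandas as pd


def load_csv_files(data_dir):
    """Read every .csv file found in data_dir into a dict keyed by file name."""
    frames = {}
    for name in sorted(os.listdir(data_dir)):
        # Skip non-CSV entries such as text files or subdirectories.
        if name.lower().endswith(".csv"):
            frames[name] = pd.read_csv(os.path.join(data_dir, name))
    return frames


# Hypothetical usage; the directory and file names below are assumptions.
# print(os.listdir("./DontGetKicked"))
# frames = load_csv_files("./DontGetKicked")
# train_df = frames["training.csv"]
```

Keeping the frames in a dictionary makes it easy to see which files were actually loaded before choosing the training set.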