Nyc Taxi Ride Time Prediction V3
Predicting the Duration of a Taxicab Ride in NYC

Taxicabs are an important means of transportation in many cities, and in New York City in particular. This notebook analyzes trip data gathered in 2016 and originally published by the NYC Taxi and Limousine Commission (TLC). The data was later used for a Kaggle competition that ran from July 2017 to September 2017. In this notebook I build several models whose scores come close to those of the highest-scoring entries. The training dataset contains the following fields:
- id: a unique identifier for each trip
- vendor_id: a code indicating the provider associated with the trip record
- pickup_datetime: date and time when the meter was started
- dropoff_datetime: date and time when the meter was stopped
- passenger_count: the number of passengers in the vehicle (driver-entered value)
- pickup_longitude: the longitude where the meter was started
- pickup_latitude: the latitude where the meter was started
- dropoff_longitude: the longitude where the meter was stopped
- dropoff_latitude: the latitude where the meter was stopped
- store_and_fwd_flag: indicates whether the trip record was held in vehicle memory before being sent to the vendor because the vehicle did not have a connection to the server. Y indicates a store-and-forward trip; N indicates a trip that was not store-and-forward.
- trip_duration: duration of the trip in seconds
The test dataset provided by Kaggle contains the same fields except dropoff_datetime and trip_duration.
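To make the relationship between the fields concrete, trip_duration should equal the gap between pickup_datetime and dropoff_datetime, measured in seconds. The sketch below checks this on two made-up rows using only the standard library. The sample values and the `duration_seconds` helper are hypothetical, and the timestamp layout `%Y-%m-%d %H:%M:%S` is an assumption about how the CSV stores datetimes.

```python
import csv
import io
from datetime import datetime

# Two made-up rows with a subset of the training columns (illustrative values only).
SAMPLE_CSV = """id,pickup_datetime,dropoff_datetime,trip_duration
id_a,2016-03-14 17:24:55,2016-03-14 17:32:30,455
id_b,2016-06-12 00:43:35,2016-06-12 00:54:38,663
"""

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp layout


def duration_seconds(pickup: str, dropoff: str) -> int:
    """Return the elapsed time between two timestamps in whole seconds."""
    start = datetime.strptime(pickup, TIME_FORMAT)
    end = datetime.strptime(dropoff, TIME_FORMAT)
    return int((end - start).total_seconds())


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Recompute each duration from the two timestamps and compare with the stored value.
computed = [duration_seconds(r["pickup_datetime"], r["dropoff_datetime"]) for r in rows]
matches = [c == int(r["trip_duration"]) for c, r in zip(computed, rows)]
```

This also shows why the test set omits dropoff_datetime: together with pickup_datetime it would reveal the target directly.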
The notebook follows this outline:
- Download the datasets from Kaggle
- Prepare and divide the dataset for training and validation
- Create and establish a baseline model
- Introduce new features
- Create linear models, decision trees, and random forests with various hyperparameters
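The split and baseline steps in the outline above can be sketched roughly as follows. This is a minimal illustration on synthetic durations, not the notebook's actual code. The 80/20 split, the constant baseline, the helper names, and the use of root mean squared logarithmic error (RMSLE) as the score are all assumptions made for the example.

```python
import math
import random


def rmsle(actual, predicted):
    """Root mean squared logarithmic error between two equal-length sequences."""
    squared = [(math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def train_val_split(values, val_fraction=0.2, seed=42):
    """Shuffle a copy of the data and split it into training and validation parts."""
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def fit_constant_baseline(train_durations):
    """Return the single constant prediction that minimises RMSLE on the training data."""
    mean_log = sum(math.log1p(d) for d in train_durations) / len(train_durations)
    return math.expm1(mean_log)


# Synthetic trip durations in seconds (hypothetical, log-normally distributed).
rng = random.Random(0)
durations = [rng.lognormvariate(6.5, 0.7) for _ in range(1000)]

train, val = train_val_split(durations)
baseline = fit_constant_baseline(train)

# Score the constant prediction on the held-out validation part.
baseline_score = rmsle(val, [baseline] * len(val))
print(f"constant prediction: {baseline:.0f} s, validation RMSLE: {baseline_score:.3f}")
```

Any model built later (linear model, decision tree, random forest) would be expected to score below this constant baseline on the same validation split; otherwise its features are adding nothing.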
Download the datasets
Ari Blinder, 6 months ago