
Exercise 4 - Polynomial Regression

Sometimes our data doesn't have a linear relationship, but we still want to predict an outcome.

Suppose we want to predict how satisfied people might be with a piece of fruit. We would expect satisfaction to be low if the fruit was underripe or overripe, and high somewhere in between.

This is not something linear regression will help us with, so we can turn to polynomial regression to help us make predictions for these more complex non-linear relationships!
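To make the fruit example concrete, here is a minimal sketch comparing a straight-line fit with a quadratic fit. The ripeness and satisfaction values are invented for illustration (they are not part of the exercise data), and `np.polyfit` is used here only as one convenient way to fit a polynomial.

```python
import numpy as np

# Hypothetical, noise-free data invented for this illustration:
# ripeness runs from 0 (underripe) to 1 (overripe) and
# satisfaction peaks in the middle, at ripeness 0.5.
ripeness = np.linspace(0.0, 1.0, 11)
satisfaction = 1.0 - 4.0 * (ripeness - 0.5) ** 2

# deg=1 fits a straight line, deg=2 fits a parabola
line_coeffs = np.polyfit(ripeness, satisfaction, deg=1)
curve_coeffs = np.polyfit(ripeness, satisfaction, deg=2)

# Mean squared error of each fit on the same points
line_error = np.mean((np.polyval(line_coeffs, ripeness) - satisfaction) ** 2)
curve_error = np.mean((np.polyval(curve_coeffs, ripeness) - satisfaction) ** 2)

print("straight-line mean squared error:", line_error)
print("quadratic mean squared error:", curve_error)
```

Because the made-up data rises and then falls symmetrically, the best straight line is flat and misses the peak entirely, while the quadratic follows the curve.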

Step 1

In this exercise we will look at a dataset of internet traffic over the course of the day. Observations were made every hour over several days. Suppose we want to predict the level of traffic we might see at any time during the day. How might we do this?

Let's start by opening up our data and having a look at it.

In the cell below replace the text <printDataHere> with print(dataset.head()), and run the code to see the data.
# This sets up the graphing configuration
import warnings
warnings.filterwarnings("ignore")
import matplotlib.pyplot as graph
%matplotlib inline
graph.rcParams['figure.figsize'] = (15,5)
graph.rcParams["font.family"] = "DejaVu Sans"
graph.rcParams["font.size"] = "12"
graph.rcParams['image.cmap'] = 'rainbow'
graph.rcParams['axes.facecolor'] = 'white'
graph.rcParams['figure.facecolor'] = 'white'
import numpy as np
import pandas as pd

dataset = pd.read_csv('Data/traffic_by_hour.csv')

print(dataset.head()) 
          00         01         02         03         04         05         06  \
0  43.606554  24.714152   9.302911   3.694417   9.324995   9.837653   7.960157
1  44.584835  19.604348   9.480832  13.476905  14.465224   6.014083  22.679671
2  33.208561  29.584181  27.207633  11.243233  12.229805   5.072605   6.111838
3  35.026655  20.367550  21.445285   7.449592   2.232115   8.104623   9.095805
4  40.163194  19.936328  18.066480  12.109940  10.878539   9.766027  19.504761

          07         08         09  ...         14         15         16  \
0  21.292098  27.714126  46.709211  ...  41.714860  38.130357  42.779751
1  18.192898  28.783762  40.113972  ...  51.364457  35.819379  53.243056
2  26.176792  35.246483  38.220432  ...  37.738029  42.104013  54.642667
3  19.499463  37.689567  33.907093  ...  32.354274  36.112366  53.821508
4  10.313875  28.509128  30.809746  ...  37.509431  54.416484  36.801343

          17         18         19         20         21         22         23
0  41.304179  49.499137  43.566211  43.339814  64.096617  59.582208  42.819702
1  49.910267  45.219895  52.002619  56.817581  61.359132  50.287926  40.383544
2  49.656174  34.779641  45.305791  41.818246  61.140163  61.446353  58.811576
3  35.869990  41.830910  46.922595  42.676526  60.139054  61.639772  44.670988
4  49.216991  43.927595  40.657175  44.350371  51.909886  61.674395  46.727170

[5 rows x 24 columns]

Step 2

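Next we're going to flip the data with the transpose method - our rows will become columns and our columns will become rows. Transposing is a common way to reshape data into the layout an analysis needs. Here it gives us one row per hour of the day, with each column holding one set of observations. Let's try it out.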
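Before running it on the full dataset, the effect of a transpose can be sketched on a tiny table. The DataFrame below is a hypothetical miniature built just for this illustration (two rows, three hour columns, values rounded from the data shown earlier). The name `small` is not part of the exercise.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the traffic table:
# 2 rows of observations by 3 hour columns
small = pd.DataFrame(
    [[43.6, 24.7, 9.3],
     [44.6, 19.6, 9.5]],
    columns=["00", "01", "02"],
)

# np.transpose on a DataFrame swaps rows and columns
small_T = np.transpose(small)

print(small.shape)    # shape before transposing
print(small_T.shape)  # shape after transposing
print(small_T)
```

The hour labels move from the column headers to the row index, and the shape flips from 2 x 3 to 3 x 2. The shorthand `small.T` gives the same result.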

In the cell below find the text <addCallToTranspose> and replace it with transpose
### 
# REPLACE THE <addCallToTranspose> BELOW WITH transpose
###
dataset_T = np.transpose(dataset)
###

print(dataset_T)
            0          1          2          3          4          5
00  43.606554  44.584835  33.208561  35.026655  40.163194  49.169391
01  24.714152  19.604348  29.584181  20.367550  19.936328  24.455188
02   9.302911   9.480832  27.207633  21.445285  18.066480  12.391360
03   3.694417  13.476905  11.243233   7.449592  12.109940  10.705337
04   9.324995  14.465224  12.229805   2.232115  10.878539   6.511395
05   9.837653   6.014083   5.072605   8.104623   9.766027  21.785345
06   7.960157  22.679671   6.111838   9.095805  19.504761  19.257321
07  21.292098  18.192898  26.176792  19.499463  10.313875  23.273782
08  27.714126  28.783762  35.246483  37.689567  28.509128  29.661006
09  46.709211  40.113972  38.220432  33.907093  30.809746  34.608582
10  39.111999  46.149334  30.902951  31.018349  36.326509  38.679585
11  47.428745  43.753611  50.462422  43.379814  45.893941  48.254502
12  43.459394  45.312618  41.865849  40.330625  31.512743  44.585404
13  39.046579  34.654569  43.628736  41.798041  37.239437  33.561915
14  41.714860  51.364457  37.738029  32.354274  37.509431  39.392238
15  38.130357  35.819379  42.104013  36.112366  54.416484  54.708007
16  42.779751  53.243056  54.642667  53.821508  36.801343  48.042698
17  41.304179  49.910267  49.656174  35.869990  49.216991  36.682722
18  49.499137  45.219895  34.779641  41.830910  43.927595  47.843339
19  43.566211  52.002619  45.305791  46.922595  40.657175  45.872196
20  43.339814  56.817581  41.818246  42.676526  44.350371  41.636422
21  64.096617  61.359132  61.140163  60.139054  51.909886  54.049169
22  59.582208  50.287926  61.446353  61.639772  61.674395  53.708731
23  42.819702  40.383544  58.811576  44.670988  46.727170  55.473724