Validation and Model Selection

Credits: Forked from PyCon 2015 Scikit-learn Tutorial by Jake VanderPlas

In this section, we'll look at model evaluation and the tuning of hyperparameters: parameters that are chosen before fitting and that define the model itself, as opposed to parameters that are learned from the data.
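To make the distinction concrete, here is a minimal sketch using a ridge regression on made-up toy data. The choice of `Ridge`, the value `alpha=1.0`, and the arrays `X_toy` and `y_toy` are assumptions for illustration only and are not part of the original tutorial.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical toy data: y is roughly 2 * x
X_toy = np.array([[0.0], [1.0], [2.0], [3.0]])
y_toy = np.array([0.1, 1.9, 4.2, 5.8])

# alpha is a hyperparameter: we pick it ourselves before fitting
model = Ridge(alpha=1.0)
print("hyperparameter alpha:", model.get_params()["alpha"])

# coef_ and intercept_ are ordinary parameters: fit() learns them from the data
model.fit(X_toy, y_toy)
print("learned coef_:", model.coef_)
print("learned intercept_:", model.intercept_)
```

Changing `alpha` changes which model is being fit; the learned `coef_` and `intercept_` then follow from the data. Choosing good hyperparameter values is what tuning refers to here.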

from __future__ import print_function, division

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

# Use seaborn for plotting defaults
import seaborn as sns
sns.set()

Validating Models

One of the most important pieces of machine learning is model validation: that is, checking how well your model fits a given dataset. But there are some pitfalls you need to watch out for.

Consider the digits example we've been looking at. How might we check how well our model fits the data?

from sklearn.datasets import load_digits
digits = load_digits()
X = digits.data
y = digits.target
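One possible way to approach that question, sketched under some assumptions: hold out part of the data, fit on the rest, and compare the score on the data the model has seen with the score on the data it has not. The 1-nearest-neighbor classifier, the 25% test split, and `random_state=0` below are illustrative choices, and `sklearn.model_selection` assumes scikit-learn 0.18 or later.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

digits = load_digits()
X, y = digits.data, digits.target

# Hold out a quarter of the samples; the model never sees them during fit
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Illustrative model choice: each point is labeled by its single nearest neighbor
knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X_train, y_train)

# Accuracy on the data used for fitting versus accuracy on held-out data
train_score = knn.score(X_train, y_train)
test_score = knn.score(X_test, y_test)
print("training accuracy:", train_score)
print("held-out accuracy:", test_score)
```

A nearest-neighbor model effectively memorizes its training set, so the training accuracy says little about how the model behaves on new data. The held-out accuracy is the more honest estimate.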