Density Estimation: Gaussian Mixture Models

Credits: Forked from PyCon 2015 Scikit-learn Tutorial by Jake VanderPlas

Here we'll explore Gaussian Mixture Models, an unsupervised technique for clustering and density estimation.

We'll start with our standard set of initial imports:

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

# use seaborn plotting defaults
import seaborn as sns; sns.set()

Introducing Gaussian Mixture Models

We previously saw an example of K-Means, a clustering algorithm that is most often fit using an expectation-maximization approach.

Here we'll consider an extension of that approach which is suitable for both clustering and density estimation.

For example, imagine we have some one-dimensional data drawn from a particular distribution:

np.random.seed(2)
x = np.concatenate([np.random.normal(0, 2, 2000),
                    np.random.normal(5, 5, 2000),
                    np.random.normal(3, 0.5, 600)])
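# density=True normalizes the histogram to unit area
# (it replaces the `normed` argument, which newer matplotlib releases no longer accept)
plt.hist(x, bins=80, density=True)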
plt.xlim(-10, 20);
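Because the sample above is drawn from three normal distributions with known parameters, the density underlying the histogram can be written down directly as a weighted sum of normal densities. The following is a minimal sketch under that assumption: each component's weight is taken to be its share of the 4600 samples, and the names `components` and `true_mixture_pdf` are illustrative, not part of the original notebook.

```python
import numpy as np
from scipy import stats

# (mean, standard deviation, number of samples), copied from the sampling code above
components = [(0, 2, 2000), (5, 5, 2000), (3, 0.5, 600)]
n_total = sum(n for _, _, n in components)

# Assumed mixing weights: each component's fraction of the total sample
weights = [n / n_total for _, _, n in components]

def true_mixture_pdf(xs):
    """Weighted sum of the three normal densities (illustrative helper)."""
    xs = np.asarray(xs, dtype=float)
    return sum(w * stats.norm(mu, sigma).pdf(xs)
               for w, (mu, sigma, _) in zip(weights, components))

# Evaluate on a wide grid and check numerically that the density integrates to ~1
grid = np.linspace(-40, 50, 9001)
density = true_mixture_pdf(grid)
area = density.sum() * (grid[1] - grid[0])  # simple Riemann sum
```

Plotting `density` against `grid` on top of the normalized histogram gives a reference curve for the shape that a fitted mixture model would be trying to recover from the data alone.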