Visualizing the Gender Gap in College Degrees

The Department of Education Statistics releases a dataset annually containing the percentage of bachelor's degrees granted to women from 1970 to 2012. The dataset is broken up into 17 categories of degrees, with each column representing a separate category.

Randal Olson, a data scientist at the University of Pennsylvania, has cleaned the dataset and made it available on his personal website. You can download the dataset Randal compiled here.

Randal compiled this dataset to explore the gender gap in STEM (science, technology, engineering, and mathematics) fields. This gap is often reported in the news, though not everyone agrees that it exists.

In this project, my main aim is to visualize gender gaps in college degrees. A secondary goal is to explore techniques for increasing the data-ink ratio in visualizations. Special mention to DataQuest, as most of the content here comes from a lesson I took on the platform.

Let us now import the necessary libraries and take a first look at the data.

%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt

# Load the cleaned dataset compiled by Randal Olson
women_degrees = pd.read_csv('percent-bachelors-degrees-women-usa.csv')

# Preview the first and last five rows
women_degrees.head()
women_degrees.tail()

The table above shows the percentage of bachelor's degrees awarded to women in each degree category. Subtracting any of these values from 100 gives the corresponding percentage awarded to men.
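As a minimal sketch of that subtraction, the snippet below builds a tiny stand-in DataFrame and derives the men's percentages from it. The column names (`Year`, `Biology`, `Engineering`) and all of the numbers are assumptions made up for illustration, not values from the real dataset. The same expression should work on `women_degrees`, provided every column other than the year column holds a percentage.

```python
import pandas as pd

# Hypothetical stand-in for women_degrees; the values are invented for illustration.
sample = pd.DataFrame({
    "Year": [1970, 1971],
    "Biology": [30.0, 40.0],
    "Engineering": [1.0, 2.5],
})

# Assumption: every column except Year holds the percentage of degrees awarded to women.
degree_cols = [col for col in sample.columns if col != "Year"]

# The men's share is the complement of the women's share.
men_sample = sample.copy()
men_sample[degree_cols] = 100 - sample[degree_cols]

print(men_sample)
```

Copying the frame first leaves the original percentages intact, which is convenient when both lines need to be plotted on the same chart later on.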