AI for Medicine Course 1 Week 1 lecture exercises

Data Exploration

In the first assignment of this course, you will work with chest x-ray images taken from the public ChestX-ray8 dataset. In this notebook, you'll get a chance to explore this dataset and familiarize yourself with some of the techniques you'll use in the first graded assignment.

The first step before jumping into writing code for any machine learning project is to explore your data. A standard Python package for analyzing and manipulating data is pandas.

With the next two code cells, you'll import pandas and a package called numpy for numerical manipulation, then use pandas to read a csv file into a dataframe and print out the first few rows of data.

# Import necessary packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import os
import seaborn as sns
sns.set()
# Read the csv file containing the training data
train_df = pd.read_csv("nih/train-small.csv")
# Report the size of the dataframe
print(f'There are {train_df.shape[0]} rows and {train_df.shape[1]} columns in this data frame')
# Display the first 5 rows
train_df.head()
There are 1000 rows and 16 columns in this data frame

Have a look at the columns in this csv file. The "Image" column holds the file name of each chest x-ray image, and the columns filled with ones and zeros identify which diagnoses were given based on that image: a 1 marks a diagnosis that was given, a 0 one that was not.
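
As a rough illustration of how such a 0/1 layout can be summarized, here is a minimal sketch. It uses a small made-up dataframe in place of `train_df`, so the file names, the diagnosis column names, and all values below are assumptions for illustration only and are not taken from the course data.

```python
import pandas as pd

# Hypothetical stand-in for train_df: column names and values are invented
toy_df = pd.DataFrame({
    "Image": ["img_001.png", "img_002.png", "img_003.png", "img_004.png"],
    "Cardiomegaly": [0, 1, 0, 0],
    "Mass": [1, 1, 0, 0],
    "Pneumonia": [0, 0, 0, 1],
})

# Treat every column other than "Image" as a binary diagnosis label
# (an assumption that holds for this toy frame)
label_columns = [col for col in toy_df.columns if col != "Image"]

# Summing a 0/1 column counts the images that carry that diagnosis
positive_counts = toy_df[label_columns].sum()
print(positive_counts)

# Summing across a row counts how many diagnoses one image carries
labels_per_image = toy_df[label_columns].sum(axis=1)
print(labels_per_image.tolist())
```

The same two sums could be tried on `train_df` once you have checked which of its columns are actually label columns; any non-label column would need to be excluded first.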