Credits:

I started with these three notebooks and combined various parts of them into the current notebook:

  1. https://www.kaggle.com/ronaldokun/multilabel-stratification-cv-and-ensemble
  2. https://www.kaggle.com/aakashns/advanced-transfer-learning-starter-notebook
  3. https://www.kaggle.com/nachiket273/protein-classification-one-cycle

How to run this notebook:

  1. You can download and run this notebook.
  2. Download the data into a folder named "data" and place this notebook in a sibling folder (for example, "code"), so that the relative path ../data used below resolves.
  3. This notebook takes 12.5 hours to run with resnet18 on a single GPU.

Model architecture and other parameters:

  1. I used a single resnet18 model. Although I tried densenet121, resnet50 and resnext50, I did not have time to do stacking because I spent the time on the 10-fold cross-validation.
  2. In the final layer I tried two experiments:
    2(a). First, I replaced the final layer with a simple linear layer.
    2(b). Second, I used a more complex classifier, and also replaced the adaptive pooling. (This approach is taken from protein-classification-one-cycle.)
  3. For both experiments, I used a 10-fold cross-validation approach. This approach is taken from
    multilabel-stratification-cv-and-ensemble.
  4. I froze the model and trained it for 20 epochs.
  5. Then I unfroze the model and trained for another 20 epochs. This approach is similar to that described in advanced-transfer-learning-starter-notebook.
  6. More details can be found here: https://www.kaggle.com/c/jovian-pytorch-z2g/discussion/163666
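
Steps 4 and 5 above (train with the backbone frozen, then unfreeze and continue) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the notebook's actual model: `TinyNet`, `freeze`, `unfreeze` and `count_trainable` are hypothetical names, and a small convolutional stack stands in for the pretrained resnet18 so that nothing has to be downloaded.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in: a small 'backbone' plus a new classifier head."""

    def __init__(self, num_classes=10):
        super().__init__()
        # Assumption: in the real notebook this would be a pretrained resnet18
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.head = nn.Linear(8, num_classes)

    def forward(self, x):
        # Sigmoid yields independent per-class probabilities (multilabel)
        return torch.sigmoid(self.head(self.backbone(x)))

    def freeze(self):
        # Stop gradient updates for the backbone; only the head trains
        for p in self.backbone.parameters():
            p.requires_grad = False

    def unfreeze(self):
        # Allow every layer to be fine-tuned
        for p in self.backbone.parameters():
            p.requires_grad = True


def count_trainable(model):
    """Number of parameters that will receive gradient updates."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


model = TinyNet()

model.freeze()
frozen_count = count_trainable(model)   # head only
# ... phase 1: train for freeze_epochs ...

model.unfreeze()
full_count = count_trainable(model)     # backbone + head
# ... phase 2: train for unfreeze_epochs ...

out = model(torch.zeros(2, 3, 16, 16))
```

While frozen, only the head's parameters are trainable, so the first phase fits the new classifier without disturbing the backbone weights. The second phase then fine-tunes everything.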

Notes by the author of the original notebook:

"In addition to having multiple labels in each image, the other challenge in this competition is the existence of rare classes and combinations of different classes.

One technique to deal with this is to guarantee a balanced split between the training and validation sets. The usual random train_test_split is not ideal in this case because you can end up putting rare cases in the validation set, and your model will never learn about them. The stratification available in scikit-learn is also not equipped to deal with multilabel targets. The library scikit-multilearn does exactly that.

Update 1: in the previous example I only showed how to create the split dataframe. This is not much help if you are not used to creating datasets in PyTorch. In this version I show how to use this in conjunction with the Advanced Transfer Learning Notebook"
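
The failure mode described in the quote, where a rare class ends up only in the validation set, can be detected by counting classes on each side of a split. The sketch below is a plain-Python check on toy data; `class_counts`, `missing_from_train` and the label strings are hypothetical, and the code only detects the problem. It does not perform the iterative stratification that scikit-multilearn provides.

```python
from collections import Counter


def class_counts(label_strings):
    """Count how many samples carry each class index.

    Each entry is a space-separated string of class indices, e.g. "0 9".
    """
    counts = Counter()
    for s in label_strings:
        counts.update(int(tok) for tok in s.split())
    return counts


def missing_from_train(train_labels, val_labels):
    """Return classes that occur in validation but never in training."""
    train = class_counts(train_labels)
    val = class_counts(val_labels)
    return sorted(c for c in val if train[c] == 0)


# Toy data (assumption): class 9 is rare and appears in a single sample
labels = ["0 1", "1", "0", "1 2", "2", "0 9"]

# The only sample with class 9 lands in validation
bad_split = missing_from_train(labels[:5], labels[5:])

# The rare sample stays in training; a sample with class 2 is held out instead
good_split = missing_from_train(labels[:4] + labels[5:], labels[4:5])

print(bad_split, good_split)
```

A non-empty result means the model is evaluated on a class it never saw during training, which is the situation a stratified split is meant to avoid.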

import os
import gc
import time
import copy
from pathlib import Path
import multiprocessing as mp
import random
import warnings
warnings.filterwarnings("ignore")

import cv2
import pandas as pd
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
from tqdm.notebook import tqdm



import torch
import torchvision.models as models
from torch.utils.data import Dataset, random_split, DataLoader
import torchvision.transforms as T
from sklearn.metrics import f1_score
import torch.nn.functional as F
import torch.nn as nn
from torchvision.utils import make_grid
from skmultilearn.model_selection import IterativeStratification
%matplotlib inline

      
ROOT = Path('../')
DIR = ROOT /'data/'
TRAIN = DIR / 'train'
TEST = DIR / 'test'
arch = 'resnet18'
freeze_epochs = 20
unfreeze_epochs = 20
epochs = freeze_epochs + unfreeze_epochs
size = 512
if size == 256: batch_size = 128
if size == 512: batch_size = 64
    
nfolds = 10
threshold = 0.3
SEED = 2020
max_lr = 0.001
grad_clip = 0.1
weight_decay = 1e-4
opt_func = torch.optim.Adam
advanced_fc = True

Helper Functions

def show_sample(img, target, invert=True):
    if invert:
        plt.imshow(1 - img.permute((1, 2, 0)))
    else:
        plt.imshow(img.permute(1, 2, 0))
    print('Labels:', decode_target(target, text_labels=True))
    
def show_batch(dl, invert=True):
    for images, labels in dl:
        fig, ax = plt.subplots(figsize=(16, 8))
        ax.set_xticks([]); ax.set_yticks([])
        data = 1-images if invert else images
        ax.imshow(make_grid(data, nrow=16).permute(1, 2, 0))
        break

def F_score(output, label, threshold=threshold, beta=1):
    """F-beta score computed from the mean per-sample precision and recall."""
    prob = output > threshold
    label = label > threshold

    # Per-sample counts over the class dimension
    TP = (prob & label).sum(1).float()
    FP = (prob & (~label)).sum(1).float()
    FN = ((~prob) & label).sum(1).float()

    # The small epsilon avoids division by zero
    precision = torch.mean(TP / (TP + FP + 1e-12))
    recall = torch.mean(TP / (TP + FN + 1e-12))
    F_beta = (1 + beta**2) * precision * recall / (beta**2 * precision + recall + 1e-12)
    return F_beta.mean(0)

def get_default_device():
    """Pick GPU if available, else CPU"""
    if torch.cuda.is_available():
        return torch.device('cuda')
    else:
        return torch.device('cpu')

def to_device(data, device):
    """Move tensor(s) to chosen device"""
    if isinstance(data, (list,tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)

class DeviceDataLoader():
    """Wrap a dataloader to move data to a device"""
    def __init__(self, dl, device):
        self.dl = dl
        self.device = device
        
    def __iter__(self):
        """Yield a batch of data after moving it to device"""
        for b in self.dl: 
            yield to_device(b, self.device)

    def __len__(self):
        """Number of batches"""
        return len(self.dl)
    
class MultilabelImageClassificationBase(nn.Module):

    def training_step(self, batch):
        images, targets = batch
        out = self(images)                           # Generate predictions
        loss = F.binary_cross_entropy(out, targets)  # Calculate loss
        return loss
    
    def validation_step(self, batch):
        images, targets = batch 
        out = self(images)                           # Generate predictions
        loss = F.binary_cross_entropy(out, targets)  # Calculate loss
        score = F_score(out, targets)
        return {'val_loss': loss.detach(), 'val_score': score.detach() }
        
    def validation_epoch_end(self, outputs):
        batch_losses = [x['val_loss'] for x in outputs]
        epoch_loss = torch.stack(batch_losses).mean()   # Combine losses
        batch_scores = [x['val_score'] for x in outputs]
        epoch_score = torch.stack(batch_scores).mean()      # Combine scores
        return {'val_loss': epoch_loss.item(), 'val_score': epoch_score.item()}
    
    def epoch_end(self, epoch, result):
        print("Epoch [{}], last_lr: {:.4f}, train_loss: {:.4f}, val_loss: {:.4f}, val_score: {:.4f}".format(
            epoch, result['lrs'][-1], result['train_loss'], result['val_loss'], result['val_score']))

def seed_everything(seed):
    random.seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)
    torch.backends.cudnn.deterministic = True
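    # The cuDNN autotuner may pick different algorithms between runs, so disable it for reproducibility
    torch.backends.cudnn.benchmark = False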

def encode_label(label):
    target = torch.zeros(10)
    for l in str(label).split(' '):
        target[int(l)] = 1.
    return target

def decode_target(target, text_labels=False, threshold=threshold):
    result = []
    for i, x in enumerate(target):
        if (x >= threshold):
            if text_labels:
                result.append(labels[i] + "(" + str(i) + ")")
            else:
                result.append(str(i))
    return ' '.join(result)

seed_everything(SEED)
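
To make the arithmetic in the `F_score` helper above easier to check by hand, here is a plain-Python mirror of the same steps on a two-sample toy batch. The function name `f_score_lists` and the numbers are hypothetical. Like `F_score`, the sketch averages precision and recall over the samples first and only then combines them, which is not the same as averaging per-sample F-scores.

```python
def f_score_lists(probs, targets, threshold=0.3, beta=1):
    """Plain-Python mirror of the F-beta arithmetic, for checking by hand."""
    precisions, recalls = [], []
    for p_row, t_row in zip(probs, targets):
        pred = [p > threshold for p in p_row]
        true = [t > threshold for t in t_row]
        tp = sum(a and b for a, b in zip(pred, true))
        fp = sum(a and not b for a, b in zip(pred, true))
        fn = sum(b and not a for a, b in zip(pred, true))
        # The small epsilon avoids division by zero
        precisions.append(tp / (tp + fp + 1e-12))
        recalls.append(tp / (tp + fn + 1e-12))
    precision = sum(precisions) / len(precisions)
    recall = sum(recalls) / len(recalls)
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall + 1e-12)


# Two samples, three classes (toy values)
probs = [[0.9, 0.4, 0.1],    # predicts classes 0 and 1
         [0.2, 0.8, 0.6]]    # predicts classes 1 and 2
targets = [[1.0, 0.0, 0.0],  # true class 0
           [0.0, 1.0, 1.0]]  # true classes 1 and 2

# Sample 1: precision 1/2, recall 1. Sample 2: precision 1, recall 1.
# Mean precision 0.75, mean recall 1.0, so F1 = 2 * 0.75 / 1.75 = 6/7.
score = f_score_lists(probs, targets)
print(round(score, 4))
```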