

Assignment 4 - Speech Command Recognition with M3, M5, M11, M18 CNN networks - 30 Epoch

Assignment 4 is the final course project for the ZeroToGANS course on implementing neural networks with PyTorch.

This notebook implements speech command recognition using convolutional neural networks trained on the Google SpeechCommand dataset.

The networks are convolutional neural networks with 3, 5, 11 or 18 layers (M3, M5, M11 and M18), as described in the paper Very Deep Convolutional Neural Networks For Raw Waveforms. They are trained directly on the time-domain waveforms of the SpeechCommand dataset.
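As a rough illustration of what a five-layer network of this family might look like, here is a minimal sketch of an M5-style model that takes a raw waveform as input. The class name `M5Sketch`, the channel widths, the first-layer stride of 16, the one-second 16 kHz input and the 35-class output are assumptions chosen for illustration; they are not taken from this notebook and may differ from the configuration in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class M5Sketch(nn.Module):
    """Hypothetical M5-style network: four 1D conv blocks plus one linear layer."""

    def __init__(self, n_input=1, n_output=35, n_channel=32, stride=16):
        super().__init__()

        def block(c_in, c_out, kernel_size, stride=1):
            # Conv -> BatchNorm -> ReLU -> MaxPool, applied along the time axis
            return nn.Sequential(
                nn.Conv1d(c_in, c_out, kernel_size=kernel_size, stride=stride),
                nn.BatchNorm1d(c_out),
                nn.ReLU(),
                nn.MaxPool1d(4),
            )

        self.features = nn.Sequential(
            block(n_input, n_channel, kernel_size=80, stride=stride),  # wide first filter
            block(n_channel, n_channel, kernel_size=3),
            block(n_channel, 2 * n_channel, kernel_size=3),
            block(2 * n_channel, 2 * n_channel, kernel_size=3),
        )
        self.fc = nn.Linear(2 * n_channel, n_output)

    def forward(self, x):
        x = self.features(x)   # (batch, channels, time)
        x = x.mean(dim=-1)     # global average pooling over the time axis
        return F.log_softmax(self.fc(x), dim=-1)


model = M5Sketch()
waveforms = torch.randn(2, 1, 16000)  # assumed: two mono clips of 16000 samples
log_probs = model(waveforms)
print(log_probs.shape)  # torch.Size([2, 35])
```

The wide first convolution (80 samples) acts roughly like a learned band-pass filter bank on the raw signal, while the later 3-sample kernels keep the parameter count small. Deeper variants would stack more of the small-kernel blocks.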

The dataset is one of the common datasets available in PyTorch. There is more information on the dataset in the Speech Commands paper. It consists of more than 105,000 WAVE audio files of various speakers saying thirty-five different words, such as "yes", "no", "up", "down", "left", "right", "on", "off", "stop", "go" and the numerical digits 0-9. Much as the MNIST dataset does for images, the SpeechCommand dataset lets us understand and work with the techniques involved in audio processing and recognition.
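To train a classifier on these words, each spoken label has to be mapped to an integer class index. The sketch below shows one possible way to do that, using only the ten command words listed above (the full dataset has more labels). The helper names `label_to_index`, `index_to_label` and `load_speech_commands`, and the `root` directory, are hypothetical; the loader is defined but not called because the download is large.

```python
# Ten of the thirty-five words, taken from the description above.
# The real label set is larger; this short list is for illustration only.
EXAMPLE_LABELS = ["yes", "no", "up", "down", "left", "right", "on", "off", "stop", "go"]


def label_to_index(word, labels=EXAMPLE_LABELS):
    """Return the integer class index for a spoken word."""
    return labels.index(word)


def index_to_label(index, labels=EXAMPLE_LABELS):
    """Return the spoken word for an integer class index."""
    return labels[index]


def load_speech_commands(root="./"):
    """Download (if needed) and return the dataset via torchaudio.

    Each item is a tuple of
    (waveform, sample_rate, label, speaker_id, utterance_number).
    """
    import torchaudio  # imported lazily so the helpers above work without it

    return torchaudio.datasets.SPEECHCOMMANDS(root, download=True)


print(label_to_index("stop"))  # 8
print(index_to_label(1))       # no
```

In practice the label list would be built from the labels actually present in the dataset rather than typed by hand.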

project_name='Assignment 4 - Speech Command Recognition with M3, M5, M11, M18 CNN networks'
# Install pinned package versions when running in Google Colab.
# Keep exactly one of the two install lines below uncommented.

# CPU:
#!pip install torch==1.7.0+cpu torchvision==0.8.1+cpu torchaudio==0.7.0 -f

# GPU:
!pip install torch==1.7.0+cu101 torchvision==0.8.1+cu101 torchaudio==0.7.0 -f

import os

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import torchaudio

import matplotlib.pyplot as plt
import IPython.display as ipd
from tqdm.notebook import tqdm
Looking in links:
Requirement already satisfied: torch==1.7.0+cu101 in /usr/local/lib/python3.6/dist-packages (1.7.0+cu101)
Requirement already satisfied: torchvision==0.8.1+cu101 in /usr/local/lib/python3.6/dist-packages (0.8.1+cu101)
Collecting torchaudio==0.7.0
Downloading (7.6MB)
|████████████████████████████████| 7.6MB 11.9MB/s
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from torch==1.7.0+cu101) (1.19.5)
Requirement already satisfied: future in /usr/local/lib/python3.6/dist-packages (from torch==1.7.0+cu101) (0.16.0)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.6/dist-packages (from torch==1.7.0+cu101) (
Requirement already satisfied: dataclasses in /usr/local/lib/python3.6/dist-packages (from torch==1.7.0+cu101) (0.8)
Requirement already satisfied: pillow>=4.1.1 in /usr/local/lib/python3.6/dist-packages (from torchvision==0.8.1+cu101) (7.0.0)
Installing collected packages: torchaudio
Successfully installed torchaudio-0.7.0
/usr/local/lib/python3.6/dist-packages/torchaudio/backend/ UserWarning: "sox" backend is being deprecated. The default backend will be changed to "sox_io" backend in 0.8.0 and "sox" backend will be removed in 0.9.0. Please migrate to "sox_io" backend. Please refer to for the detail. '"sox" backend is being deprecated. '
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
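
The `device` object selected above is typically used to place both the model and each input batch on the same hardware before the forward pass. A minimal sketch of that pattern follows; the small `nn.Linear` model and the random batch are placeholders for illustration, not the networks or data used in this notebook.

```python
import torch
import torch.nn as nn

# Use the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder model and batch; both must live on the same device
model = nn.Linear(4, 2).to(device)
batch = torch.randn(3, 4).to(device)

output = model(batch)
print(output.shape)  # torch.Size([3, 2])
print(output.device.type == device.type)  # True
```

Mixing devices, for example a model on the GPU and a batch left on the CPU, raises a runtime error, so the `.to(device)` call is usually repeated for every batch inside the training loop.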
Gerhard T, 6 months ago