Assignment 2: Deep N-grams
Welcome to the second assignment of course 3. In this assignment you will explore Recurrent Neural Networks (RNNs).
- You will use the fundamentals of Google's `trax` package to implement deep learning models.
By completing this assignment, you will learn how to implement models from scratch:
- How to convert a line of text into a tensor
- Create an iterator to feed data to the model
- Define a GRU model using `trax`
- Train the model using `trax`
- Compute the accuracy of your model using perplexity
- Predict using your own model
Overview
Your task will be to predict the next set of characters using the previous characters.
- Although this task sounds simple, it is pretty useful.
- You will start by converting a line of text into a tensor
- Then you will create a generator to feed data into the model
- You will train a neural network to predict a new set of characters of a defined length.
- You will use embeddings for each character and feed them as inputs to your model.
- Many natural language tasks rely on using embeddings for predictions.
- Your model will convert each character to its embedding, run the embeddings through a Gated Recurrent Unit (GRU), and run the output through a linear layer to predict the next set of characters.
The figure above gives you a summary of what you are about to implement.
- You will get the embeddings;
- Stack the embeddings on top of each other;
- Run them through two layers with a ReLU activation in the middle;
- Finally, you will compute the softmax.
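As a rough sketch of the forward pass described above (using plain NumPy and made-up dimensions purely for illustration, not the actual `trax` layers), the stacked embeddings flow through two layers with a ReLU in between, followed by a softmax over the vocabulary:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, chosen only for this illustration
seq_len, d_emb, d_hidden, vocab_size = 5, 8, 16, 32

# Pretend these are the character embeddings for one sequence,
# stacked into a single (seq_len, d_emb) matrix
embeddings = rng.normal(size=(seq_len, d_emb))

W1 = rng.normal(size=(d_emb, d_hidden))
W2 = rng.normal(size=(d_hidden, vocab_size))

hidden = np.maximum(embeddings @ W1, 0.0)  # first layer + ReLU
logits = hidden @ W2                       # second (linear) layer

# Softmax over the vocabulary, one distribution per position
probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
probs /= probs.sum(axis=-1, keepdims=True)
```

Each row of `probs` is a probability distribution over the vocabulary for one position in the sequence.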
To predict the next character:
- Use the softmax output and identify the character with the highest probability.
- The character with the highest probability is the prediction for the next character.
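The argmax step above can be sketched with a tiny made-up vocabulary and softmax output (the vocabulary and probabilities here are invented for illustration):

```python
import numpy as np

# Hypothetical softmax output over a 4-character vocabulary
vocab = ['a', 'b', 'c', 'd']
probs = np.array([0.1, 0.2, 0.6, 0.1])

# The prediction is the character with the highest probability
next_id = int(np.argmax(probs))
next_char = vocab[next_id]  # 'c'
```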
import os
import trax
import trax.fastmath.numpy as np
import pickle
import numpy
import random as rnd
from trax import fastmath
from trax import layers as tl
# set random seed
trax.supervised.trainer_lib.init_random_number_generators(32)
rnd.seed(32)
Part 1: Importing the Data
1.1 Loading in the data
Now import the dataset and do some processing.
- The dataset has one sentence per line.
- You will be doing character generation, so you have to process each sentence by converting each character (and not word) to a number.
- You will use the `ord` function to convert a unique character to a unique integer ID.
- Store each line in a list.
- Create a data generator that takes in the `batch_size` and the `max_length`.
    - The `max_length` corresponds to the maximum length of the sentence.
- The
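As a minimal sketch of the conversion and batching described above (the helper names `line_to_tensor` and `data_generator` and the padding id of 1 are assumptions for illustration, not the assignment's exact code):

```python
def line_to_tensor(line):
    # Map each character to its Unicode code point with ord()
    return [ord(c) for c in line]

def data_generator(lines, batch_size, max_length, pad_id=1):
    # Hypothetical sketch: yield batches of padded character tensors
    batch = []
    for line in lines:
        tensor = line_to_tensor(line)
        if len(tensor) <= max_length:
            # Pad every tensor up to max_length with pad_id
            batch.append(tensor + [pad_id] * (max_length - len(tensor)))
        if len(batch) == batch_size:
            yield batch
            batch = []
```

For example, `next(data_generator(["abc", "ab"], batch_size=2, max_length=4))` yields two padded tensors of length 4.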