
Emojify!

Welcome to the second assignment of Week 2. You are going to use word vector representations to build an Emojifier.

Have you ever wanted to make your text messages more expressive? Your emojifier app will help you do that.
So rather than writing:

"Congratulations on the promotion! Let's get coffee and talk. Love you!"

The emojifier can automatically turn this into:

"Congratulations on the promotion! 👍 Let's get coffee and talk. ☕️ Love you! ❤️"

  • You will implement a model that takes a sentence as input (such as "Let's go see the baseball game tonight!") and finds the most appropriate emoji for it (⚾️).

Using word vectors to improve emoji lookups

  • In many emoji interfaces, you need to remember that ❤️ is the "heart" symbol rather than the "love" symbol.
    • In other words, you'll have to remember to type "heart" to find the desired emoji, and typing "love" won't bring up that symbol.
  • We can make a more flexible emoji interface by using word vectors!
  • When using word vectors, you'll see that even if your training set explicitly relates only a few words to a particular emoji, your algorithm will be able to generalize and associate additional words in the test set to the same emoji.
    • This works even if those additional words never appear in the training set.
    • This allows you to build an accurate classifier mapping from sentences to emojis, even using a small training set.
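
To make the generalization idea concrete, here is a minimal sketch using made-up 3-dimensional vectors. The numbers are invented for illustration (real pretrained word vectors have many more dimensions), and the names `toy_vectors`, `emoji_keywords`, and `nearest_emoji` are hypothetical, not part of the assignment's code. It shows how "love" can be matched to ❤️ through vector similarity even though only "heart" is explicitly tied to that emoji.

```python
import numpy as np

# Toy, hand-made 3-d "word vectors" (hypothetical values, for illustration only).
toy_vectors = {
    "heart":    np.array([0.9, 0.1, 0.0]),
    "love":     np.array([0.8, 0.2, 0.1]),
    "baseball": np.array([0.0, 0.9, 0.3]),
    "coffee":   np.array([0.1, 0.0, 0.9]),
}

# Words that are explicitly tied to an emoji (note that "love" is not among them).
emoji_keywords = {"heart": "❤️", "baseball": "⚾️", "coffee": "☕️"}


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (1.0 means same direction)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def nearest_emoji(word):
    """Return the emoji whose keyword vector is most similar to the vector of `word`."""
    best_keyword = max(
        emoji_keywords,
        key=lambda k: cosine_similarity(toy_vectors[word], toy_vectors[k]),
    )
    return emoji_keywords[best_keyword]


# "love" has no emoji of its own, but its vector points almost the same way as "heart".
print(nearest_emoji("love"))
```

With these toy values, "love" is far closer to "heart" than to "baseball" or "coffee", so the lookup lands on the heart emoji without "love" ever being listed as a keyword.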

What you'll build

  1. In this exercise, you'll start with a baseline model (Emojifier-V1) using word embeddings.
  2. Then you will build a more sophisticated model (Emojifier-V2) that further incorporates an LSTM.
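
The exact design of Emojifier-V1 is covered later in the assignment. As a rough illustration of how a baseline might use word embeddings, the sketch below averages the word vectors of a sentence into a single fixed-size vector that a simple classifier could consume. The toy 2-dimensional vectors and the names `toy_word_to_vec` and `sentence_to_avg_sketch` are assumptions made for this example only.

```python
import numpy as np

# Toy 2-d word vectors (hypothetical values, for illustration only).
toy_word_to_vec = {
    "i": np.array([1.0, 0.0]),
    "love": np.array([0.0, 1.0]),
    "you": np.array([1.0, 1.0]),
}


def sentence_to_avg_sketch(sentence, word_to_vec):
    """Average the vectors of the known words in `sentence`."""
    words = sentence.lower().split()

    # Initialize the running total with the same shape as any word vector.
    vector_shape = next(iter(word_to_vec.values())).shape
    total = np.zeros(vector_shape)

    count = 0
    for word in words:
        if word in word_to_vec:  # skip words without a vector
            total += word_to_vec[word]
            count += 1

    # Keep the total and the average in separate variables;
    # fall back to the zero vector if no word was recognized.
    avg = total / count if count > 0 else total
    return avg


avg_vector = sentence_to_avg_sketch("I love you", toy_word_to_vec)
print(avg_vector)
```

Averaging discards word order, which is one reason a sequence model such as an LSTM can do better on sentences whose meaning depends on ordering.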

Updates

If you were working on the notebook before this update...
  • The current notebook is version "2a".
  • You can find your original work saved in the notebook with the previous version name ("v2").
  • To view the file directory, go to the menu "File->Open", and this will open a new tab that shows the file directory.
List of updates
  • sentence_to_avg
    • Updated instructions.
    • Use separate variables to store the total and the average (instead of just avg).
    • Additional hint about how to initialize the shape of avg vector.
  • sentences_to_indices
    • Updated preceding text and instructions, added additional hints.
  • pretrained_embedding_layer
    • Additional instructions to explain how to implement each step.
  • Emojify_V2
    • Modified instructions to specify which parameters are needed for each Keras layer.
    • Added a reminder of Keras syntax.
    • Added an explanation of how to use the layer object that is returned by pretrained_embedding_layer.
    • Added sample Keras code.
  • Spelling, grammar and wording corrections.

Let's get started! Run the following cell to load the packages you are going to use.

import numpy as np
from emo_utils import *
import emoji
import matplotlib.pyplot as plt

%matplotlib inline

1 - Baseline model: Emojifier-V1

1.1 - Dataset EMOJISET

Let's start by building a simple baseline classifier.

You have a tiny dataset (X, Y) where:

  • X contains 127 training sentences (strings).
  • Y contains an integer label between 0 and 4 corresponding to an emoji for each sentence.
**Figure 1**: EMOJISET - a classification problem with 5 classes. A few examples of sentences are given here.
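
Integer class labels like these are often converted to one-hot vectors before training a classifier with a softmax output. The sketch below shows one way to do that with NumPy for 5 classes. The label values in `Y_example` are made up for illustration, and the variable names are hypothetical rather than taken from the assignment's helper code.

```python
import numpy as np

NUM_CLASSES = 5  # labels range from 0 to 4

# Hypothetical labels for four sentences (not taken from the real dataset).
Y_example = np.array([0, 3, 4, 1])

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with the label array yields one row per example.
Y_one_hot = np.eye(NUM_CLASSES)[Y_example]

print(Y_one_hot.shape)  # (4, 5)

# argmax along each row recovers the original integer labels.
recovered = np.argmax(Y_one_hot, axis=1)
```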

Let's load the dataset using the code below. The dataset is split into a training set (127 examples) and a test set (56 examples).