Learn practical skills, build real-world projects, and advance your career

Assignment 4 - Naive Machine Translation and LSH

You will now implement your first machine translation system and then you
will see how locality sensitive hashing works. Let's get started by importing
the required functions!

If you are running this notebook in your local computer, don't forget to
download the twitter samples and stopwords from nltk.

nltk.download('stopwords')
nltk.download('twitter_samples')

NOTE: The Exercise xx numbers in this assignment are inconsistent with the UNQ_Cx numbers.

import pdb
import pickle
import string

import time

import gensim
import matplotlib.pyplot as plt
import nltk
import numpy as np
import scipy
import sklearn
from gensim.models import KeyedVectors
from nltk.corpus import stopwords, twitter_samples
from nltk.tokenize import TweetTokenizer

from utils import (cosine_similarity, get_dict,
                   process_tweet)
from os import getcwd
# add folder, tmp2, from our local workspace containing pre-downloaded corpora files to nltk's data path
filePath = f"{getcwd()}/../tmp2/"
nltk.data.path.append(filePath)