
Calculating the Bilingual Evaluation Understudy (BLEU) score: Ungraded Lab

In this ungraded lab, we will implement a popular metric for evaluating the quality of machine-translated text: the BLEU score, proposed by Kishore Papineni et al. in their 2002 paper "BLEU: a Method for Automatic Evaluation of Machine Translation". The BLEU score works by comparing a "candidate" text to one or more "reference" translations. The score ranges from 0 to 1, and the closer it is to 1, the better the translation. Let's see how to compute this value in the following sections.
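To make the idea of comparing a candidate against references concrete, here is a minimal sketch of clipped unigram precision, one ingredient of BLEU. It is an illustration only, not the lab's implementation: the function name `clipped_unigram_precision`, the lowercase whitespace tokenization, and the example sentences are all assumptions made for this sketch. The full BLEU score also combines several n-gram orders and applies a brevity penalty.

```python
from collections import Counter


def clipped_unigram_precision(candidate, references):
    """Share of candidate tokens that also appear in a reference, with clipping.

    Hypothetical helper for illustration; tokenization is a plain
    lowercase whitespace split (an assumption, not the lab's tokenizer).
    """
    cand_counts = Counter(candidate.lower().split())

    # For each token, keep the largest count seen in any single reference.
    max_ref_counts = Counter()
    for ref in references:
        for token, count in Counter(ref.lower().split()).items():
            max_ref_counts[token] = max(max_ref_counts[token], count)

    # Clip each candidate count at that reference maximum, so repeating
    # a common word many times is not rewarded.
    clipped = sum(min(count, max_ref_counts[token])
                  for token, count in cand_counts.items())
    total = sum(cand_counts.values())
    return clipped / total if total else 0.0


references = ["the cat is on the mat", "there is a cat on the mat"]
good = clipped_unigram_precision("the cat is on the mat", references)
bad = clipped_unigram_precision("the the the the the the", references)
print(good, bad)
```

The first candidate matches a reference exactly and scores 1.0, while the degenerate second candidate is held to one third because "the" occurs at most twice in any single reference.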

Part 1: BLEU Score

1.1 Importing the Libraries

We start by importing the Python libraries used in the first part of this lab. For learning purposes, we will implement our own version of the BLEU score using NumPy. To verify that our implementation is correct, we will compare our results with those generated by the SacreBLEU library. This package provides hassle-free computation of shareable, comparable, and reproducible BLEU scores. It also knows all the standard test sets and handles downloading, processing, and tokenization.

import numpy as np                  # NumPy for numerical computations.
import nltk                         # NLTK for simple NLP tasks such as tokenization.
nltk.download("punkt")              # download the Punkt tokenizer models.
from nltk.util import ngrams        # helper that generates n-grams from a token sequence.
from collections import Counter     # Counter class for counting hashable items.
!pip3 install 'sacrebleu'           # install the sacrebleu package.
import sacrebleu                    # sacrebleu, used to compute the BLEU score.
import matplotlib.pyplot as plt     # pyplot, used to make some illustrations.
[nltk_data] Downloading package punkt to /home/jovyan/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
Requirement already satisfied: sacrebleu in /opt/conda/lib/python3.7/site-packages (1.4.12)
Requirement already satisfied: portalocker in /opt/conda/lib/python3.7/site-packages (from sacrebleu) (1.7.1)
Requirement already satisfied: mecab-python3==0.996.5 in /opt/conda/lib/python3.7/site-packages (from sacrebleu) (0.996.5)
WARNING: You are using pip version 20.1.1; however, version 20.2.3 is available.
You should consider upgrading via the '/opt/conda/bin/python3 -m pip install --upgrade pip' command.