

Organizing the Menu of McDonald's Japan for a Nutrition Analysis


In this project we will collect and organize information on all products currently available at McDonald's Japan (as of April 2021). Our objective is to create a table that lists each product's name, price, size and nutritional information.

To achieve this we will use web scraping, a process that uses automated scripts to extract content and data from a website. The product information will be scraped from the company's website: https://www.mcdonalds.co.jp/en/quality/allergy_Nutrition/nutrient/

As we can see below, the nutrition data is already provided in a table format:

[Image: nutrition table on the McDonald's Japan website]

Our objective is to collect all of this data and reorganize it in a data frame, adding secondary information such as product price and size. The secondary information is available on each product's own page. Here is an example, the Shrimp Filet-O:

[Image: Shrimp Filet-O product page]
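As a minimal sketch of how the secondary information could be read from a product page, the example below parses a small hand-written HTML fragment. The markup, the class names (`product-name`, `product-price`, `product-size`) and the values are placeholders invented for illustration; the real product pages have to be inspected to find the actual elements.

```python
from bs4 import BeautifulSoup

# Hypothetical product-page fragment. The class names and values are
# placeholders, not the markup or data of the real website.
PRODUCT_HTML = """
<div class="product">
  <h1 class="product-name">Example Product</h1>
  <span class="product-price">¥100</span>
  <span class="product-size">150 g</span>
</div>
"""


def extract_product_details(html):
    """Return the name, price and size found in a product page's HTML."""
    soup = BeautifulSoup(html, "html.parser")

    def text_of(selector):
        # select_one returns None when nothing matches the CSS selector
        node = soup.select_one(selector)
        return node.get_text(strip=True) if node else None

    return {
        "name": text_of(".product-name"),
        "price": text_of(".product-price"),
        "size": text_of(".product-size"),
    }


details = extract_product_details(PRODUCT_HTML)
```

Returning `None` for a missing element, instead of raising an error, keeps the scraper running when a product page lacks one of the fields.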

Outline of the Project:

  1. Download the necessary pages using the requests library;
  2. Parse the HTML code of these pages using BeautifulSoup;
  3. Identify the HTML elements that hold the information of interest;
  4. Create lists that contain this data and then organize them in a data frame;
  5. Save the final data as a CSV file.
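The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: `SAMPLE_HTML` is a hand-written stand-in for a downloaded page, its table layout and values are invented, and the helper names `download_page` and `table_to_dataframe` are hypothetical. The real site's markup will differ and needs its own selectors.

```python
import requests
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for a downloaded page; structure and values are
# placeholders, not data from the real website.
SAMPLE_HTML = """
<table>
  <tr><th>Product</th><th>Energy (kcal)</th><th>Protein (g)</th></tr>
  <tr><td>Example Burger</td><td>500</td><td>20.5</td></tr>
  <tr><td>Example Fries</td><td>410</td><td>5.3</td></tr>
</table>
"""


def download_page(url):
    """Step 1: download a page and return its HTML text."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # stop on HTTP errors such as 404
    return response.text


def table_to_dataframe(html):
    """Steps 2-4: parse the HTML, find the table rows, build a data frame."""
    soup = BeautifulSoup(html, "html.parser")
    rows = soup.find("table").find_all("tr")
    header = [cell.get_text(strip=True) for cell in rows[0].find_all("th")]
    records = [
        [cell.get_text(strip=True) for cell in row.find_all("td")]
        for row in rows[1:]
    ]
    return pd.DataFrame(records, columns=header)


# The sample HTML is used here so the sketch runs without network access.
menu_df = table_to_dataframe(SAMPLE_HTML)

# Step 5: export as CSV (returned as a string here; pass a file path
# to to_csv to write a file instead).
csv_text = menu_df.to_csv(index=False)
```

All cell values come back as strings, so numeric columns would still need converting (for example with `pd.to_numeric`) before any analysis.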

After finishing the web scraping process we expect to end up with a CSV file that contains a table similar to the one below:

[Image: sample of the expected output table]

How to run the code:

You can execute this notebook using the "Run" button at the top of the page.
You can also make changes and save your own version of the notebook to Jovian by executing the following cells:

!pip install jovian --upgrade --quiet
import jovian
jovian.commit(project="project1-mcnutrition_correct")
[jovian] Attempting to save notebook..
[jovian] Updating notebook "matcha-coding/project1-web-scraping-mcnutrition" on https://jovian.ai
[jovian] Uploading notebook..
[jovian] Uploading additional files...
[jovian] Committed successfully! https://jovian.ai/matcha-coding/project1-web-scraping-mcnutrition

Before starting, we install and import the libraries that will be used in this project:

!pip install beautifulsoup4 --upgrade --quiet
import requests
import pandas as pd
from bs4 import BeautifulSoup
import os
Mateus Silva Chang (matcha-coding), 6 months ago