Scraping Rotten Tomatoes for Trending Netflix TV Shows

The dataset is taken from Rotten Tomatoes and covers the entertainment industry. Rotten Tomatoes is an online review-aggregation site for films and television programs; its pages include cast lists, plot summaries, and critic and audience reviews and ratings.

Pick a website and describe your objective

The website I have chosen is Rotten Tomatoes. I will parse it to collect details of the most popular Netflix TV shows listed on the platform.
Page to parse: 'https://editorial.rottentomatoes.com/guide/best-netflix-shows-and-movies-to-binge-watch-now/'

The objective is to create a CSV file containing the following information for each title:

Movie Name, Summary, Year of Release, Rating, Synopsis, Lead Actors, Title, Movie Page URL
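
As a rough illustration of the target output, the sketch below writes rows with these columns to a CSV file using Python's standard csv module. The function name write_shows_csv and the sample row are hypothetical placeholders, not part of the original project; fields that have not been scraped yet are left empty.

```python
import csv

# Column names taken from the field list above.
FIELDNAMES = [
    "Movie Name", "Summary", "Year of Release", "Rating",
    "Synopsis", "Lead Actors", "Title", "Movie Page URL",
]

def write_shows_csv(rows, path):
    """Write a list of dicts to `path`, one CSV row per dict.

    Keys missing from a dict are written as empty strings (restval="").
    """
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES, restval="")
        writer.writeheader()
        writer.writerows(rows)

# Placeholder row for illustration only; real values come from the scraper.
sample_rows = [{"Movie Name": "Example Show", "Rating": "90%"}]
write_shows_csv(sample_rows, "netflix_shows_sample.csv")
```

Using DictWriter keeps the column order fixed even when individual rows are built up piece by piece during scraping.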

Use the requests library to download web pages

  • Inspect the website's HTML source and identify the right URLs to download.

  • Download and save web pages locally using the requests library.

  • Create a function to automate downloading for different topics/search queries.
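
The download steps above could be sketched roughly as follows. This is a minimal example, assuming the page can be fetched with a plain GET request; the function name download_page, the User-Agent value, and the output file name are hypothetical choices for illustration.

```python
import requests

# Hypothetical header value; some sites reject requests without a User-Agent.
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; example-scraper)"}

def download_page(url, path, timeout=30):
    """Download `url`, save the HTML to `path`, and return the HTML text."""
    response = requests.get(url, headers=HEADERS, timeout=timeout)
    # Raise an exception on 4xx/5xx responses instead of saving an error page.
    response.raise_for_status()
    with open(path, "w", encoding="utf-8") as f:
        f.write(response.text)
    return response.text
```

Because the URL is a parameter, the same function can be reused for other guide pages or search queries, for example download_page('https://editorial.rottentomatoes.com/guide/best-netflix-shows-and-movies-to-binge-watch-now/', 'netflix_guide.html').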

Use Beautiful Soup to parse and extract information

  • Parse and explore the structure of downloaded web pages using Beautiful Soup.

  • Use the right properties and methods to extract the required information.

  • Create functions to extract information from the page into lists and dictionaries.

  • (Optional) Use a REST API to acquire additional information if required.
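
The parsing steps above might look something like the sketch below. The HTML snippet, its tag layout, and its class names ("show", "year", "score") are invented for illustration and are not the real Rotten Tomatoes markup; the actual selectors have to be found by inspecting the downloaded page.

```python
from bs4 import BeautifulSoup

# Made-up HTML standing in for a downloaded page (NOT the real site structure).
SAMPLE_HTML = """
<div class="show">
  <h2><a href="https://example.com/tv/example_show">Example Show</a></h2>
  <span class="year">(2020)</span>
  <span class="score">90%</span>
</div>
"""

def _text(tag):
    """Return the stripped text of a tag, or an empty string if it is missing."""
    return tag.get_text(strip=True) if tag is not None else ""

def parse_shows(html):
    """Extract one dictionary per show card from the HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    shows = []
    for card in soup.find_all("div", class_="show"):
        link = card.find("a")
        shows.append({
            "Title": _text(link),
            "Movie Page URL": link.get("href", "") if link is not None else "",
            # Drop the surrounding parentheses from "(2020)".
            "Year of Release": _text(card.find("span", class_="year")).strip("()"),
            "Rating": _text(card.find("span", class_="score")),
        })
    return shows

shows = parse_shows(SAMPLE_HTML)
```

Returning a list of dictionaries keeps each show's fields together, so the result can be passed straight to a CSV writer once the remaining fields are filled in.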

bittujha1997
Aditya kumar jha, 6 months ago