Project: Web Scraping
Scraping Rotten Tomatoes for Trending Netflix TV Shows
The dataset is taken from Rotten Tomatoes and contains data about the entertainment industry. Rotten Tomatoes is an online review aggregator for films and television programs; its pages include cast and crew listings, plot synopses, and critic and audience ratings.
Pick a website and describe your objective
The website I have chosen is Rotten Tomatoes. I will parse its editorial guide to the most popular Netflix TV shows and extract the details of each show listed there.
Page to parse: 'https://editorial.rottentomatoes.com/guide/best-netflix-shows-and-movies-to-binge-watch-now/'
The objective is to create a CSV file that contains the following fields for each show:
Movie Name, Summary, Year of Release, Rating, Synopsis, Lead Actors, Title, Movie Page URL
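As a rough sketch of the final step, the field list above can be used directly as the CSV header. The function name `write_shows_csv`, the output path, and the sample row are hypothetical placeholders, not part of the original project; missing fields are written as empty strings.

```python
import csv

# Column names taken from the field list above.
FIELDNAMES = [
    "Movie Name", "Summary", "Year of Release", "Rating",
    "Synopsis", "Lead Actors", "Title", "Movie Page URL",
]


def write_shows_csv(rows, path):
    """Write a list of dicts to a CSV file with a header row.

    Keys missing from a row are filled with an empty string.
    """
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES, restval="")
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    # Placeholder row for illustration only, not real scraped data.
    write_shows_csv(
        [{"Movie Name": "Example Show", "Year of Release": "2020"}],
        "netflix_shows.csv",
    )
```

Using `csv.DictWriter` keeps the column order fixed even if the scraped dictionaries are built in a different key order.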
Use the requests library to download web pages
Inspect the website's HTML source and identify the right URLs to download.
Download and save web pages locally using the requests library.
Create a function to automate downloading for different topics/search queries.
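The download steps above could be sketched as follows. This is a minimal example, not the project's actual code: the helper names `url_to_filename` and `download_page`, the `pages` output directory, the timeout, and the `User-Agent` value are all assumptions chosen for illustration. A browser-like `User-Agent` is sometimes needed because some sites reject the default one sent by `requests`.

```python
import re
from pathlib import Path

import requests

# The guide page named in the objective above.
GUIDE_URL = (
    "https://editorial.rottentomatoes.com/guide/"
    "best-netflix-shows-and-movies-to-binge-watch-now/"
)


def url_to_filename(url):
    """Turn a URL into a safe local file name ending in .html."""
    slug = re.sub(r"^https?://", "", url).strip("/")
    slug = re.sub(r"[^A-Za-z0-9]+", "-", slug).strip("-")
    return slug + ".html"


def download_page(url, out_dir="pages", timeout=30):
    """Download url, save its HTML under out_dir, and return the saved path."""
    response = requests.get(
        url,
        headers={"User-Agent": "Mozilla/5.0"},  # assumed value, adjust as needed
        timeout=timeout,
    )
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    out_path = Path(out_dir) / url_to_filename(url)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(response.text, encoding="utf-8")
    return out_path


if __name__ == "__main__":
    print(download_page(GUIDE_URL))
```

Because `download_page` takes any URL, the same function can be reused for the individual show pages linked from the guide.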
Use Beautiful Soup to parse and extract information
Parse and explore the structure of the downloaded web pages using Beautiful Soup.
Use the right properties and methods to extract the required information.
Create functions to extract from the page into lists and dictionaries.
(Optional) Use a REST API to acquire additional information if required.
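The parsing steps above might look like the sketch below. The HTML snippet and every CSS selector in it (`div.countdown-item`, `span.start-year`, `span.tMeterScore`, `div.synopsis`, `div.cast`) are assumptions made for illustration; the real class names must be confirmed by inspecting the downloaded page. The function name `parse_shows` and the sample values are likewise hypothetical.

```python
from bs4 import BeautifulSoup

# Placeholder markup; the real page structure must be checked by inspection.
SAMPLE_HTML = """
<div class="countdown-item">
  <h2><a href="https://www.rottentomatoes.com/tv/example_show">Example Show</a>
      <span class="start-year">(2020)</span>
      <span class="tMeterScore">95%</span></h2>
  <div class="synopsis">A placeholder synopsis.</div>
  <div class="cast"><a>Actor One</a>, <a>Actor Two</a></div>
</div>
"""


def text_or_none(tag):
    """Return the stripped text of a tag, or None if the tag was not found."""
    return tag.get_text(strip=True) if tag else None


def parse_shows(html):
    """Extract one dictionary per show from the page HTML."""
    soup = BeautifulSoup(html, "html.parser")
    shows = []
    for item in soup.select("div.countdown-item"):
        link = item.select_one("h2 a")
        year = text_or_none(item.select_one("span.start-year"))
        shows.append({
            "Title": text_or_none(link),
            "Movie Page URL": link.get("href") if link else None,
            "Year of Release": year.strip("()") if year else None,
            "Rating": text_or_none(item.select_one("span.tMeterScore")),
            "Synopsis": text_or_none(item.select_one("div.synopsis")),
            "Lead Actors": [a.get_text(strip=True) for a in item.select("div.cast a")],
        })
    return shows


if __name__ == "__main__":
    for show in parse_shows(SAMPLE_HTML):
        print(show)
```

Returning `None` for missing elements, rather than raising, lets the scraper continue when an entry lacks a rating or synopsis.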