Scraping Yify Movies Lists
Scraping Movie Details from YIFY Movies
Web Scraping
- Web scraping is a way to extract information (or simply data) from web pages using various tools and techniques.
Why Web Scraping?
- Many websites contain data that can prove invaluable for day-to-day needs, academic research, industry use, business, etc.
- Examples include stock rates, product details, sports stats, weather forecasts, movie ratings and more.
YIFY Movies:
- YIFY Movies is a website that offers free-to-download movie torrent links and has an enormous database of movies and documentaries.
- We would like to extract movie details (title, year, genre, rating, movie_link, synopsis and number of times downloaded) for our project.
Tools
- Python 3.7 and above along with Jupyter Notebooks.
- Pandas library to create a DataFrame and save the output to a .csv file.
- Beautiful Soup for parsing the HTML page.
Outline:
Here's an outline of the steps we'll follow:
- Download the webpage using the requests library
- Parse the HTML source code using Beautiful Soup
- Search for the tags containing the movie title, year, genre, rating, movie URL, synopsis and number of times downloaded.
- Scrape multiple pages (in our case 20 pages) and compile the information into Python lists and dictionaries.
- Save the extracted information to a CSV file.
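The first four steps can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the HTML snippet, the tag and class names (`movie-card`, `movie-title`, and so on), the `/browse-movies?page=N` URL pattern and the helper names `page_urls`, `fetch_html` and `parse_movies` are all assumptions, so inspect the real page source and adjust them. The synopsis and download count are left out because they may live on each movie's own page.

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

BASE_URL = "https://yts.rs"  # host taken from the sample URL in this notebook

# Hypothetical markup standing in for one movie entry on a listing page.
SAMPLE_HTML = """
<div class="movie-card">
  <a class="movie-title" href="/movie/whale-hunting-1984">Whale Hunting</a>
  <span class="movie-year">1984</span>
  <span class="movie-genre">Drama</span>
  <span class="movie-rating">6.5 / 10</span>
</div>
"""


def page_urls(num_pages=20):
    """Build listing-page URLs; the '?page=N' pattern is an assumption."""
    return [f"{BASE_URL}/browse-movies?page={n}" for n in range(1, num_pages + 1)]


def fetch_html(url):
    """Download one page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.text


def parse_movies(html):
    """Turn every (hypothetical) movie card in the HTML into a dictionary."""
    soup = BeautifulSoup(html, "html.parser")
    movies = []
    for card in soup.find_all("div", class_="movie-card"):
        link = card.find("a", class_="movie-title")
        movies.append({
            "Movie": link.get_text(strip=True),
            "Year": card.find("span", class_="movie-year").get_text(strip=True),
            "Genre": card.find("span", class_="movie-genre").get_text(strip=True),
            "Ratings": card.find("span", class_="movie-rating").get_text(strip=True),
            "Url": urljoin(BASE_URL, link["href"]),  # make the link absolute
        })
    return movies


# Parse the offline sample; no network request is made here.
movies = parse_movies(SAMPLE_HTML)
print(movies)
```

To scrape the live site, the same parser would be applied to each downloaded page, for example `for url in page_urls(): movies.extend(parse_movies(fetch_html(url)))`, ideally with a short pause between requests.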
By the time we finish our project, we will have a CSV file in the following format:
Movie,Year,Genre,Ratings,Url,Synopsis,Downloaded
Whale Hunting,1984,Drama,6.5 / 10,https://yts.rs/movie/whale-hunting-1984," A disillusioned student meets a eccentric beggar and a mute prostitute he falls in love with. Together, without money, they cross South Korea to help the girl go home. "," Downloaded 101 times Sep 27, 2021 at 09:08 PM
........
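A file in this format could be produced with pandas along the following lines. This is a sketch that assumes the scraped records are collected as a list of dictionaries; the single row reuses values from the sample above (with a shortened synopsis), and the variable names are illustrative only.

```python
import io

import pandas as pd

COLUMNS = ["Movie", "Year", "Genre", "Ratings", "Url", "Synopsis", "Downloaded"]

# One record built from the sample row shown above (synopsis shortened).
rows = [{
    "Movie": "Whale Hunting",
    "Year": 1984,
    "Genre": "Drama",
    "Ratings": "6.5 / 10",
    "Url": "https://yts.rs/movie/whale-hunting-1984",
    "Synopsis": "Together, without money, they cross South Korea to help the girl go home.",
    "Downloaded": "Downloaded 101 times",
}]

# Fix the column order explicitly so the header matches the target format.
df = pd.DataFrame(rows, columns=COLUMNS)

# Write to an in-memory buffer so the result can be inspected directly.
buffer = io.StringIO()
df.to_csv(buffer, index=False)  # index=False drops the row-number column
csv_text = buffer.getvalue()
print(csv_text)
```

Fields that contain commas, such as the synopsis, are wrapped in double quotes automatically. To write to disk instead, pass a file name, for example `df.to_csv("movies.csv", index=False)` (the file name here is a placeholder).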
How to Run the Code
You can execute the code using the "Run" button at the top of this page and selecting "Run on Binder". You can make changes and save your own version of the notebooks to Jovian by executing the following code cells:
!pip install jovian --upgrade --quiet
import jovian
Rahul Pandey, 6 months ago