Web Scraping Final
Scraping Top Places for Domestic Tourism in India using Python
Holidify lists tourist destinations along with the best time to visit, the overall package cost, the different ways to reach each place, and the visa policies for foreign tourists.
Web scraping is the process of extracting and parsing data from websites in an automated fashion using a computer program. It's a useful technique for creating datasets for research and learning.
The page https://www.holidify.com/country/india/places-to-visit.html lists the top tourist places in India. In this project, we'll retrieve information from this page using web scraping. We'll use Python libraries, including beautifulsoup4, to scrape data from this page.
Here is an outline of the steps we'll follow:
- Download the webpage.
- Parse the HTML source code using Beautiful Soup.
- Extract information about the places from the page.
- Compile extracted information into Python lists and dictionaries.
- Extract data from multiple pages.
- Save the extracted information to a CSV file.
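The parsing and extraction steps in the outline above can be sketched roughly as follows. This is only an illustration under stated assumptions: the sample markup, the class names `place-card`, `place-name` and `place-description`, and the helper names `download_page` and `parse_places` are hypothetical, and the real Holidify page may be structured differently. The download helper uses the standard library's `urllib.request` as a stand-in and is not called here, so the sketch runs without network access.

```python
from urllib.request import Request, urlopen

from bs4 import BeautifulSoup

# Hypothetical markup: the real Holidify page may use different tags and classes.
SAMPLE_HTML = """
<div class="place-card">
  <h3 class="place-name"><a href="/places/manali/">Manali</a></h3>
  <p class="place-description">Manali xyz</p>
</div>
<div class="place-card">
  <h3 class="place-name"><a href="/places/ladakh/">Leh Ladakh</a></h3>
  <p class="place-description">Leh Ladakh new xyz</p>
</div>
"""

BASE_URL = "https://www.holidify.com"


def download_page(url):
    """Fetch a page and return its HTML as text (standard-library stand-in)."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request) as response:
        return response.read().decode("utf-8")


def parse_places(html):
    """Parse HTML and return one dictionary per place card found."""
    soup = BeautifulSoup(html, "html.parser")
    places = []
    for card in soup.find_all("div", class_="place-card"):
        link = card.find("a")
        description = card.find("p", class_="place-description")
        places.append({
            "place": link.get_text(strip=True),
            "description": description.get_text(strip=True),
            "url": BASE_URL + link["href"],
        })
    return places


# Run the parser on the inline sample instead of the live page.
places = parse_places(SAMPLE_HTML)
print(places[0])
```

To try it against the live page, `parse_places(download_page(...))` would be the natural combination, but the selectors would first need to be adjusted to match the page's actual HTML.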
By the end of the project, we'll create a CSV file in the following format:
place, rating, time, description, url
Manali, Manali xyz, https://www.holidify.com/places/manali/
Leh Ladakh, Leh Ladakh new xyz, https://www.holidify.com/places/ladakh/
...
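As a rough sketch of how a list of dictionaries could be written out in this format, the standard library's `csv.DictWriter` is one option. The helper name `write_places_csv` is hypothetical, and treating the middle value of each sample row as the description is an assumption; the rating and time columns are left empty because the sample rows do not show them.

```python
import csv
import io

# Column order taken from the header row of the target format.
FIELDNAMES = ["place", "rating", "time", "description", "url"]


def write_places_csv(places, file_obj):
    """Write place dictionaries as CSV rows; missing keys become empty cells."""
    writer = csv.DictWriter(file_obj, fieldnames=FIELDNAMES, restval="")
    writer.writeheader()
    writer.writerows(places)


# Sample rows only; rating and time are unknown here and left blank.
places = [
    {
        "place": "Manali",
        "description": "Manali xyz",
        "url": "https://www.holidify.com/places/manali/",
    },
    {
        "place": "Leh Ladakh",
        "description": "Leh Ladakh new xyz",
        "url": "https://www.holidify.com/places/ladakh/",
    },
]

# Write to an in-memory buffer so the sketch has no side effects on disk.
buffer = io.StringIO()
write_places_csv(places, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, the same function can be passed a file opened with `open("places.csv", "w", newline="", encoding="utf-8")`, where the filename is just an example.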
How to run the code
You can execute the code using the 'Run' button at the top of this page and selecting 'Run on Binder'. You can make changes and save your own version of the notebook to Jovian by executing the following cells:
!pip install jovian --upgrade --quiet
import jovian

# Execute this to save new versions of the notebook
jovian.commit(project="web-scraping-final")
[jovian] Updating notebook "aakankshaat285/web-scraping-final" on https://jovian.ai
[jovian] Committed successfully! https://jovian.ai/aakankshaat285/web-scraping-final