Project 1 Rightmove With Functions E2449
How To Purchase Property, Using Web Scraping on Rightmove
Rightmove is the United Kingdom's largest online property portal. It currently lists tens of thousands of properties. Each listing includes the price, the number of bedrooms, the location, and the contact details of the estate agent who arranges viewings for prospective buyers.
The Rightmove website lets users search for properties by postcode or area, then narrow the results with filters such as price range, property type, search radius, and number of bedrooms.
The page https://www.rightmove.co.uk/property-for-sale/find.html?locationIdentifier=REGION%5E1152 lists the properties currently for sale in Rugby, a small market town in eastern Warwickshire. In this project, we'll retrieve information from multiple pages using web scraping: the process of extracting information from a website in an automated fashion using code. We'll use the Python libraries Requests and Beautiful Soup to scrape data from this page. We will then use other libraries such as pandas, matplotlib, and seaborn to analyse the data.
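As a rough illustration of how such filters could be expressed in a search URL, here is a minimal sketch. The filter names used here (minPrice, maxPrice, minBedrooms, radius) and the helper build_search_url are assumptions made for illustration and are not taken from this project; check the URL the website generates in your browser before relying on them.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://www.rightmove.co.uk/property-for-sale/find.html"


def build_search_url(location_identifier, **filters):
    """Return a search URL; filters whose value is None are left out."""
    params = {"locationIdentifier": location_identifier}
    params.update({k: v for k, v in filters.items() if v is not None})
    # urlencode percent-encodes special characters, so "^" becomes "%5E".
    return SEARCH_URL + "?" + urlencode(params)


# Filter names below are assumed, not confirmed by this notebook.
url = build_search_url(
    "REGION^1152",
    minPrice=200000,
    maxPrice=350000,
    minBedrooms=3,
    radius=1.0,
)
print(url)
```

Building the query string with urlencode avoids hand-encoding characters such as the caret in the region identifier.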
Here's an outline of the steps we'll follow:
- Download the webpage using Requests
- Parse the HTML source code using Beautiful Soup
- Extract property information and compile it into Python lists and dictionaries
- Save the extracted information to a CSV file
- Analyse the data using pandas, matplotlib, and seaborn
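The first four steps can be sketched end to end as follows. This is a minimal illustration rather than the project's actual code: the function names, the CSS class names ("property-card", "price", "address"), and the sample HTML are all hypothetical stand-ins, and the real page markup will differ.

```python
import csv

import requests
from bs4 import BeautifulSoup


def download_page(url):
    """Download a page and return its HTML text, raising on HTTP errors."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.text


def parse_properties(html):
    """Turn listing HTML into a list of dictionaries, one per property."""
    soup = BeautifulSoup(html, "html.parser")
    properties = []
    # "property-card", "price" and "address" are hypothetical class names.
    for card in soup.find_all("div", class_="property-card"):
        properties.append({
            "price": card.find("span", class_="price").get_text(strip=True),
            "address": card.find("span", class_="address").get_text(strip=True),
        })
    return properties


def save_to_csv(properties, filename):
    """Write the list of dictionaries to a CSV file with a header row."""
    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["price", "address"])
        writer.writeheader()
        writer.writerows(properties)


# Made-up stand-in HTML so the sketch runs without touching the network.
sample_html = """
<div class="property-card">
  <span class="price">£250,000</span>
  <span class="address">Example Road, Rugby</span>
</div>
<div class="property-card">
  <span class="price">£310,000</span>
  <span class="address">Sample Street, Rugby</span>
</div>
"""
rows = parse_properties(sample_html)
save_to_csv(rows, "sample_properties.csv")
```

In a real run, the HTML would come from download_page(url) instead of the sample string, and the selectors would need to match whatever markup the live page uses.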
By the end of the project, we'll have a CSV file in the following format:
How to Run the Code
You can execute the code by clicking the 'Run' button at the top of this page and selecting "Run the Binder". You can make changes and save your version of the notebook to Jovian (https://www.jovian.ai/) by executing the following cells:
The code defaults to 294 properties in the Rugby region. I also gathered the region codes for some surrounding areas, which can be found easily on the Rightmove website. Each code is the part of the search URL that follows "REGION%":
- Rugby : 5E1152
- Coventry : 5E368
- Milton Keynes : 5E940
- Leicester : 5E789
- Northampton : 5E1014
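One way to make switching regions less error-prone is to keep the codes in a dictionary and build the search URL from it. The sketch below assumes the URL pattern shown earlier in this notebook; the names REGION_CODES, URL_TEMPLATE, and base_url_for are hypothetical and not part of the original code.

```python
# Region codes as listed above (the text that follows "REGION%" in the URL).
REGION_CODES = {
    "Rugby": "5E1152",
    "Coventry": "5E368",
    "Milton Keynes": "5E940",
    "Leicester": "5E789",
    "Northampton": "5E1014",
}

URL_TEMPLATE = (
    "https://www.rightmove.co.uk/property-for-sale/find.html"
    "?locationIdentifier=REGION%{code}&index="
)


def base_url_for(region_name):
    """Return the base search URL for a named region (raises KeyError if unknown)."""
    return URL_TEMPLATE.format(code=REGION_CODES[region_name])


print(base_url_for("Coventry"))
```

The returned URL ends in "&index=", so a page offset can be appended to it directly.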
We gathered the data in four segments, using the primary variables below together with custom functions. The main advantage of this approach is that it saves time on repetitive tasks.
location = '5E1152'
base_url = 'https://www.rightmove.co.uk/property-for-sale/find.html?locationIdentifier=REGION%5E1152&index='
properties_per_page = 24
num_properties = 294
csv_filename = 'filename.csv'
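To show how these variables can drive a multi-page scrape, here is a small sketch that generates one URL per results page. It assumes that the index parameter is a zero-based listing offset that advances by properties_per_page on each page; page_urls is a hypothetical helper, not a function from the original notebook.

```python
base_url = (
    "https://www.rightmove.co.uk/property-for-sale/find.html"
    "?locationIdentifier=REGION%5E1152&index="
)
properties_per_page = 24
num_properties = 294


def page_urls(base_url, num_properties, properties_per_page):
    """Return one URL per results page, using the listing offset as the index."""
    return [
        base_url + str(offset)
        for offset in range(0, num_properties, properties_per_page)
    ]


urls = page_urls(base_url, num_properties, properties_per_page)
print(len(urls))  # 13 pages cover 294 listings at 24 per page
print(urls[-1])   # the last URL ends with index=288
```

With 294 properties and 24 per page, the offsets run 0, 24, 48, and so on up to 288, giving 13 pages, the last of which is only partly filled.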
# Requests and Beautiful Soup libraries used for web scraping
!pip install requests --upgrade --quiet
!pip install beautifulsoup4 --upgrade --quiet
from bs4 import BeautifulSoup
import requests
# pandas, datetime, matplotlib and seaborn libraries used for data analysis
import pandas as pd
import datetime
import os
# os.chdir(r'C:\Jovian\Python')  # update the directory path if you're running this on your local machine
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
from datetime import date