How To Purchase Property, Using Web Scraping on Rightmove

Rightmove is the United Kingdom's largest online property portal. It currently lists tens of thousands of properties. Each listing includes the price, the number of bedrooms, the location, and the contact details of the estate agent who arranges viewings for prospective buyers.

Rightmove lets users search for properties by postcode or area. Visitors can apply additional filters to match their requirements, such as a price range, a property type, a search radius, and the number of bedrooms.

The page https://www.rightmove.co.uk/property-for-sale/find.html?locationIdentifier=REGION%5E1152 lists all available properties in Rugby, a small market town in eastern Warwickshire. In this project, we'll retrieve information from multiple pages using web scraping: the process of extracting information from a website in an automated fashion using code. We'll use the Python libraries Requests and Beautiful Soup to scrape data from this page, then use pandas, matplotlib and seaborn to analyse it.

Here's an outline of the steps we'll follow:

  1. Download the webpage using Requests
  2. Parse the HTML source code using Beautiful Soup
  3. Extract property information and compile it into Python lists and dictionaries
  4. Save the extracted information to a CSV file
  5. Analyse the data using pandas, matplotlib and seaborn
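
Steps 2 to 4 of the outline can be sketched offline as follows. This is only an illustration: the HTML snippet, the class names (`property-card`, `price`, `address`, `bedrooms`), the listing values, and the helper functions are all made up for the example and are not Rightmove's real markup. Step 1 (the download with Requests) is left out so the sketch runs without network access.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical markup standing in for a downloaded results page.
# The class names and listing values are invented for illustration.
SAMPLE_HTML = """
<div class="property-card">
  <span class="price">£250,000</span>
  <span class="address">Example Road, Rugby</span>
  <span class="bedrooms">3</span>
</div>
<div class="property-card">
  <span class="price">£180,000</span>
  <span class="address">Sample Street, Rugby</span>
  <span class="bedrooms">2</span>
</div>
"""

FIELDS = ["price", "address", "bedrooms"]


def parse_properties(html):
    """Step 2 and 3: parse the HTML and collect one dictionary per listing."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.find_all("div", class_="property-card"):
        row = {}
        for field in FIELDS:
            row[field] = card.find("span", class_=field).get_text(strip=True)
        rows.append(row)
    return rows


def to_csv_text(rows):
    """Step 4: serialise the list of dictionaries as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = parse_properties(SAMPLE_HTML)
csv_text = to_csv_text(rows)
print(csv_text)
```

To write a file instead of a string, the same `csv.DictWriter` calls can be pointed at `open(csv_filename, "w", newline="")`.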

By the end of the project, we'll create a CSV file in the following format:

CSV

How to Run the Code

You can execute the code by clicking the 'Run' button at the top of this page and selecting "Run on Binder". You can make changes and save your version of the notebook to Jovian (https://www.jovian.ai/) by executing the following cells:

By default, the code scrapes 294 properties in the Rugby region. I also gathered region codes for some surrounding areas; each code is the part of the locationIdentifier that follows "REGION%" in the Rightmove search URL, so it is easy to find on the website.

  • Rugby : 5E1152
  • Coventry : 5E368
  • Milton Keynes : 5E940
  • Leicester : 5E789
  • Northampton : 5E1014
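
As a minimal sketch, the codes above can be kept in a dictionary and slotted into the same URL pattern as the Rugby search page shown earlier. The names `REGION_CODES`, `SEARCH_URL`, and `build_base_url` are hypothetical helpers introduced here for illustration; only the codes and the URL pattern come from this notebook.

```python
# Region codes copied from the list above.
REGION_CODES = {
    "Rugby": "5E1152",
    "Coventry": "5E368",
    "Milton Keynes": "5E940",
    "Leicester": "5E789",
    "Northampton": "5E1014",
}

SEARCH_URL = "https://www.rightmove.co.uk/property-for-sale/find.html"


def build_base_url(region_name):
    """Return the search URL prefix for a named region (hypothetical helper).

    The result ends with '&index=' so that a page offset can be appended.
    """
    code = REGION_CODES[region_name]
    return f"{SEARCH_URL}?locationIdentifier=REGION%{code}&index="


rugby_url = build_base_url("Rugby")
coventry_url = build_base_url("Coventry")
print(rugby_url)
print(coventry_url)
```

Looking up the code by name avoids editing the URL by hand when switching regions; an unknown name raises a `KeyError` rather than silently producing a bad URL.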

We gathered the data in the following four segments, using the primary variables below and custom functions. The main advantage of this approach is that it saves time on repetitive tasks.

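# Region code for Rugby; swap in another code from the list above to change region
location = '5E1152'
# Build the search URL from the region code so that changing `location` takes effect
base_url = 'https://www.rightmove.co.uk/property-for-sale/find.html?locationIdentifier=REGION%' + location + '&index='
properties_per_page = 24
num_properties = 294
csv_filename = 'filename.csv'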
# Requests and Beautiful Soup are used for web scraping

!pip install requests --upgrade --quiet
!pip install beautifulsoup4 --upgrade --quiet
from bs4 import BeautifulSoup
import requests

# pandas, datetime, matplotlib and seaborn are used for data analysis

import pandas as pd
import datetime
import os
# os.chdir(r'C:\Jovian\Python')  # update the directory path if you're running this on your local machine
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
from datetime import date
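
The configuration variables above suggest a simple pagination scheme, sketched below. It assumes that the `index` query parameter is the zero-based offset of the first property on a results page, which the notebook does not state explicitly; `page_urls` is a hypothetical helper, not part of the original code.

```python
# Values copied from the configuration cell above.
base_url = (
    "https://www.rightmove.co.uk/property-for-sale/find.html"
    "?locationIdentifier=REGION%5E1152&index="
)
properties_per_page = 24
num_properties = 294


def page_urls(base_url, num_properties, properties_per_page):
    """Return one URL per results page (hypothetical helper).

    Assumes `index` is the offset of the first listing on each page,
    so the offsets step through 0, 24, 48, ... below num_properties.
    """
    offsets = range(0, num_properties, properties_per_page)
    return [base_url + str(offset) for offset in offsets]


urls = page_urls(base_url, num_properties, properties_per_page)
print(len(urls))   # number of pages needed to cover 294 listings
print(urls[0])
print(urls[-1])
```

Under that assumption, 294 properties at 24 per page need 13 requests, with the last page starting at offset 288 and holding the remaining 6 listings.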
Pritesh Patel, 6 months ago