
Scraping Gadgets360 Review


What is Web Scraping: Introduction

Web scraping typically extracts large amounts of data from websites for a variety of uses, such as price monitoring, enriching machine learning models, financial data aggregation, monitoring consumer sentiment, and news tracking. Browsers display the data from a website, but manually copying data from multiple sources into a central place can be very tedious and time-consuming. Web scraping tools essentially automate this manual process.

Web scraping is used in various industries to collect data from websites and to help us take the necessary action.

For example:

  1. Competitor Price Monitoring
  2. Monitoring MAP Compliance
  3. Fetching Images and Product Descriptions
  4. Monitoring Consumer Sentiment
  5. Aggregated News Articles
  6. Market Data Aggregation
  7. Extracting Financial Statements
  8. Real-Time Analytics for Data Science, etc.


Project Outline:

  1. We are going to scrape Gadgets360 reviews to build a dataset.
  2. We will get the review title, review author, published date, category, and a link to the particular review.
  3. For each page, we will get 20 review descriptions.
  4. Finally, we will create and save a .csv file for future use.
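
The last step of the outline can be sketched with Python's built-in csv module. This is a minimal sketch, not the project's final code: the column names, the sample row, and the file name reviews.csv are assumptions chosen for illustration.

```python
import csv
import os
import tempfile

# Column names are assumptions that mirror the fields listed in the outline.
FIELDS = ["title", "author", "published_date", "category", "link"]


def save_reviews(rows, path):
    """Write a list of review dictionaries to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)


# Placeholder data, not a real review.
sample_rows = [
    {
        "title": "Example Phone Review",
        "author": "Example Author",
        "published_date": "1 January 2021",
        "category": "Mobiles",
        "link": "https://example.com/reviews/example-phone",
    }
]

# Written to the system temp directory so the sketch leaves no file behind in the project.
csv_path = os.path.join(tempfile.gettempdir(), "reviews.csv")
save_reviews(sample_rows, csv_path)
```

Opening the file with newline="" is what the csv module recommends, since it prevents blank lines between rows on Windows.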

First, we install and import all the libraries required for this project.

In this project we use the requests library to fetch data from websites and BeautifulSoup to parse the webpage and extract the useful HTML data as text for further processing.

Creating the environment, i.e. installing all required libraries and importing them into the program

# Install the required libraries (the leading "!" runs a shell command from a notebook cell)
!pip install pandas requests beautifulsoup4 --upgrade --quiet

import csv

import pandas as pd
import requests
from bs4 import BeautifulSoup
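
To illustrate how requests and BeautifulSoup divide the work, the sketch below parses a small hand-written HTML snippet instead of a live page. The markup, the class name review-title, and the helper names are assumptions made up for this example; the real page structure has to be checked in the browser's inspector.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url):
    """Download a page and return its HTML text (raises on HTTP errors)."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def extract_titles(html):
    """Return the text of every element carrying the assumed 'review-title' class."""
    soup = BeautifulSoup(html, "html.parser")
    return [tag.get_text(strip=True) for tag in soup.find_all(class_="review-title")]


# Hand-written stand-in for a downloaded page, so the example runs offline.
sample_html = """
<ul>
  <li><a class="review-title" href="/reviews/a"> First Review </a></li>
  <li><a class="review-title" href="/reviews/b">Second Review</a></li>
</ul>
"""

titles = extract_titles(sample_html)
print(titles)
```

In a real run, the string passed to extract_titles would come from fetch_html(url) rather than from sample_html.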

Our base webpage is the Gadgets360 review listing.
Whatever number we put at the end of the URL takes us to that review page. As mentioned earlier, the reviews are spread over many pages, and every page has 20 reviews, so we need to visit each page to collect the data.
For this reason, the program first asks for a page number as input, that is, how many pages we want to scrape.
For example, if we enter 5, it scrapes 5 review pages, from 1 to 5, so the total number of reviews we get is 20*5 = 100.
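
The arithmetic above can be written as a small helper. This is only a sketch: the function name and the constant name are illustrative, and the figure of 20 reviews per page is taken from the description above.

```python
REVIEWS_PER_PAGE = 20  # stated in the text above


def expected_reviews(pages):
    """Return how many reviews a scrape of pages 1..pages should yield."""
    if pages < 1:
        raise ValueError("the number of pages must be at least 1")
    return pages * REVIEWS_PER_PAGE


print(expected_reviews(5))  # 100
```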

In the program, see url = base_url + str(page).
str(page) is 1, 2, 3, ... depending on the user input, which gives one URL for the first page, another for the second page, one for the eighteenth page, and so on.
For the code below, I am taking only the first page to scrape.
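
The URL construction described above can be sketched as follows. The base URL here is a placeholder, since the actual address is not given in the text; only the pattern base_url + str(page) comes from the description.

```python
# Placeholder address: substitute the real base URL of the review listing.
base_url = "https://example.com/reviews/page-"


def build_page_urls(last_page):
    """Return the listing URLs for pages 1 through last_page."""
    return [base_url + str(page) for page in range(1, last_page + 1)]


urls = build_page_urls(3)
for url in urls:
    print(url)
```

Calling build_page_urls(1) gives the single first-page URL, which matches the one-page case used here.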

Dipjyoti Ghosh, 6 months ago