Amazon Best Seller Web Scraping


Amazon is an American multinational technology company specializing in e-commerce, cloud computing, and artificial intelligence. Its platform is among the leaders in the industry and offers a wide variety of items for purchase.

Amazon lists its best sellers in alphabetical order on the Amazon Best Sellers page. The page provides a list of item categories grouped by department (about 40 in total). In this project, we are going to retrieve Amazon best-seller items across a variety of categories using web scraping. To do that, we will use the Python libraries requests and BeautifulSoup to fetch, parse, and extract the information we need from the web page.


Here is an outline of the steps we will follow:

  • Install and import libraries
  • Download and parse the Best Sellers page HTML using requests and BeautifulSoup to get the URL of each item category (topic)
  • Repeat step 2 for each topic, using its corresponding URL
  • Extract information from each page
  • Combine the information extracted from each page in Python dictionaries
  • Save the data to a CSV file using the pandas library
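
The download-and-parse steps of the outline could be sketched roughly as follows. This is a minimal sketch, not the notebook's actual code: the Best Sellers URL, the User-Agent header, the function names `fetch_html` and `parse_topics`, the sample markup, and the rule of keeping only links whose href contains "zgbs" are all assumptions, and Amazon's real markup may need different selectors.

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

# Assumed entry point and headers; adjust to the page you actually scrape.
BESTSELLER_URL = "https://www.amazon.com/Best-Sellers/zgbs"
HEADERS = {"User-Agent": "Mozilla/5.0"}


def fetch_html(url):
    """Download a page and return its HTML text, raising on HTTP errors."""
    response = requests.get(url, headers=HEADERS, timeout=30)
    response.raise_for_status()
    return response.text


def parse_topics(html, base_url=BESTSELLER_URL):
    """Return one {"Topic", "Topic_url"} dictionary per category link found."""
    soup = BeautifulSoup(html, "html.parser")
    topics = []
    for link in soup.find_all("a", href=True):
        title = link.get_text(strip=True)
        # Hypothetical filter: keep links that look like best-seller categories.
        if title and "zgbs" in link["href"]:
            topics.append({
                "Topic": title,
                "Topic_url": urljoin(base_url, link["href"]),
            })
    return topics


# Hypothetical markup standing in for the real page, so no request is made here.
SAMPLE_HTML = """
<ul>
  <li><a href="/Best-Sellers-Books/zgbs/books">Books</a></li>
  <li><a href="/gp/help">Help</a></li>
</ul>
"""
sample_topics = parse_topics(SAMPLE_HTML)
print(sample_topics)
```

With network access, `parse_topics(fetch_html(BESTSELLER_URL))` would be the live equivalent, and calling `fetch_html` on each `Topic_url` corresponds to repeating step 2 for every topic.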

By the end of the project, we’ll create a CSV file in the following format:

Topic,Topic_url,Item_description,Rating out of 5,Minimum_price,Maximum_price,Review,Item Url
Amazon Devices & Accessories,,Fire TV Stick 4K streaming device with Alexa Voice Remote | Dolby Vision | 2018 release,4.7,39.9,0.0,615699,",200_.jpg"
Amazon Devices & Accessories,,Fire TV Stick (3rd Gen) with Alexa Voice Remote (includes TV controls) | HD streaming device | 2021 release,4.7,39.9,0.0,1844,",200_.jpg"
Amazon Devices & Accessories,,"Amazon Smart Plug, works with Alexa – A Certified for Humans Device",4.7,24.9,0.0,425090,",200_.jpg"
Amazon Devices & Accessories,,Fire TV Stick Lite with Alexa Voice Remote Lite (no TV controls) | HD streaming device | 2020 release,4.7,29.9,0.0,151007,",200_.jpg"
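
One way to produce a file in this format is to collect the scraped values in a dictionary of lists and hand it to pandas. The sketch below assumes that layout; the helper names `new_table`, `add_item`, and `to_csv_text` are hypothetical, and the single row reuses values from the sample above, with the URL columns left empty.

```python
import pandas as pd

# Column names copied from the CSV header shown above.
COLUMNS = [
    "Topic", "Topic_url", "Item_description", "Rating out of 5",
    "Minimum_price", "Maximum_price", "Review", "Item Url",
]


def new_table():
    """Create an empty dictionary of lists, one list per CSV column."""
    return {column: [] for column in COLUMNS}


def add_item(table, item):
    """Append one item; columns missing from `item` are stored as empty strings."""
    for column in COLUMNS:
        table[column].append(item.get(column, ""))


def to_csv_text(table):
    """Convert the dictionary to a DataFrame and return it as CSV text."""
    return pd.DataFrame(table, columns=COLUMNS).to_csv(index=False)


table = new_table()
add_item(table, {
    "Topic": "Amazon Devices & Accessories",
    "Item_description": "Amazon Smart Plug, works with Alexa – A Certified for Humans Device",
    "Rating out of 5": 4.7,
    "Minimum_price": 24.9,
    "Maximum_price": 0.0,
    "Review": 425090,
})
csv_text = to_csv_text(table)
print(csv_text)
```

Passing a file path to `to_csv` instead, for example `to_csv("bestsellers.csv", index=False)` with a file name of your choice, writes the result to disk rather than returning a string.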

How to Run the Code

You can execute the code by clicking the “Run” button at the top of the page and selecting “Run on Binder”. You can make changes and save your version of the notebook to Jovian by executing the following cells.

Note: Each department on the Best Sellers page has 40 item categories, and each category lists its best 100 items over 2 pages (50 items per page). Because of CAPTCHA problems, a few pages could not be accessed.
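
The two-pages-per-category layout and the CAPTCHA failures could be handled along these lines. This is only a sketch under stated assumptions: the `pg` query parameter used to select a page, the keyword test in `looks_like_captcha`, and all function names are hypothetical, and `fetch` stands for any function that takes a URL and returns HTML text.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

PAGES_PER_CATEGORY = 2  # 100 items at 50 items per page, as noted above


def page_urls(topic_url, pages=PAGES_PER_CATEGORY):
    """Build one URL per result page by setting an assumed `pg` parameter."""
    parts = urlsplit(topic_url)
    urls = []
    for page in range(1, pages + 1):
        query = dict(parse_qsl(parts.query))
        query["pg"] = str(page)  # assumption: page number is passed as ?pg=N
        urls.append(urlunsplit(parts._replace(query=urlencode(query))))
    return urls


def looks_like_captcha(html):
    """Heuristic only: treat any page mentioning 'captcha' as blocked."""
    return "captcha" in html.lower()


def collect_pages(topic_url, fetch):
    """Fetch every page of a category, skipping the ones blocked by a CAPTCHA.

    Returns (pages, skipped): the HTML of usable pages and the URLs skipped.
    """
    pages, skipped = [], []
    for url in page_urls(topic_url):
        html = fetch(url)
        if looks_like_captcha(html):
            skipped.append(url)
        else:
            pages.append(html)
    return pages, skipped
```

Keeping the skipped URLs makes it possible to report which pages are missing from the final CSV, or to retry them later.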