Zerotoanalyst Project1
CITYPOPULATION WEB SCRAPER
By: Rohan Dawar
In this project, I will scrape the website citypopulation.de with Beautiful Soup to create a CSV file of populations for sub-national entities.
###What is scraping?
- Web scraping is the process of extracting content and data from a website through its HTML code.
###What is https://www.citypopulation.de/ ?
- This website provides up-to-date population and area data for all countries of the world, including territories and subdivisions.
###What is Beautiful Soup?
- Beautiful Soup (AKA BS4) is a Python library for pulling data out of HTML and XML pages.
- In this project, I will use BS4 to find the country pages within each continent and to parse the population data for each country's subdivisions.
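As a rough illustration of those two BS4 tasks, here is a minimal sketch that runs on inline HTML instead of the live site. The markup, the `id="ts"` table, the column layout, and the place names are all made up for the example; the real pages on citypopulation.de may be structured differently.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

base_url = 'https://www.citypopulation.de'

# Hypothetical stand-in for a continent page (assumed markup, not the real site).
continent_html = """
<ul>
  <li><a href="/en/exampleland/">Exampleland</a></li>
  <li><a href="/en/samplestan/">Samplestan</a></li>
</ul>
"""

# Hypothetical stand-in for a country page with a subdivision table.
country_html = """
<table id="ts">
  <tbody>
    <tr><td>Alpha</td><td>Province</td><td>1,234,567</td></tr>
    <tr><td>Beta</td><td>Territory</td><td>89,012</td></tr>
  </tbody>
</table>
"""

# Task 1: collect country page links from the continent page.
continent_soup = BeautifulSoup(continent_html, 'html.parser')
country_links = {
    a.get_text(strip=True): urljoin(base_url, a['href'])
    for a in continent_soup.find_all('a', href=True)
}

# Task 2: parse subdivision names and populations from the country page.
country_soup = BeautifulSoup(country_html, 'html.parser')
subdivisions = []
for tr in country_soup.select('table#ts tbody tr'):
    name, status, population = [td.get_text(strip=True) for td in tr.find_all('td')]
    subdivisions.append({
        'name': name,
        'status': status,
        # Strip thousands separators before converting to an integer.
        'population': int(population.replace(',', '')),
    })
```

With real pages, the selectors would need to be adjusted after inspecting the actual HTML in the browser's developer tools.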
###What are sub-national entities?
- Sub-national entities are any administrative or census division within a country such as provinces, states, territories, municipalities, etc.
- citypopulation.de tries to keep up-to-date population data for all national and sub-national entities on Earth.
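Since the goal of the project is a CSV file of populations for these entities, the output step might look roughly like the sketch below. The column names and sample rows are assumptions made for illustration, not real data from the site, and the text is written to an in-memory buffer so the example runs without touching the disk.

```python
import csv
import io

# Hypothetical rows; a real run would fill these from the scraped pages.
rows = [
    {'country': 'Exampleland', 'entity': 'Alpha', 'status': 'Province', 'population': 1234567},
    {'country': 'Exampleland', 'entity': 'Beta', 'status': 'Territory', 'population': 89012},
]

def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO(newline='')
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = rows_to_csv(rows, ['country', 'entity', 'status', 'population'])
```

To write a file instead, the same `csv.DictWriter` calls can target a handle from `open('populations.csv', 'w', newline='')`, where the file name is again only an example.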
###Outline:
base_url = 'https://www.citypopulation.de'
import requests
from bs4 import BeautifulSoup
# List of continents we want to parse:
continents = ['Africa', 'Asia', 'Europe', 'America', 'Oceania']
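One way to turn `base_url` and `continents` into page addresses is sketched below. The `/<Continent>.html` pattern is an assumption about how the site lays out its continent pages, and `continent_url` is a hypothetical helper name, so both should be checked against the live site before relying on them.

```python
base_url = 'https://www.citypopulation.de'
continents = ['Africa', 'Asia', 'Europe', 'America', 'Oceania']

def continent_url(continent):
    """Build the address of a continent page (assumed URL pattern)."""
    return f'{base_url}/{continent}.html'

# Map each continent name to the page we would request for it.
continent_urls = {continent: continent_url(continent) for continent in continents}

for name, url in continent_urls.items():
    print(f'{name}: {url}')
```

Each URL could then be fetched with `requests.get(url)` and the response text passed to `BeautifulSoup(response.text, 'html.parser')` for parsing.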