
Workproject Data Analysis

Exploratory data analysis of Price Index Variation for basic goods



This tutorial aims to teach the main tools of the Python language and its specific libraries for data analysis.

The data comes from the databases of Colombia's national statistics department and was obtained from the following website and link. This is the original CSV file.

The family basket is the name given to the foods that, according to the government, make up the basic food goods for any family, including the essential goods needed to fight COVID-19. The data is divided into two variations: implicit and explicit prices for the basic goods in the family basket.

Let's check the price behavior of the most traded goods, by city, during the pandemic peak.

The selected data to work on:

  • Product or basic good: Good

  • Product name given: Product_Name

  • Product manufacturing company: Manufacturer

  • City: Town

  • Daily price index for March
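
As a rough illustration of the selection above, the sketch below keeps only the listed columns from a DataFrame. The column names Good, Product_Name, Manufacturer and Town come from the list; the toy values, the date-style names of the daily index columns and the Unused_Column field are assumptions made for the example, not the layout of the real CSV.

```python
import pandas as pd

# Toy rows standing in for the real CSV; all values are made up for illustration.
raw = pd.DataFrame({
    "Good": ["Rice", "Eggs"],
    "Product_Name": ["White rice 500 g", "Red eggs x30"],
    "Manufacturer": ["Brand A", "Brand B"],
    "Town": ["Bogota", "Medellin"],
    "2020-03-01": [100.0, 100.0],  # hypothetical daily index columns
    "2020-03-02": [101.5, 99.0],
    "Unused_Column": ["x", "y"],   # hypothetical column we do not need
})

# Identifier columns named in the tutorial.
id_columns = ["Good", "Product_Name", "Manufacturer", "Town"]

# Assumption: daily index columns are named by date and start with "2020-03".
day_columns = [c for c in raw.columns if c.startswith("2020-03")]

# Keep only the columns selected for the analysis.
selected = raw[id_columns + day_columns]
print(selected.head())
```

With the real file, the toy DataFrame would be replaced by a call to pd.read_csv, and the rule for picking the daily columns would need to match the actual header names.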

The essential steps for the analysis will be done as follows:

  1. Data cleaning
  2. Modification of terms and words
  3. Data creation
  4. Graphics
  • 4.5. Graphic interpretation
  5. References and future work
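
To give a feel for steps 1 to 3, here is a minimal sketch on toy data. The Good and Town columns match the names used in this tutorial; the day_1 and day_2 columns, the mean_index column and every value are hypothetical, and the real notebook may clean and derive its data differently.

```python
import pandas as pd

# Toy data with the kinds of problems the steps address (values are made up).
df = pd.DataFrame({
    "Good": [" rice", "Rice ", "EGGS", None],
    "Town": ["bogota", "Bogota", "medellin", "cali"],
    "day_1": [100.0, 102.0, 98.0, 101.0],  # hypothetical daily index columns
    "day_2": [101.0, None, 99.0, 100.0],
})

# 1. Data cleaning: drop rows that have no good name.
df = df.dropna(subset=["Good"]).copy()

# 2. Modification of terms and words: normalise spacing and capitalisation.
df["Good"] = df["Good"].str.strip().str.title()
df["Town"] = df["Town"].str.strip().str.title()

# 3. Data creation: mean index per row (missing days are skipped by default).
df["mean_index"] = df[["day_1", "day_2"]].mean(axis=1)

print(df)
```

Normalising the text before grouping matters because " rice" and "Rice " would otherwise be counted as two different goods.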

At the end, we will draw an economic conclusion from the results of the study.

The first thing we should do is import the necessary libraries that let us work with the data:

  • Pandas offers data structures and operations for manipulating numerical tables and time series.
  • NumPy supports large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
  • Seaborn, Plotly and Matplotlib are the fundamental libraries used to display the graphic information from the data.
  • Folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js library. Manipulate your data in Python, then visualize it on a Leaflet map.
import jovian
import numpy as np
from wordcloud import WordCloud
import pandas as pd
import seaborn as sns
import plotly.express as px
import matplotlib
import matplotlib.pyplot as plt
import folium 
import random
%matplotlib inline