
# Overview

We have been given technology employment survey data for the years 2018, 2019, and 2020. We are to perform `Data Wrangling` and `Data Analysis` with the aid of `pandas`, `matplotlib` and `plotly`.

```
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Create an 8x6 inch figure at 80 dpi (640x480 px) with a white background
plt.figure(num=None, figsize=(8, 6), dpi=80, facecolor='w', edgecolor='k')
```

`<Figure size 640x480 with 0 Axes>`

### Task 1: Data Loading and Data Aggregation

- Load the 3 data files into the variables data_18, data_19, data_20.

```
data_18 = pd.read_csv("https://raw.githubusercontent.com/dphi-official/Datasets/master/IT_Salary_Survey_EU_18-20/Survey_2018.csv")
data_19 = pd.read_csv("https://raw.githubusercontent.com/dphi-official/Datasets/master/IT_Salary_Survey_EU_18-20/Survey_2019.csv")
data_20 = pd.read_csv("https://raw.githubusercontent.com/dphi-official/Datasets/master/IT_Salary_Survey_EU_18-20/Survey_2020.csv")
```
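Since the three URLs differ only in the year, the loading step could also be written as a loop. The following is a minimal sketch, not part of the original task: `URL_TEMPLATE`, `load_surveys`, and the `reader` parameter are hypothetical names introduced here for illustration.

```python
import pandas as pd

# URL pattern taken from the three read_csv calls above; only the year varies.
URL_TEMPLATE = (
    "https://raw.githubusercontent.com/dphi-official/Datasets/master/"
    "IT_Salary_Survey_EU_18-20/Survey_{year}.csv"
)


def load_surveys(years, reader=pd.read_csv):
    """Return {year: DataFrame}.

    `load_surveys` and `reader` are hypothetical names; `reader` is
    injectable so the loop can be exercised without network access.
    """
    return {year: reader(URL_TEMPLATE.format(year=year)) for year in years}


# In the notebook (needs network access):
# surveys = load_surveys([2018, 2019, 2020])
# data_18, data_19, data_20 = surveys[2018], surveys[2019], surveys[2020]
```

Keeping the frames in a dictionary keyed by year makes it easier to run the same check on all three surveys later, while the unpacking line still provides the `data_18`, `data_19`, `data_20` variables the task asks for.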

### Task 2: Data Analysis

- Display the first 5 rows of the 2018 survey data
- Display a concise summary of the 2020 data using the `info()` method, and list 3 observations/inferences drawn from the result.
- Display the descriptive statistics of the 2018 survey data
- Display the number of missing values in each column of the 2018 survey data

- How many people responded to the survey in each of the 3 years? Has the number increased or decreased over the years?
- Display all the unique values and their frequency in the column “Number of vacation days” of the 2020 data. Write down your observations (at least one) for this result.
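The steps in this task map onto a handful of standard `pandas` calls. The sketch below runs them on a tiny made-up frame so that it works offline. `toy_20` and its `Age` column are hypothetical stand-ins, not real survey rows, and the helper function names are assumptions introduced for illustration. It also assumes that each row corresponds to one respondent.

```python
import pandas as pd

# Hypothetical stand-in rows so the sketch runs offline. Only the column name
# "Number of vacation days" comes from the task text; the values are made up.
toy_20 = pd.DataFrame({
    "Age": [26.0, 31.0, None, 40.0],
    "Number of vacation days": ["30", "28", "30", None],
})


def missing_per_column(df):
    """Count missing values in each column."""
    return df.isnull().sum()


def response_counts(frames_by_year):
    """Map each year to its row count (assumes one row per respondent)."""
    return {year: len(df) for year, df in frames_by_year.items()}


def value_frequencies(df, column):
    """Unique values and their frequency, keeping missing values visible."""
    return df[column].value_counts(dropna=False)


print(toy_20.head())       # first 5 rows (the toy frame has only 4)
toy_20.info()              # concise summary: dtypes and non-null counts
print(toy_20.describe())   # descriptive statistics for numeric columns
print(missing_per_column(toy_20))
print(response_counts({2020: toy_20}))
print(value_frequencies(toy_20, "Number of vacation days"))
```

In the notebook, the same calls would be applied to the real frames, for example `data_18.head()`, `data_20.info()`, `data_18.describe()`, `missing_per_column(data_18)`, and `response_counts({2018: data_18, 2019: data_19, 2020: data_20})`. Passing `dropna=False` to `value_counts` is a deliberate choice here so that missing answers show up in the frequency table instead of being silently dropped.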