
Swedish Auto Insurance Dataset

1. Introduction

The Swedish Auto Insurance Dataset involves predicting the total payment for all claims, in thousands of Swedish kronor, given the total number of claims.
It is a regression problem with 63 observations, one input variable, and one output variable. The variable names are as follows:

  1. Number of claims.
  2. Total payment for all claims in thousands of Swedish Kronor.
## Loading the dataset from the GitHub repo

import warnings
warnings.filterwarnings("ignore")
import pandas as pd

url='https://raw.githubusercontent.com/hargurjeet/MachineLearning/Swedish-Auto-Insurance-Dataset/insurance.csv'

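# The separator 'delimiter' is not expected to occur in the file, so every line is read as one string column
df_raw = pd.read_csv(url, sep='delimiter', header=None, engine='python')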
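
The `sep='delimiter'` argument above can look odd at first. The following is a minimal sketch of what it appears to do, run on a small in-memory text rather than the real file. The contents of `raw_text` and the name `one_col` are made up for illustration and are not taken from the dataset.

```python
import io
import pandas as pd

# Hypothetical file contents: four non-data lines followed by two data rows.
raw_text = (
    "Auto insurance sample\n"
    "X = number of claims\n"
    "Y = total payment\n"
    "X,Y\n"
    "10,50.5\n"
    "20,99.0\n"
)

# The string 'delimiter' never appears in the text, so no line is split
# and each line ends up as a single string in column 0.
one_col = pd.read_csv(io.StringIO(raw_text), sep="delimiter",
                      header=None, engine="python")

print(one_col.shape)
print(one_col.iloc[4, 0])
```

Each row still holds its commas as plain text, which is why a later `str.split(',')` step is needed to separate the two values.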
## Dropping the initial junk rows, splitting the single column into two named columns, and resetting the index
df = df_raw.drop([0, 1, 2, 3], axis=0).reset_index(drop=True).rename(columns={0:'No_Of_Claims'})
df = df.No_Of_Claims.str.split(',',expand=True).rename(columns={0:'No_Of_Claims', 1:'Total_Payment'})
df.head()
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 63 entries, 0 to 62
Data columns (total 2 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   No_Of_Claims   63 non-null     object
 1   Total_Payment  63 non-null     object
dtypes: object(2)
memory usage: 1.1+ KB
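
The `info()` output reports both columns as `object`, meaning the values are still strings after the split. A regression model needs numbers, so a conversion step is a likely next move. The sketch below shows one possible way to do it with `pd.to_numeric`, using a small made-up frame (`sample`, with invented values) in place of `df`. It is an assumption about how the conversion could be done, not necessarily the approach used in this notebook.

```python
import pandas as pd

# Hypothetical stand-in for df: two string (object) columns with invented values.
sample = pd.DataFrame({
    "No_Of_Claims": ["10", "20", "30"],
    "Total_Payment": ["50.5", "99.0", "150.25"],
})

# pd.to_numeric parses strings into numbers; errors="coerce" replaces
# anything that cannot be parsed with NaN instead of raising an error.
converted = sample.apply(pd.to_numeric, errors="coerce")

print(converted.dtypes)
```

Applied to the real frame, the same call would be `df = df.apply(pd.to_numeric, errors="coerce")`. Checking `df.isna().sum()` afterwards would reveal whether any value failed to parse.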