project_name = "german-credit-risk"

German Credit Data Analysis

Loans form an integral part of banking operations. However, not all loans are repaid on time, so it is important for a bank to closely monitor its loan applications. This project is an analysis of the German credit data. The dataset contains details of 1000 loan applicants, each described by 20 attributes and a classification of whether the applicant is considered a Good or a Bad credit risk.

In this project, the relationship between credit risk and the various attributes will be explored through basic statistical techniques and presented through visualizations.

Contents

  1. Import data
  2. Data preparation, cleaning
  3. Exploratory data analysis
  4. Feature engineering
  5. Models
  6. Summary
  7. References, future work

1. Import data

Let's begin by downloading the data from the UCI Machine Learning repository.

from urllib.request import urlretrieve
urlretrieve('http://archive.ics.uci.edu/ml/machine-learning-databases/statlog/german/german.data', 'german.data')
('german.data', <http.client.HTTPMessage at 0x7f54b6f9f5d0>)

The dataset has been downloaded and saved locally as german.data.
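As a quick sanity check on the download, the sketch below counts the records and fields in the file using only the standard library. It assumes the file is whitespace-separated with one applicant per line and the class label in the last field (coded 1 for Good and 2 for Bad in the dataset documentation). The helper name `summarize_records` is hypothetical and not part of the original notebook.

```python
import os
from collections import Counter


def summarize_records(lines):
    """Return (row count, set of field counts per row, class label counts).

    Assumes whitespace-separated fields with the class label last.
    """
    rows = [line.split() for line in lines if line.strip()]  # skip blank lines
    widths = {len(row) for row in rows}
    labels = Counter(row[-1] for row in rows)
    return len(rows), widths, labels


# Run the check only if the file from the download step is present.
if os.path.exists('german.data'):
    with open('german.data') as f:
        n_rows, widths, labels = summarize_records(f)
    print(n_rows, widths, labels)
```

If the download is intact, the row count should match the 1000 applicants described above, and every row should have 21 fields (20 attributes plus the class label).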