Forecasting Housing Prices using Machine Learning

Open in Google Colab and click the "Run" button to execute the code.

Business Problem Statement

Predicting house prices based on property type, location, and other attributes.

Evaluation criteria and loss functions
RMSE - Root Mean Square Error

\[ RMSE = \sqrt{\frac{\sum_{i=1}^{N}(x_i - \hat{x}_i)^2}{N}} \]

where

i = index of a data point

N = number of non-missing data points

\( x_i \) = actual observed value

\( \hat{x}_i \) = estimated (predicted) value
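As a minimal sketch of the formula above, RMSE can be computed with NumPy as follows. The function name `rmse` and the choice to skip pairs containing a missing value (to match the "non-missing data points" definition of N) are assumptions made for illustration, not code taken from the notebook.

```python
import numpy as np


def rmse(actual, predicted):
    """Root mean square error over the non-missing (actual, predicted) pairs."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    if actual.shape != predicted.shape:
        raise ValueError("actual and predicted must have the same shape")

    # Keep only pairs where both values are present, so N counts non-missing points.
    mask = ~(np.isnan(actual) | np.isnan(predicted))
    if not mask.any():
        raise ValueError("no non-missing data points to evaluate")

    errors = actual[mask] - predicted[mask]
    return float(np.sqrt(np.mean(errors ** 2)))


# Hypothetical prices, for illustration only.
example_actual = [300_000, 450_000, 520_000]
example_predicted = [310_000, 440_000, 500_000]
print(rmse(example_actual, example_predicted))
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones, and the result is expressed in the same units as the house price.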

In this notebook we will explore supervised machine learning methods. Regression models such as linear regression and decision trees, and ensemble models such as RandomForest and XGBoost, will be trained to predict house prices using Scikit-learn and XGBoost. We will use Pandas, NumPy, Matplotlib, Seaborn and Plotly to perform exploratory data analysis and gather insights for machine learning. We will do the following:

  • Install and Import libraries
  • Explore the dataset
  • Translate the business problem to a machine learning problem
  • Data Cleaning
  • EDA - exploratory data analysis
  • Feature Engineering
  • Data preparation - Train, Val & Test Split, Encoding and Scaling
  • Select input & output features
  • Define baseline model
  • Define evaluation metrics
  • Select best model without hyperparameter tuning
  • Hyperparameter tuning for select models
  • Make predictions
  • Save the best model
  • Summarise insights and learning
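The modelling steps in the list above (split, encode and scale, baseline, compare models by RMSE) can be sketched roughly as below. This is only an illustration on synthetic data: the column names `area_sqm` and `location`, the price formula, and the 60/20/20 split are all assumptions, not the notebook's actual dataset or settings.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in for a housing dataset (hypothetical columns and values).
rng = np.random.default_rng(42)
n = 600
df = pd.DataFrame({
    "area_sqm": rng.uniform(40, 250, n),
    "location": rng.choice(["north", "south", "east"], n),
})
location_premium = df["location"].map({"north": 50_000, "south": 20_000, "east": 0})
df["price"] = 1_500 * df["area_sqm"] + location_premium + rng.normal(0, 10_000, n)

# Select input and output features.
X, y = df[["area_sqm", "location"]], df["price"]

# Train / validation / test split (60 / 20 / 20).
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=42)


def make_preprocessor():
    """Scale numeric columns and one-hot encode categorical columns."""
    return ColumnTransformer([
        ("num", StandardScaler(), ["area_sqm"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["location"]),
    ])


# A mean-predicting baseline plus two candidate models, all with default settings.
models = {
    "baseline_mean": DummyRegressor(strategy="mean"),
    "linear_regression": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=42),
}

# Fit each model on the training set and score it on the validation set.
val_rmse = {}
for name, model in models.items():
    pipeline = make_pipeline(make_preprocessor(), model)
    pipeline.fit(X_train, y_train)
    preds = pipeline.predict(X_val)
    val_rmse[name] = float(np.sqrt(mean_squared_error(y_val, preds)))

for name, score in sorted(val_rmse.items(), key=lambda kv: kv[1]):
    print(f"{name}: validation RMSE = {score:,.0f}")
```

Fitting the preprocessing inside the pipeline means the scaler and encoder learn only from the training split, and the held-out test set stays untouched until a final model has been chosen.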

Installing and Importing Libraries