How to Approach Machine Learning Projects
This tutorial is a part of the Zero to Data Science Bootcamp by Jovian.
In this tutorial, we'll explore a step-by-step process for approaching ML problems:
- Understand the business requirements and the nature of the available data.
- Classify the problem as supervised/unsupervised and regression/classification.
- Download, clean & explore the data and create new features that may improve models.
- Create training/test/validation sets and prepare the data for training ML models.
- Create a quick & easy baseline model to evaluate and benchmark future models.
- Pick a modeling strategy, train a model, and tune hyperparameters to achieve optimal fit.
- Experiment and combine results from multiple strategies to get a better result.
- Interpret models, study individual predictions, and present your findings.
How to run the code
This tutorial is an executable Jupyter notebook hosted on Jovian. You can run this tutorial and experiment with the code examples in a couple of ways: using free online resources (recommended) or on your computer.
Option 1: Running using free online resources (1-click, recommended)
The easiest way to start executing the code is to click the Run button at the top of this page and select Run on Binder. You can also select "Run on Colab" or "Run on Kaggle", but you'll need to create an account on Google Colab or Kaggle to use these platforms.
Option 2: Running on your computer locally
To run the code on your computer locally, you'll need to set up Python, download the notebook and install the required libraries. We recommend using the Conda distribution of Python. Click the Run button at the top of this page, select the Run Locally option, and follow the instructions.
Let's install and import the required libraries.
!pip install numpy pandas matplotlib plotly seaborn --quiet