FEATURE ENGINEERING TECHNIQUES
Feature engineering is the process of creating new features from raw data to increase the predictive power of the learning algorithm. Engineered features should capture additional information that is not easily apparent in the original feature set.
In this notebook, I will try to cover most of the common techniques for feature engineering.
We will learn about :
Before diving into feature engineering, let's look at the life cycle of a data science project and see where exactly feature engineering is performed.
Life cycle of a data science project
The life cycle of a data science project comprises several phases, each of which is important for solving the problem in the domain of interest. The phases are as follows :
Defining the problem statement
Data collection strategy :
This phase deals with collecting data using various methods, tools, techniques and sources, including web APIs, web scraping, company databases, surveys, etc.
Feature engineering :
Handling missing values, feature normalization, feature scaling, new feature generation, handling imbalanced data, etc.
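To make a few of these steps concrete, here is a minimal sketch of median imputation, min-max scaling and new feature generation. It uses only the Python standard library so it runs anywhere. The toy records, the column names and the helper functions `impute_median` and `min_max_scale` are all hypothetical and chosen just for illustration.

```python
from statistics import median

# Hypothetical toy records; None marks a missing value.
rows = [
    {"age": 25, "income": 50000, "household_size": 2},
    {"age": None, "income": 64000, "household_size": 4},
    {"age": 40, "income": None, "household_size": 1},
    {"age": 35, "income": 80000, "household_size": 4},
]


def impute_median(rows, column):
    """Replace missing values in `column` with the median of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = median(observed)
    for r in rows:
        if r[column] is None:
            r[column] = fill


def min_max_scale(rows, column):
    """Add `<column>_scaled`, the column rescaled linearly to the range [0, 1]."""
    values = [r[column] for r in rows]
    lo, hi = min(values), max(values)
    for r in rows:
        # Guard against a constant column, where hi == lo.
        r[column + "_scaled"] = (r[column] - lo) / (hi - lo) if hi > lo else 0.0


# Handle missing values first, then scale the completed columns.
for col in ("age", "income"):
    impute_median(rows, col)
    min_max_scale(rows, col)

# New feature generation: income per household member.
for r in rows:
    r["income_per_member"] = r["income"] / r["household_size"]

for r in rows:
    print(r)
```

Imputation is done before scaling here so that the minimum and maximum are computed on complete columns. In a real project the fill value and the scaling range would normally be computed on the training data only and then reused for new data.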
Feature selection :
Select only those features that are highly correlated with the target feature. When two independent features are highly correlated with each other, drop one of them, since it adds little new information.
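One possible way to sketch this correlation-based selection is shown below, using Pearson correlation written out in plain Python. The function `select_features`, the thresholds `min_target_corr` and `max_pair_corr`, and the toy feature names are assumptions made for this example, not a fixed recipe.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def select_features(features, target, min_target_corr=0.5, max_pair_corr=0.9):
    """Keep features correlated with the target, dropping near-duplicates.

    Both thresholds are illustrative defaults, not recommended values.
    """
    relevance = {name: abs(pearson(vals, target)) for name, vals in features.items()}
    # Visit the most relevant candidates first, so that of two redundant
    # features the one more correlated with the target is kept.
    candidates = sorted(
        (name for name in features if relevance[name] >= min_target_corr),
        key=lambda name: relevance[name],
        reverse=True,
    )
    selected = []
    for name in candidates:
        redundant = any(
            abs(pearson(features[name], features[kept])) >= max_pair_corr
            for kept in selected
        )
        if not redundant:
            selected.append(name)
    return selected


# Hypothetical toy data.
target = [1, 2, 3, 4, 5, 6]
features = {
    "area_sqft": [2, 4, 6, 8, 10, 12],    # strongly related to the target
    "area_approx": [2, 4, 6, 8, 10, 13],  # almost a copy of area_sqft
    "random_id": [5, 1, 4, 2, 6, 3],      # barely related to the target
    "rooms": [1, 3, 2, 5, 4, 6],          # related, but not a duplicate
}

selected = select_features(features, target)
print(selected)
```

With this toy data, `random_id` is dropped for low correlation with the target and `area_approx` is dropped because it nearly duplicates `area_sqft`. Pearson correlation only captures linear relationships, so this is one heuristic among several.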
Exploratory data analysis
Here we try to understand the data and the relationships among the various features. We try to infer as much information as possible about our topic of interest.
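A very small first step in this direction is a per-column summary. The sketch below is written in plain Python. The toy dataset and the helper `summarize` are hypothetical and only meant to show the kind of numbers one looks at first.

```python
from statistics import mean, median

# Hypothetical toy dataset; None marks a missing value.
data = {
    "age": [25, 32, None, 41, 29],
    "salary": [40000, 52000, 61000, None, 48000],
}


def summarize(values):
    """Basic descriptive statistics for one numeric column."""
    present = [v for v in values if v is not None]
    return {
        "count": len(present),
        "missing": len(values) - len(present),
        "min": min(present),
        "max": max(present),
        "mean": mean(present),
        "median": median(present),
    }


summary = {name: summarize(values) for name, values in data.items()}
for name, stats in summary.items():
    print(name, stats)
```

Comparing the mean with the median gives a quick hint about skew, and the missing count shows which columns will need attention during feature engineering.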
We will prepare our machine learning model.
The model is evaluated based on its accuracy, effectiveness and fitness. The evaluation report allows us to decide whether the model is ready for deployment or whether it needs some optimization, adjustment, or a different model to be built.
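As a rough illustration of such an evaluation report, the sketch below computes accuracy, precision and recall for a binary classifier from true and predicted labels. The function `evaluation_report`, the toy labels and the 0.8 accuracy threshold used for the deployment decision are assumptions made for this example.

```python
def evaluation_report(y_true, y_pred, positive=1):
    """Accuracy, precision and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical true labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

report = evaluation_report(y_true, y_pred)
print(report)

# Illustrative decision rule; the 0.8 threshold is an arbitrary assumption.
ready_for_deployment = report["accuracy"] >= 0.8
print("Ready for deployment:", ready_for_deployment)
```

Which metric and which threshold matter depends on the problem. With imbalanced data, for example, accuracy alone can be misleading and precision or recall may be a better basis for the decision.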
The model is made available to solve the related problems.