Learn how to predict deal probability for Avito online ads using supervised machine learning methods such as linear regression, decision tree, and ensemble models. Explore the dataset, perform EDA, feature engineering, data preparation, and hyperparameter tuning to achieve a low RMSE. #MachineLearning #AvitoOnlineAds #DataScience
When selling used goods online, a combination of tiny, nuanced details in a product description can make a big difference in drumming up interest.
Avito, Russia’s largest classified advertisements website, challenges participants to predict demand for an online advertisement based on its full description (title, description, images, etc.), its context (where it was posted geographically, similar ads already posted), and historical demand for similar ads in similar contexts. With this information, Avito can advise sellers on how best to optimize their listings and give some indication of how much interest they should realistically expect to receive.
The regression model should be evaluated using Root Mean Squared Error (RMSE).
RMSE is defined as:
\[\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}\]
where \(\hat{y}_i\) is the predicted value, \(y_i\) is the original value, and \(n\) is the number of observations.
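As a quick illustration of the RMSE definition, here is a minimal sketch using NumPy. The helper name `rmse` and the sample values are made up for this example and are not taken from the Avito dataset.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error between original and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    # Mean of the squared differences, then the square root
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical deal probabilities, for illustration only
y_true = [0.0, 0.5, 1.0, 0.25]
y_pred = [0.1, 0.4, 0.8, 0.25]

score = rmse(y_true, y_pred)
print(f"RMSE: {score:.4f}")
```

A lower RMSE means the predictions are closer to the original values on average, and a perfect prediction gives an RMSE of 0. Because the errors are squared before averaging, large individual errors are penalized more heavily than small ones.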
In this notebook we will explore supervised machine learning methods. Regression models such as linear regression, Ridge, ElasticNet, Lasso, and decision trees, along with ensemble models such as RandomForest, XGBoost, and LightGBM, will be trained to predict deal probability using Scikit-learn, LightGBM, and XGBoost. We will use Pandas, NumPy, Matplotlib, Seaborn, and Plotly to perform exploratory data analysis and gather insights for machine learning. We will do the following
!pip install jovian --upgrade --quiet