Learn how machine learning models can predict the win placement of players in the popular game PUBG. Explore and analyze the dataset, train different models, and perform hyperparameter tuning. Join the Kaggle competition now!

PUBG Finish Placement Prediction

About the Game

PlayerUnknown’s Battlegrounds, better known as PUBG, is a multiplayer battle royale game in which up to 100 players parachute onto an island, gather supplies, and fight to be the last one standing. A 'zone' shrinks over time, and players who fail to reach it in time are eliminated automatically. Weapons of various types appear at random locations across the map for players to acquire. The rarest and most effective items are dropped in supply crates, which players can open. There are multiple maps that players can choose to play on. Players can also choose to team up with friends or random teammates in groups of two (duo) or four (squad). The dataset at hand contains per-player match metrics, which we will use to train machine learning models that predict each player's win placement percentile, where 1 stands for 1st place and 0 for last place.

Kaggle Competition Link

Outline

  • Download the dataset
  • Explore and analyse the dataset
  • Prepare the dataset for ML training (baseline)
  • Train a hard-coded model for reference
  • Perform feature engineering
  • Train and evaluate different models
  • Hyperparameter tuning
  • Make predictions on test data

Download the Dataset

  • Install required libraries
  • Download data from Kaggle
  • Load the training set with pandas
  • Load the test set with pandas
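
The loading steps above could look roughly like the sketch below. The helper name `load_pubg_csv` is hypothetical, and the choice to read the ID columns as strings and `matchType` as a categorical is an assumption made here to keep identifiers intact and save memory, not something the notebook prescribes.

```python
import pandas as pd

# Identifier columns from the column description; read as plain strings
# so that values such as "0001" are not converted to numbers.
ID_COLUMNS = ["Id", "groupId", "matchId"]


def load_pubg_csv(path, nrows=None):
    """Read one competition CSV into a DataFrame (hypothetical helper).

    path  : location of the CSV file downloaded from Kaggle
    nrows : optional row limit, handy for quick experiments
    """
    dtypes = {col: str for col in ID_COLUMNS}
    # matchType has only a handful of distinct values, so a categorical
    # dtype is a compact way to store it.
    dtypes["matchType"] = "category"
    return pd.read_csv(path, dtype=dtypes, nrows=nrows)
```

The training and test sets would each be loaded with one call, for example `train_df = load_pubg_csv(train_path)` and `test_df = load_pubg_csv(test_path)`, where both paths point at the files obtained from the Kaggle download.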

Column Description

  • DBNOs - Number of enemy players knocked down (DBNO stands for “down but not out”).
  • assists - Number of enemy players this player damaged that were killed by teammates.
  • boosts - Number of boost items used.
  • damageDealt - Total damage dealt. Note: Self-inflicted damage is subtracted.
  • headshotKills - Number of enemy players killed with headshots.
  • heals - Number of healing items used.
  • Id - Player’s Id
  • killPlace - Ranking in match of number of enemy players killed.
  • killPoints - Kills-based external ranking of player. (Think of this as an Elo ranking where only kills matter.) If there is a value other than -1 in rankPoints, then any 0 in killPoints should be treated as a “None”.
  • killStreaks - Max number of enemy players killed in a short amount of time.
  • kills - Number of enemy players killed.
  • longestKill - Longest distance between player and player killed at time of death. This may be misleading, as downing a player and driving away may lead to a large longestKill stat.
  • matchDuration - Duration of match in seconds.
  • matchId - ID to identify match. There are no matches that are in both the training and testing set.
  • matchType - String identifying the game mode that the data comes from. The standard modes are “solo”, “duo”, “squad”, “solo-fpp”, “duo-fpp”, and “squad-fpp”; other modes are from events or custom matches.
  • rankPoints - Elo-like ranking of player. This ranking is inconsistent and is being deprecated in the API’s next version, so use with caution. Value of -1 takes place of “None”.
  • revives - Number of times this player revived teammates.
  • rideDistance - Total distance traveled in vehicles measured in meters.
  • roadKills - Number of kills while in a vehicle.
  • swimDistance - Total distance traveled by swimming measured in meters.
  • teamKills - Number of times this player killed a teammate.
  • vehicleDestroys - Number of vehicles destroyed.
  • walkDistance - Total distance traveled on foot measured in meters.
  • weaponsAcquired - Number of weapons picked up.
  • winPoints - Win-based external ranking of player. (Think of this as an Elo ranking where only winning matters.) If there is a value other than -1 in rankPoints, then any 0 in winPoints should be treated as a “None”.
  • groupId - ID to identify a group within a match. If the same group of players plays in different matches, they will have a different groupId each time.
  • numGroups - Number of groups we have data for in the match.
  • maxPlace - Worst placement we have data for in the match. This may not match with numGroups, as sometimes the data skips over placements.
  • winPlacePerc - The target of prediction. This is a percentile winning placement, where 1 corresponds to 1st place, and 0 corresponds to last place in the match. It is calculated off of maxPlace, not numGroups, so it is possible to have missing chunks in a match.
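
The placeholder rules described for rankPoints, killPoints, and winPoints can be made explicit before training. The following is a minimal sketch, assuming the data is in a pandas DataFrame with those three columns. The function name `mask_missing_rankings` is hypothetical, and replacing placeholders with NaN is only one possible way to handle them.

```python
import numpy as np
import pandas as pd


def mask_missing_rankings(df):
    """Return a copy of df with placeholder ranking values set to NaN."""
    out = df.copy()
    ranking_cols = ("rankPoints", "killPoints", "winPoints")

    # Cast to float so the columns can hold NaN.
    for col in ranking_cols:
        out[col] = out[col].astype("float64")

    # Rows where rankPoints carries a real value (anything other than -1).
    has_rank = out["rankPoints"] != -1

    # A 0 in killPoints or winPoints means "None" only when rankPoints
    # holds a real value for that row.
    for col in ("killPoints", "winPoints"):
        out.loc[has_rank & (out[col] == 0), col] = np.nan

    # In rankPoints itself, -1 takes the place of "None".
    out.loc[~has_rank, "rankPoints"] = np.nan
    return out
```

Whether NaN, a separate indicator column, or dropping these columns entirely works best depends on the model; tree-based libraries often accept NaN directly, while linear models would need imputation first.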