Exploratory Analysis on Google Playstore Apps

Introduction

Nowadays, almost everyone uses a smartphone. As an app store, Google Play Store tends to be the main market for smartphone apps. It now hosts more than 3.5 million Android apps, and 98% of them are free to download and install. Because of that, it is interesting to analyze the Google Play Store dataset. In this project, we will analyze some features of this data to gain insight into the apps on Google Play Store.

About the dataset

This dataset was scraped from Google Play Store. It consists of 10841 rows and 13 columns. The columns are:

1. App

This column contains the name of the app on Google Play Store.

2. Category

This column contains the category the app belongs to. For example, the app "Subway Surfers" is in the GAME category.

3. Rating

This column contains the rating of each app in the App column. The values range from 0 to 5.

4. Reviews

This column describes the number of user reviews for the app (at the time of scraping).

5. Size

This column describes the size of the app (at the time of scraping).

6. Installs

This column describes the number of user downloads/installs for the app (at the time of scraping).

7. Type

This column describes whether the app is free or paid.

8. Price

This column describes the price of the app (at the time of scraping). If the app is free, then price = 0.

9. Content Rating

This column describes the age group the app is targeted at, such as children, teens, or adults.

10. Genres

This column describes the genres of the app. For example, a game can have a genre such as adventure or arcade. One app can belong to multiple genres.

11. Last Updated

This column shows when the app was last updated (at the time of scraping).

12. Current Ver

This column shows the current version of the app (at the time of scraping).

13. Android Ver

This column specifies the minimum Android version required to install and use the app properly (at the time of scraping).
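
The column list above can be checked against the file itself. The following is a minimal sketch rather than the notebook's own code: it uses only Python's standard csv module, the helper name check_columns is hypothetical, and the header spellings (in particular "Last Updated") are assumed to match the CSV. The sample row contains made-up values and is there only to show the call.

```python
import csv
import io

# Column names as described above; the exact spellings in the CSV header
# (for example "Last Updated") are an assumption.
EXPECTED_COLUMNS = [
    "App", "Category", "Rating", "Reviews", "Size", "Installs", "Type",
    "Price", "Content Rating", "Genres", "Last Updated", "Current Ver",
    "Android Ver",
]


def check_columns(csv_file, expected=EXPECTED_COLUMNS):
    """Return (row_count, missing, unexpected) for an open CSV file object."""
    reader = csv.reader(csv_file)
    header = next(reader)
    row_count = sum(1 for _ in reader)  # data rows, header excluded
    missing = [name for name in expected if name not in header]
    unexpected = [name for name in header if name not in expected]
    return row_count, missing, unexpected


# Tiny in-memory sample with made-up values, only to show the call.
sample = io.StringIO(
    ",".join(EXPECTED_COLUMNS) + "\n"
    "Example App,GAME,4.5,100,19M,\"10,000+\",Free,0,Everyone,Arcade,"
    "\"January 1, 2018\",1.0,4.1 and up\n"
)
print(check_columns(sample))
```

For the real file, pass an open file object instead of the in-memory sample. With pandas, pd.read_csv(path).shape reports the row and column counts in one step.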

About This Project

This exploratory data analysis project is part of Data Analysis with Python: Zero to Pandas, a course provided by Jovian.ai. In this project, we will analyze some features of the Google Play Store dataset to gain useful information, such as the spread of app categories on Google Play, the most popular apps, and users' satisfaction with the apps.

How to run the code

This is an executable Jupyter notebook hosted on Jovian.ml, a platform for sharing data science projects. You can run and experiment with the code in a couple of ways: using free online resources (recommended) or on your own computer.

Option 1: Running using free online resources (1-click, recommended)

The easiest way to start executing this notebook is to click the "Run" button at the top of this page, and select "Run on Binder". This will run the notebook on mybinder.org, a free online service for running Jupyter notebooks. You can also select "Run on Colab" or "Run on Kaggle".

Option 2: Running on your computer locally
  1. Install Conda by following these instructions. Add Conda binaries to your system PATH, so you can use the conda command on your terminal.

  2. Create a Conda environment and install the required libraries by running these commands on the terminal:

conda create -n zerotopandas -y python=3.8
conda activate zerotopandas
pip install jovian jupyter numpy pandas matplotlib seaborn opendatasets --upgrade

  3. Press the "Clone" button above to copy the command for downloading the notebook, and run it on the terminal. This will create a new directory and download the notebook. The command will look something like this:

jovian clone notebook-owner/notebook-id

  4. Enter the newly created directory using cd directory-name and start the Jupyter notebook:

jupyter notebook

You can now access Jupyter's web interface by clicking the link that shows up on the terminal or by visiting http://localhost:8888 on your browser. Click on the notebook file (it has a .ipynb extension) to open it.
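
Before opening the notebook locally, it can help to confirm that the libraries from the pip command above are importable in the active environment. The sketch below is one way to do that with the standard library. The helper name missing_packages is hypothetical and is not part of Jovian or the course.

```python
from importlib.util import find_spec

# Import names of the libraries installed by the pip command above.
REQUIRED = ["jovian", "jupyter", "numpy", "pandas", "matplotlib",
            "seaborn", "opendatasets"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required libraries are available.")
```

If anything is reported as not installed, rerunning the pip command inside the activated zerotopandas environment is the usual fix.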

Downloading the Dataset

This dataset comes from Kaggle: https://www.kaggle.com/lava18/google-play-store-apps

There are several ways to download the dataset into a Jupyter notebook:

  • Download the dataset manually from Kaggle and upload it to Jupyter
  • Download the dataset in raw CSV format using urlretrieve from the urllib package (we will use this method here)
  • Download it using the opendatasets package

Let's begin by downloading the data and listing the files within the dataset. As I said before, I use the dataset provided by Kaggle. I have downloaded the dataset and uploaded it to my GitHub repository so that we can retrieve it using the urllib.request library.

dataset_url = 'https://raw.githubusercontent.com/fikrinotes/kaggle-datasets/main/googleplaystore/googleplaystore.csv'
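
The urlretrieve approach from the list above could look roughly like this. It is a sketch under assumptions: the helper name download_csv and the local file name googleplaystore.csv are illustrative choices, and nothing is downloaded until the function is called.

```python
from pathlib import Path
from urllib.request import urlretrieve

dataset_url = 'https://raw.githubusercontent.com/fikrinotes/kaggle-datasets/main/googleplaystore/googleplaystore.csv'


def download_csv(url, dest='googleplaystore.csv'):
    """Download url to dest (a hypothetical local file name) and return its path."""
    local_path, _headers = urlretrieve(url, dest)
    return Path(local_path)


# Uncomment to fetch the file (requires network access):
# csv_path = download_csv(dataset_url)
# print(csv_path, csv_path.stat().st_size, "bytes")
```

Wrapping the call in a small function keeps the download step easy to rerun with a different destination, and returning a Path makes later checks such as csv_path.exists() straightforward.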