Pandas map(), apply(), & applymap() methods

Pandas offers a variety of tools and methods for loading, pre-processing, and analyzing data. It can comfortably process datasets with millions of rows, as long as they fit in memory.

map, apply, and applymap are three such methods. They let you transform the elements, rows, or columns of a DataFrame or Series without writing an explicit loop, which simplifies data processing. In this post, we will look at the use case for each method and how to use it.

Let's create a sample DataFrame to try each one.

import pandas as pd
a_dict = {'c1':[1,2,3], 'c2':[1,2,3],'c3':[1,2,3], 'c4':[1,2,3]}
df = pd.DataFrame.from_dict(a_dict, orient='columns')
df

map()

The map method can be used either to apply a custom function to each element of a Series, or to substitute each value with another value looked up in a dictionary or another Series.

Syntax:

Series.map(arg, na_action=None)

Parameters:

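  • arg: function, collections.abc.Mapping subclass (such as a dict), or Series. The mapping correspondence.

  • na_action: {None, 'ignore'}, default None. If 'ignore', NaN values are propagated to the result without being passed to the mapping.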

Returns Series with the same index as input.
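
Besides a function or a dictionary, arg can also be another Series, in which case the values are looked up in that Series' index. The snippet below is a minimal sketch of this; the names s and lookup are made up for illustration and do not appear elsewhere in this post.

```python
import pandas as pd

# Hypothetical input Series with a non-default index
s = pd.Series([1, 2, 3], index=['a', 'b', 'c'])

# Hypothetical lookup Series: its index holds the keys, its values the substitutes
lookup = pd.Series(['ONE', 'TWO', 'THREE'], index=[1, 2, 3])

# Each value of s is looked up in lookup's index
mapped = s.map(lookup)
print(mapped)
```

The result keeps the index of s ('a', 'b', 'c'), which matches the "same index as input" note above.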

1. Mapping sample values with a dictionary.

df
df['c1']=df.c1.map({1:'ONE', 2:'TWO',3:"THREE"}) 

df

Note that when the mapping argument is a dictionary, any value in the Series that has no matching key in the dictionary is converted to NaN. In the example below, the dictionary has no entry for 3, so that value becomes NaN.

df.c2 = df.c2.map({1:'ONE', 2:'TWO'}) 
df
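
If you would rather keep unmapped values than turn them into NaN, one possible approach is to map with a function that falls back to the original value. This is only a sketch under that assumption; the names s, mapping, and kept are illustrative.

```python
import pandas as pd

s = pd.Series([1, 2, 3])
mapping = {1: 'ONE', 2: 'TWO'}  # deliberately has no entry for 3

# Passing the dictionary directly: 3 has no key, so it becomes NaN
mapped = s.map(mapping)

# Passing a function built on dict.get: unmapped values fall back to themselves
kept = s.map(lambda v: mapping.get(v, v))

print(mapped)
print(kept)
```

The trade-off is that the second result mixes strings and integers in one column, which may or may not be what you want.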

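This brings us to how we can deal with NaN values while mapping. A function such as the one below would raise a TypeError if it received NaN, so we can add the argument na_action='ignore' to skip missing values: they are passed through as NaN without the function being applied to them.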

2. Mapping sample values with a function.

def add_lowercase(val):
    return(val+'_'+val.lower())
df.c2.map(add_lowercase, na_action='ignore')
0    ONE_one
1    TWO_two
2        NaN
Name: c2, dtype: object

map returns a new Series and leaves the original untouched. To make the change permanent, assign the mapped Series back to the original DataFrame column.
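
As a small sketch of that assignment step, the snippet below rebuilds a one-column frame that resembles c2 after the dictionary mapping (the frame is constructed here only so the example is self-contained) and writes the mapped result back.

```python
import numpy as np
import pandas as pd

# Stand-in for the c2 column after the dictionary mapping above
df = pd.DataFrame({'c2': pd.Series(['ONE', 'TWO', np.nan], dtype=object)})

def add_lowercase(val):
    return val + '_' + val.lower()

# Calling map on its own only produces a new Series
preview = df['c2'].map(add_lowercase, na_action='ignore')
unchanged = df['c2'].tolist()  # the column itself still holds the old values

# Assigning the result back makes the change stick
df['c2'] = df['c2'].map(add_lowercase, na_action='ignore')
print(df)
```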

apply()

The apply method can be used to apply a custom function to each element of a Series, or to an entire column or row of a DataFrame, for example to return an aggregated result. It works on both Series and DataFrames, and it is the better choice for more complex operations.

a_dict = {'c1':[1,2,3], 'c2':[1,2,3],'c3':[1,2,3], 'c4':[1,2,3]}
sample_df = pd.DataFrame.from_dict(a_dict, orient='columns')
sample_df

1. pd.Series.apply

def power(val):
  return val**val
sample_df['c5'] = sample_df['c3'].apply(power)
sample_df

For short, one-off functions, a common practice is to pass a lambda to any of these three methods instead of defining a named function.

sample_df['c4'].apply(lambda x: x*100)
0    100
1    200
2    300
Name: c4, dtype: int64

Since we are using the apply method on a specific column, we will see a similar result to map.
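
To make that similarity concrete, here is a minimal sketch that runs the same lambda through both methods on a stand-alone Series (built here so the snippet does not depend on sample_df) and compares the results.

```python
import pandas as pd

# Stand-in for sample_df['c4']
s = pd.Series([1, 2, 3], name='c4')

via_apply = s.apply(lambda x: x * 100)
via_map = s.map(lambda x: x * 100)

# Series.equals compares values, index, and dtype
print(via_apply.equals(via_map))
```

For a simple element-wise function on a single column like this one, the two calls should be interchangeable.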

2. pd.DataFrame.apply

DataFrame.apply is used to apply a function along an axis of the DataFrame. The axis argument defines whether the function receives each column (axis=0) or each row (axis=1). The default is axis=0, so it only needs to be passed explicitly for row-wise operations.

sample_df
def func(vals):
  return vals.sum()
sample_df.apply(func) #column-wise aggregation creates an aggregated row
c1     6
c2     6
c3     6
c4     6
c5    32
dtype: int64
sample_df.apply(func, axis=1) #row-wise aggregation creates an aggregated column
0     5
1    12
2    39
dtype: int64

A column-wise aggregation can create an aggregated row whereas a row-wise aggregation creates an aggregated column. This can also be seen with the following implementation:
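
The same axis logic holds for any function that reduces a column or row to a single value, not just sum. As a hedged sketch, the snippet below computes a range (max minus min) both ways on a small made-up frame; the frame and variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'c1': [1, 2, 3], 'c2': [10, 20, 30]})

# axis=0 (default): the lambda receives each column, giving one value per column
col_range = df.apply(lambda col: col.max() - col.min())

# axis=1: the lambda receives each row, giving one value per row
row_range = df.apply(lambda row: row.max() - row.min(), axis=1)

print(col_range)
print(row_range)
```

Here col_range is indexed by the column names and row_range by the row labels, mirroring the "aggregated row" and "aggregated column" described above.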

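# Append the column-wise sums as a new row labelled '3' (a string label; the existing labels 0-2 are integers)
sample_df.loc['3'] = sample_df.apply(func)
sample_df
# Append the row-wise sums as a new column c6 (this includes the row added above)
sample_df['c6'] = sample_df.apply(func, axis = 1)

sample_df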

applymap()

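The applymap method is another way to modify values element by element, but it is available only on DataFrames, not on Series. Note that pandas 2.1 deprecated DataFrame.applymap in favor of DataFrame.map, which behaves the same way.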

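def power_5(val):  # named power_5 so that Python's built-in pow() is not shadowed
  return val**5
a_dict = {'c1':[1,2,3], 'c2':[1,2,3],'c3':[1,2,3], 'c4':[1,2,3]}
sample_df = pd.DataFrame.from_dict(a_dict, orient='columns')
sample_df
sample_df.applymap(lambda x: str(x)+'_')  # append '_' to every element
sample_df.applymap(power_5)  # raise every element to the fifth power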

The function passed into applymap applies individually to all elements in the input dataframe.
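
Because pandas 2.1 and later provide DataFrame.map as the replacement for applymap, code that has to run on both old and new versions may need to choose between them. The helper below is a hypothetical sketch of one way to do that; elementwise is not a pandas function, just a name made up for this example.

```python
import pandas as pd

df = pd.DataFrame({'c1': [1, 2, 3], 'c2': [4, 5, 6]})

def elementwise(frame, func):
    """Apply func to every element, using whichever method this pandas offers."""
    # DataFrame.map exists from pandas 2.1 onwards; older versions only have applymap
    if hasattr(pd.DataFrame, 'map'):
        return frame.map(func)
    return frame.applymap(func)

squared = elementwise(df, lambda x: x ** 2)
print(squared)
```

Either branch calls the function once per element and returns a DataFrame with the same shape as the input.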

Posted by Himani (himani007), 5 months ago