Introduction
This article explains, through practical examples, the difference between the apply and applymap functions in Pandas. The two look similar, but they operate at different levels: apply works on whole columns or rows, while applymap works on individual elements. Let’s explore them.
Setup
We will work with a Kaggle dataset of YouTube trending video statistics (https://www.kaggle.com/datasnaek/youtube-new). The file used in this article is ‘USvideos.csv’.
import pandas as pd
import numpy as np
df = pd.read_csv('USvideos.csv')
df.columns
This lists the columns of the dataset. Among them are ‘title’, ‘likes’ and ‘dislikes’, which we use in the examples below.
apply function
DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs)
The apply function invokes a function (func) along an axis of the DataFrame: func receives each column, or each row, as a Series.
axis: {0 or ‘index’, 1 or ‘columns’}, default 0
Axis along which the function is applied:
- 0 or ‘index’: apply function to each column.
- 1 or ‘columns’: apply function to each row.
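The two axis values can be compared on a small example. The sketch below uses a hypothetical three-row DataFrame (not the Kaggle data) whose column names merely mirror the dataset, and sums along each axis.

```python
import pandas as pd

# Hypothetical toy data; the column names mirror the article's dataset
toy = pd.DataFrame({"likes": [10, 20, 30], "dislikes": [1, 2, 3]})

# axis=0 (the default): the function receives each column as a Series
per_column = toy.apply(lambda col: col.sum())

# axis=1: the function receives each row as a Series
per_row = toy.apply(lambda row: row.sum(), axis=1)

print(per_column)  # one value per column, indexed by column name
print(per_row)     # one value per row, indexed by row label
```

With axis=0 the result has one entry per column; with axis=1 it has one entry per row.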
Applying function column-wise
Let’s understand this with an example. First, we sort the DataFrame in descending order of the number of ‘likes’.
likesdf = df.sort_values(by='likes', ascending=False)
likesdf.head()
Using apply, we will change the title of every video whose title contains the string ‘BTS’ to ‘BTS Video’. Titles that do not contain the string are left unchanged.
def changetitle(title):
    # Replace titles that mention BTS; return every other title unchanged
    if "BTS" in title:
        return "BTS Video"
    return title
likesdf['title'] = likesdf['title'].apply(changetitle)
likesdf.head()
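Here apply is called on a single column (a Series), which takes no axis argument, so the function receives each title value in turn. When apply is called on a whole DataFrame without an axis, the default axis=0 passes each column to the function instead.
After applying the function, every title containing ‘BTS’ reads ‘BTS Video’.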
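The behaviour of apply on a single column can be checked on a few made-up titles. This is a minimal sketch: the titles are hypothetical, this version of changetitle returns non-matching titles unchanged, and the string-method version at the end is an alternative that is not part of the original walkthrough.

```python
import pandas as pd

# Hypothetical titles, not taken from the Kaggle file
titles = pd.Series(["BTS - Fake Love", "Cooking pasta", "New BTS teaser"])

def changetitle(title):
    # Called once per value of the Series
    if "BTS" in title:
        return "BTS Video"
    return title

renamed = titles.apply(changetitle)

# Alternative without apply: build a boolean mask, then replace where it is True
contains_bts = titles.str.contains("BTS", regex=False)
renamed_vectorized = titles.mask(contains_bts, "BTS Video")

print(renamed.tolist())
```

Both versions give the same titles. The apply version is easier to extend with arbitrary Python logic, while the string-method version avoids calling a Python function for every value.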
Applying function row-wise
To see the second flavor of the function, note that the DataFrame has two columns, ‘likes’ and ‘dislikes’. Let's create another column, ‘total_views’, that holds the sum of ‘likes’ and ‘dislikes’ for each video, computed row-wise:
def totalLikesPlusDislikes(row):
    # row is a Series holding one video's values, indexed by column name
    return row['likes'] + row['dislikes']
likesdf['total_views'] = likesdf.apply(totalLikesPlusDislikes, axis=1)
likesdf.head()
With ‘axis=1’ the function is applied row-wise, and the new column ‘total_views’ holds the sum of likes and dislikes for each video.
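For a simple sum like this one, the row-wise apply can be cross-checked against plain column arithmetic. The sketch below uses hypothetical numbers rather than the real dataset. The vectorized line is an alternative that is not part of the original walkthrough, and it is generally faster on large DataFrames because it avoids a Python call per row.

```python
import pandas as pd

# Hypothetical toy data with the same column names as the article
toy = pd.DataFrame({"likes": [100, 250, 40], "dislikes": [5, 10, 2]})

def totalLikesPlusDislikes(row):
    # row is a Series for one record, indexed by column name
    return row["likes"] + row["dislikes"]

# Row-wise apply: the function is called once per row
row_wise = toy.apply(totalLikesPlusDislikes, axis=1)

# Column arithmetic: the addition runs over whole columns at once
vectorized = toy["likes"] + toy["dislikes"]

print(row_wise.tolist())
print(vectorized.tolist())
```

Row-wise apply remains useful when the per-row logic cannot be expressed as column arithmetic, for example when it branches on several fields.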
applymap function
DataFrame.applymap(func, na_action=None, **kwargs)
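The applymap function applies a function (func) to every element of the DataFrame. In pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map, which works the same way.
df.applymap(lambda x: x + 1)  # adds 1 to each element of the DataFrame
For this particular function to work, every column of the DataFrame must be numeric, because x + 1 raises an error for string values.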
In our example, we first select all the numeric columns of the ‘likesdf’ DataFrame into a separate DataFrame:
newlikesdf = likesdf.select_dtypes(include=np.number)
newlikesdf.head()
The result contains only the numeric columns.
Using applymap, let’s multiply every element by an integer, say 2.
newlikesdf.applymap(lambda x : x * 2)
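The two steps (selecting numeric columns, then transforming every element) can be tried on a small example. This is a sketch under a few assumptions: the toy DataFrame is hypothetical, and because applymap is deprecated in recent pandas releases, the code picks DataFrame.map when it is available and falls back to applymap otherwise.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one string column and two numeric columns
toy = pd.DataFrame({
    "title": ["a", "b"],
    "likes": [10, 20],
    "dislikes": [1, 2],
})

# Keep only the numeric columns, as in the article
numeric = toy.select_dtypes(include=np.number)

# DataFrame.map is the newer name for applymap (pandas 2.1+);
# fall back to applymap on older versions
if hasattr(pd.DataFrame, "map"):
    doubled = numeric.map(lambda x: x * 2)
else:
    doubled = numeric.applymap(lambda x: x * 2)

# For plain arithmetic the same result is available without a Python function
doubled_vectorized = numeric * 2

print(doubled)
```

The element-wise call earns its place when the per-element logic is arbitrary Python, such as formatting each value; for simple arithmetic, the operator form is shorter.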
Summary
apply: use it to apply a function column-wise (axis=0) or row-wise (axis=1) on a DataFrame, or value by value on a single column. It works with both numeric and string columns.
applymap: use it for element-wise operations across an entire DataFrame.