Pandas provides a powerful apply() function that runs a function over a DataFrame or Series without writing an explicit loop. Much like Python's map(), it accepts any function as input. On a DataFrame the function receives each column or each row, depending on the axis argument, and on a Series it receives each value. apply() also works well with lambda functions. In this post we look at what the apply, lambda, map and applymap functions are, how they work and how to use them.
Pandas Apply Function
Instead of looping over each data point in our data frame, we can use the pandas apply() function. The apply method has the following signature: DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs). func is the function to apply to each column or row. The axis argument selects what func receives: the default, axis=0, passes each column to func and gives one result per column, while axis=1 passes each row and gives one result per row.
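The effect of axis is easiest to see on a tiny frame. The sketch below uses made-up names and values (scores, per_column, per_row are illustrative and not part of the article's data) and only shows which way the function is applied.

```python
import pandas as pd

# Small illustrative frame (hypothetical values, not the article's data).
scores = pd.DataFrame(
    {"Math": [70, 80], "Physics": [60, 95]},
    index=["Tom", "Mary"],
)

# axis=0 (the default): the function receives each column as a Series,
# so the result has one entry per column.
per_column = scores.apply(lambda col: col.sum(), axis=0)

# axis=1: the function receives each row as a Series,
# so the result has one entry per row.
per_row = scores.apply(lambda row: row.sum(), axis=1)

print(per_column)
print(per_row)
```

Here per_column is indexed by the column names and per_row by the row labels, which is a quick way to check that the axis you picked is the one you meant.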
Let’s start by creating a data frame
import numpy as np
import pandas as pd

# Marks for nine students in three courses
apply_data_df = pd.DataFrame(
    {
        "Students": ["Tom", "Peter", "Simon", "Mary", "Jane", "King", "Hillary", "Ethan", "Page"],
        "Math": [79.00, 67.00, 80.00, 84.00, 70.00, 60.00, 90.00, 76.00, 75],
        "Physics": [63.00, 98, 60.00, 90, 84.00, 77.00, 55.00, 70, 66.00],
        "Computer": [84.00, 78.00, 57.00, 88.00, 75.00, 93.00, 92.00, 98.00, 90.00],
    }
)
apply_data_df
Get total for each Student
apply_data_df['Total']=apply_data_df[['Math','Physics','Computer']].apply(np.sum,axis=1)
apply_data_df
Get total for each Course
apply_data_df[['Math','Physics','Computer']].apply(np.sum,axis=0)
Using apply function to convert data from float to int
apply_data_df[['Math','Physics','Computer']].apply(np.int64)
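For simple reductions and type casts like these, pandas also has built-in methods that give the same result as apply. The following is a minimal sketch on a two-row frame with illustrative names (marks, total_apply, total_builtin, as_int), shown only for comparison.

```python
import numpy as np
import pandas as pd

# Two-row illustrative frame (a subset of the kind of data used above).
marks = pd.DataFrame({"Math": [79.0, 67.0], "Physics": [63.0, 98.0]})

# Row totals with apply, as in the article.
total_apply = marks.apply(np.sum, axis=1)

# The same row totals with the built-in DataFrame.sum.
total_builtin = marks.sum(axis=1)

# Casting every column from float to int with astype.
as_int = marks.astype("int64")

print(total_builtin)
print(as_int.dtypes)
```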
Pandas Lambda Function
A lambda is an anonymous function defined inline, a style borrowed from functional programming. It helps readability when a function is short and used only once. Lambdas can be used together with apply to perform an operation on each data point in a DataFrame. Apart from apply, lambda functions can also be passed to other functions such as applymap(), filter() and map().
Lambda function
add=lambda x,y:x+y
add(10,20)
Using lambda function to add 5 to the Math course
apply_data_df['Math']=apply_data_df['Math'].apply(lambda x: x+5)
apply_data_df
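If the amount to add should be a parameter rather than hard-coded in the lambda, apply can forward extra arguments to the function through args or keyword arguments, as listed in the signature earlier. A small sketch, where add_bonus and the sample values are hypothetical:

```python
import pandas as pd

# Illustrative Math marks (hypothetical subset).
math = pd.Series([79.0, 67.0, 80.0], name="Math")

def add_bonus(score, bonus):
    """Return the score increased by a bonus."""
    return score + bonus

# Hard-coded bonus inside a lambda, as in the article.
with_lambda = math.apply(lambda x: x + 5)

# Same result, with the bonus passed positionally through args ...
with_args = math.apply(add_bonus, args=(5,))

# ... or as a keyword argument that apply forwards to the function.
with_kwargs = math.apply(add_bonus, bonus=5)

print(with_args)
```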
Get Upper case names of students with lambda and apply
apply_data_df['Students'].apply(lambda x: x.upper())
Lambda with a user-defined function in apply, similar to the solution above
def upper_case(x):
    return x.upper()
apply_data_df['Students__Upper_Case_Names']=apply_data_df['Students'].apply(lambda x: upper_case(x))
apply_data_df
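The lambda wrapper is not strictly required in this case: apply accepts any callable, so a named function can be passed directly. The sketch below uses a made-up names Series and also shows the .str.upper() string method as another route to the same values.

```python
import pandas as pd

# Illustrative names (hypothetical subset of the students).
names = pd.Series(["Tom", "Mary", "Jane"])

def upper_case(x):
    """Return the upper-case version of a string."""
    return x.upper()

# Wrapping the function in a lambda, as in the article.
via_lambda = names.apply(lambda x: upper_case(x))

# Passing the function itself gives the same values.
via_function = names.apply(upper_case)

# The vectorized string accessor does the same for plain upper-casing.
via_str = names.str.upper()

print(via_function.tolist())
```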
Use if-else with lambda and apply function
apply_data_df['Major_Grades']=apply_data_df['Total'].apply(lambda x: 'A' if x>250 else 'B')
apply_data_df
Use if-elif-else with lambda and apply function
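# We can make it as complex as we want.
# Each branch only needs a lower bound because the earlier branches
# have already handled the higher totals; this also covers totals of
# exactly 250 or 240, which fall into the lower grade.
apply_data_df['Detailed_Grades']=apply_data_df['Total'].apply(lambda x: 'A' if x>250
    else ('B' if x>240
    else ('C' if x>200
    else 'D')))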
apply_data_df
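Once a lambda needs several branches, a named function is often easier to read and test. The sketch below uses the same cut-offs; detailed_grade is a hypothetical helper, the totals are those of the first four students, and placing totals of exactly 250 or 240 in the lower grade is an assumption.

```python
import pandas as pd

# Totals of the first four students from the data frame above.
totals = pd.Series(
    [226.0, 243.0, 197.0, 262.0],
    index=["Tom", "Peter", "Simon", "Mary"],
)

def detailed_grade(total):
    """Map a total mark to a letter grade using fixed cut-offs."""
    if total > 250:
        return "A"
    if total > 240:
        return "B"
    if total > 200:
        return "C"
    return "D"

# apply calls detailed_grade once for every value in the Series.
grades = totals.apply(detailed_grade)
print(grades)
```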
Pandas Map Function
map is used to translate values from one form to another. It is defined on a Series and accepts a dictionary, a Series or a callable as its argument. It works elementwise, and values that have no match in the dictionary or Series become NaN.
Using map function
apply_data_df['Positions']=apply_data_df['Detailed_Grades'].map({'A':1,'B':2,'C':3,'D':4})
apply_data_df
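The dictionary form is shown above; the other two input types work the same way. This sketch uses a made-up grades Series that deliberately contains an "E", which has no entry in the mapping, to show what happens to unmatched values.

```python
import pandas as pd

# Illustrative grades; "E" is intentionally missing from the mapping.
grades = pd.Series(["A", "C", "B", "E"])

# Dictionary: unmatched values ("E") become NaN.
by_dict = grades.map({"A": 1, "B": 2, "C": 3, "D": 4})

# Series: the index is used as the lookup key.
lookup = pd.Series({"A": 1, "B": 2, "C": 3, "D": 4})
by_series = grades.map(lookup)

# Callable: the function is called on every element.
by_callable = grades.map(lambda g: g.lower())

print(by_dict)
print(by_callable)
```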
Pandas Applymap Function
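The applymap function applies a function to every element of a DataFrame. From pandas 2.1 onward applymap is deprecated in favor of DataFrame.map, which does the same job.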
Using applymap
Here we just count the number of characters in each value (length) across the data frame.
apply_data_df.applymap(lambda x: len(str(x)))
Let’s add 5 to Math course
Note that applymap is not available on a Series: calling it on the Math column alone raises an error, so here we pass a list of columns to get a DataFrame back. For a single column, use apply, which works on both a Series and a DataFrame.
apply_data_df[['Math','Computer']].applymap(lambda x: x+5)
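A short sketch of the options for this kind of elementwise update, assuming you may be running either an older or a newer pandas. The names marks and elementwise are illustrative; the hasattr check is just one way to pick DataFrame.map where it exists (pandas 2.1 and later) and fall back to applymap otherwise.

```python
import pandas as pd

# Illustrative two-column frame (hypothetical subset of the marks).
marks = pd.DataFrame({"Math": [79.0, 67.0], "Computer": [84.0, 78.0]})

# Prefer DataFrame.map when the installed pandas has it,
# otherwise fall back to the older applymap.
elementwise = marks.map if hasattr(pd.DataFrame, "map") else marks.applymap
plus_five = elementwise(lambda x: x + 5)

# A single column is a Series, so use Series.apply (or Series.map).
math_plus_five = marks["Math"].apply(lambda x: x + 5)

# For plain arithmetic, operating on the frame directly gives the same values.
vectorized = marks + 5

print(plus_five)
```

For something as simple as adding a constant, the direct arithmetic form is usually the shortest; the elementwise functions are more useful when the operation has no built-in equivalent.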
For complete code check the jupyter notebook here.
Conclusion
Pandas provides powerful functions that let us perform complex operations on our data in simple ways. We have looked at what the apply, map and applymap functions are and when to use each. We have also seen how to use lambda, which gives us the flexibility to pass functions anonymously. In the next post we will look at reshaping DataFrames and cover concepts like pivot tables, pivot, cross tabulation, melt, stacking and unstacking. To learn about handling missing data check our previous post here.