Pandas provides a powerful apply() function that runs a function over the rows or columns of a data frame, so we do not have to write an explicit loop. Similar to Python's map() function, it takes a function as input and applies it along the axis we specify (to each column or to each row); on a Series, it applies the function to each value. It is often combined with lambda functions. In this post we will look at what the apply, lambda, map and applymap functions are, how they work and how to use them.


Pandas Apply Function

Instead of writing a loop to iterate over each data point in our data frame, we can use the pandas apply() function. The apply method has the following signature: DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs). The func argument is the function to apply to each column or row. The axis argument controls what func receives: with the default axis=0 (or 'index') it is called once per column, and with axis=1 (or 'columns') it is called once per row.
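To make the axis argument concrete, here is a minimal sketch. The small scores frame and its column names are made up for illustration and are not part of the data set used in the rest of this post.

```python
import pandas as pd

# Hypothetical two-column frame, used only to show the axis argument.
scores = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0 (the default): the function receives each column as a Series.
per_column = scores.apply(lambda s: s.sum(), axis=0)

# axis=1: the function receives each row as a Series.
per_row = scores.apply(lambda s: s.sum(), axis=1)

print(per_column.tolist())  # [6, 60]
print(per_row.tolist())     # [11, 22, 33]
```

The result of axis=0 has one entry per column, while the result of axis=1 has one entry per row.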

Let’s start by creating a data frame.

                    

import numpy as np
import pandas as pd

# One row per student, one column per course.
apply_data_df = pd.DataFrame(
    {
        "Students": ["Tom", "Peter", "Simon", "Mary", "Jane", "King", "Hillary", "Ethan", "Page"],
        "Math": [79.0, 67.0, 80.0, 84.0, 70.0, 60.0, 90.0, 76.0, 75.0],
        "Physics": [63.0, 98.0, 60.0, 90.0, 84.0, 77.0, 55.0, 70.0, 66.0],
        "Computer": [84.0, 78.0, 57.0, 88.0, 75.0, 93.0, 92.0, 98.0, 90.0],
    }
)

apply_data_df

Get total for each Student

                    

apply_data_df['Total']=apply_data_df[['Math','Physics','Computer']].apply(np.sum,axis=1)
apply_data_df

Get total for each Course

                    

apply_data_df[['Math','Physics','Computer']].apply(np.sum,axis=0)
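For simple aggregations like these, pandas also has built-in reductions that give the same totals without going through apply. The sketch below uses a two-student subset of the marks (the variable name marks is just for illustration).

```python
import pandas as pd

# First two students from the data frame above (Tom and Peter).
marks = pd.DataFrame(
    {"Math": [79.0, 67.0], "Physics": [63.0, 98.0], "Computer": [84.0, 78.0]}
)

# Built-in reduction: axis=1 sums across the columns of each row.
per_student = marks.sum(axis=1)

# axis=0 sums down each column.
per_course = marks.sum(axis=0)

print(per_student.tolist())  # [226.0, 243.0]
print(per_course.tolist())   # [146.0, 161.0, 162.0]
```

The built-in methods are usually the better choice when one exists; apply is most useful when the operation has no ready-made equivalent.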

Using apply function to convert data from float to int

                    

apply_data_df[['Math','Physics','Computer']].apply(np.int64)
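A type conversion like this can also be done with astype, which is the method pandas provides for changing dtypes. A minimal sketch on a small made-up frame:

```python
import pandas as pd

# Small illustrative frame with float marks.
marks = pd.DataFrame({"Math": [79.0, 67.0], "Physics": [63.0, 98.0]})

# astype converts every selected column to the requested dtype.
as_int = marks.astype("int64")

print(as_int.dtypes.astype(str).tolist())  # ['int64', 'int64']
print(as_int["Math"].tolist())             # [79, 67]
```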

Pandas Lambda Function

Lambda is a functional-programming style of defining a small function anonymously. Because it is defined inline, it keeps short, single-use logic readable right where it is used. A lambda can be combined with apply to perform an operation on each data point in a DataFrame. Apart from apply, lambda functions can be used along with other functions such as applymap(), filter() and map().

Lambda function

                    

add=lambda x,y:x+y
add(10,20)

Using lambda function to add 5 to the Math course

                    

apply_data_df['Math']=apply_data_df['Math'].apply(lambda x: x+5)
apply_data_df

Get Upper case names of students with lambda and apply

                    

apply_data_df['Students'].apply(lambda x: x.upper())

Lambda with a user-defined function in apply. This gives the same result as the solution above.

                    

def upper_case(x):
    return x.upper()

apply_data_df['Students__Upper_Case_Names']=apply_data_df['Students'].apply(lambda x: upper_case(x))
apply_data_df
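As a side note, when the lambda only forwards its argument, the function can be passed to apply directly, and for string columns the .str accessor offers the same operation. A short sketch on a small made-up Series:

```python
import pandas as pd

students = pd.Series(["Tom", "Peter"])


def upper_case(x):
    """Return the upper-case version of a single string."""
    return x.upper()


# Pass the function itself; no lambda wrapper is needed.
via_function = students.apply(upper_case)

# The vectorized string accessor does the same thing.
via_str = students.str.upper()

print(via_function.tolist())  # ['TOM', 'PETER']
print(via_str.tolist())       # ['TOM', 'PETER']
```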

Use if-else with lambda and apply function

                    

apply_data_df['Major_Grades']=apply_data_df['Total'].apply(lambda x: 'A' if x>250 else 'B')
apply_data_df

Use if-elif-else with lambda and apply function

                    

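# We can make it as complex as we want.
# Conditions are checked in order, so each branch only needs its lower bound.
# Chaining this way also covers the boundary totals (exactly 250 or 240).
apply_data_df['Detailed_Grades'] = apply_data_df['Total'].apply(
    lambda x: 'A' if x > 250
    else 'B' if x > 240
    else 'C' if x > 200
    else 'D'
)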
apply_data_df
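Once a lambda grows to several branches, a named function is often easier to read and test. The sketch below assumes the grade thresholds 250, 240 and 200 with boundary totals falling into the lower grade; the function name grade and the sample totals are illustrative.

```python
import pandas as pd


def grade(total):
    """Return a letter grade for a total score (thresholds are assumptions)."""
    if total > 250:
        return "A"
    if total > 240:
        return "B"
    if total > 200:
        return "C"
    return "D"


# Illustrative totals, including the boundary value 250.
totals = pd.Series([262, 250, 243, 226, 197])
grades = totals.apply(grade)

print(grades.tolist())  # ['A', 'B', 'B', 'C', 'D']
```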

Pandas Map Function

Map is used for mapping values from one form to another. It is defined on a Series and accepts a dictionary, a Series or a callable as its argument, and it operates element-wise. With a dictionary or a Series, values that have no matching key become NaN.

Using map function

                    

apply_data_df['Positions']=apply_data_df['Detailed_Grades'].map({'A':1,'B':2,'C':3,'D':4})
apply_data_df
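The three kinds of input that map accepts can be compared side by side. In this sketch the grades Series, the extra grade "E" and the lookup labels are made up for illustration.

```python
import pandas as pd

grades = pd.Series(["A", "B", "C", "E"])

# Dictionary: values without a matching key ("E" here) become NaN.
positions = grades.map({"A": 1, "B": 2, "C": 3, "D": 4})

# Callable: the function is called on every element.
lowered = grades.map(str.lower)

# Series: its index is used as the lookup key.
lookup = pd.Series({"A": "Excellent", "B": "Good", "C": "Fair"})
labels = grades.map(lookup)

print(positions.isna().tolist())  # [False, False, False, True]
print(lowered.tolist())           # ['a', 'b', 'c', 'e']
```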

Pandas Applymap Function

The applymap function applies a function to a DataFrame element-wise. Since pandas 2.1 it is deprecated in favor of DataFrame.map, which does the same thing.

Using applymap

Here we just count the number of characters in each value (length) across the data frame.

                    

apply_data_df.applymap(lambda x: len(str(x)))

Let’s add 5 to Math course

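Note that applymap is defined only on DataFrames. Selecting a single column with single brackets (for example apply_data_df['Math']) returns a Series, and calling applymap on it raises an error, so here we select two columns with double brackets to get a DataFrame. When you may be working with either a Series or a DataFrame, apply is the better choice because both support it.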

                    

apply_data_df[['Math','Computer']].applymap(lambda x: x+5)
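The sketch below shows two ways to handle a single column: Series.apply on the column itself, or double brackets to keep a one-column DataFrame. It also guards against the rename in newer pandas versions, where DataFrame.map replaces applymap; the marks frame is a small made-up example.

```python
import pandas as pd

marks = pd.DataFrame({"Math": [79.0, 67.0], "Computer": [84.0, 78.0]})

# A single column is a Series, so use Series.apply (or Series.map).
math_plus_5 = marks["Math"].apply(lambda x: x + 5)

# Double brackets keep a one-column DataFrame, which supports
# element-wise calls. DataFrame.map exists from pandas 2.1 onwards;
# fall back to applymap on older versions.
one_col = marks[["Math"]]
if hasattr(one_col, "map"):
    one_col_plus_5 = one_col.map(lambda x: x + 5)
else:
    one_col_plus_5 = one_col.applymap(lambda x: x + 5)

print(math_plus_5.tolist())             # [84.0, 72.0]
print(one_col_plus_5["Math"].tolist())  # [84.0, 72.0]
```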

For complete code check the jupyter notebook here.

Conclusion

Pandas provides us with powerful functions that we can apply to our data to perform complex operations in simple ways. We have looked at what the apply, map and applymap functions are and when to use each. We have also seen how to use lambda, which gives us the flexibility to pass functions anonymously. In the next post we will look at reshaping DataFrames and cover concepts like pivot tables, pivot, cross tabulation, melt, stacking and unstacking. To learn about handling missing data, check our previous post here.
