“Math is the language of the universe,” so they say. When it comes to data, mathematics is the language through which data communicates facts. Mathematics is at the heart of data science, analytics and machine learning. Pandas provides powerful mathematical tools for manipulating and analysing data. In this post we will look at Pandas mathematical functions and how to use them. In the next post we will dive into Pandas statistical functions and see how to use them to interpret our data.

Pandas Mathematical Functions

Create the DataFrame

                    

import pandas as pd
import numpy as np

students_score_df = pd.DataFrame(
    {
        "Students": ["Tom", "Peter", "Mary", "Smith"],
        "Reg_No": [1790, 1731, 1780, 1755],
        "Reg_Date": ["15/01/2021", "16/01/2021", "19/01/2021", "27/01/2021"],
        "Math": ["79.00", "67.00", "84.00", "70.00"],
        "Physics": ["60", "70", "50", "90"],
        "Computer": ["65.80", "80", "70", "75"],
    }
)

students_score_df

pandas-math-df

Check Data Types

                    

# Check that each column has an appropriate data type; if not, convert it
students_score_df.dtypes

pandas-check-datatypes

Convert Data Types to Integer/Float

                    

# Convert columns to appropriate data types
students_score_df[['Math','Physics','Computer']]=students_score_df[['Math','Physics','Computer']].astype(float) # Math, Physics and Computer need to be float (np.float was removed in NumPy 1.24, so use the built-in float)
students_score_df['Reg_No']=students_score_df['Reg_No'].astype(str) # Reg_No is an identifier, so store it as a string
students_score_df['Reg_Date']=pd.to_datetime(students_score_df['Reg_Date'], format='%d/%m/%Y') # Reg_Date is written day-first, so state the format explicitly

students_score_df.dtypes

pandas-recheck-datatypes
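
The conversion above assumes every score string is a valid number. As a minimal sketch of one way to cope when that is not the case, `pd.to_numeric` with `errors="coerce"` turns unparseable entries into NaN instead of raising. The `raw_scores` series and its "absent" entry are hypothetical and not part of the dataset used in this post.

```python
import pandas as pd

# Hypothetical column with one entry that is not a number
raw_scores = pd.Series(["79.00", "67.00", "absent", "70.00"])

# errors="coerce" replaces anything that cannot be parsed with NaN
clean_scores = pd.to_numeric(raw_scores, errors="coerce")

print(clean_scores.dtype)         # float64
print(clean_scores.isna().sum())  # 1
```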

 

Scalar Addition

Add a scalar value to every numeric element in the dataframe

                    

students_score_df[['Math','Physics','Computer']]=students_score_df[['Math','Physics','Computer']].add(5)
students_score_df.head()

pandas-scalar-addition
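
The `add()` method and the `+` operator should give the same result for a scalar. A small sketch to check this, using a hypothetical two-row `demo_df` rather than the full student dataframe:

```python
import pandas as pd

# Hypothetical stand-in for the numeric score columns
demo_df = pd.DataFrame({"Math": [79.0, 67.0], "Physics": [60.0, 70.0]})

with_method = demo_df.add(5)   # method form
with_operator = demo_df + 5    # operator form

# Both forms produce identical dataframes
print(with_method.equals(with_operator))  # True
```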

Element-wise addition

First, create two dataframes by subtracting a scalar from the score columns

                    

score_1_df=students_score_df[['Math','Physics','Computer']]-90
score_2_df=students_score_df[['Math','Physics','Computer']]-85

score_1_df
score_2_df

pandas-math-score-df1

pandas-math-score-df2

Add two dataframes element-wise

                    

# add the two dataframes
score_3_df=score_1_df.add(score_2_df)
score_3_df

pandas-math-element-wise-add
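
Element-wise addition matches values by row and column labels, not by position. The sketch below illustrates this with two hypothetical dataframes (`left` and `right`) whose indexes only partly overlap, and shows how the `fill_value` argument of `add()` treats a label that is missing on one side.

```python
import pandas as pd

# Hypothetical frames that share only the "Peter" row
left = pd.DataFrame({"Math": [1.0, 2.0]}, index=["Tom", "Peter"])
right = pd.DataFrame({"Math": [10.0, 20.0]}, index=["Peter", "Mary"])

# Labels present in only one frame produce NaN
plain_sum = left.add(right)

# fill_value=0 substitutes 0 for the missing side before adding
filled_sum = left.add(right, fill_value=0)

print(plain_sum)
print(filled_sum)
```

Here only Peter has a value in both frames, so `plain_sum` is NaN for Tom and Mary, while `filled_sum` keeps their single available value.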

Subtraction with Scalar value

students_score_df

pandas-math-student-score-df

                    

students_score_df[['Math','Physics','Computer']]=students_score_df[['Math','Physics','Computer']]-50

students_score_df

pandas-scalar-subtraction

Element-wise Subtraction

Subtract two dataframes element-wise

                    

# score_2_df-score_1_df # Option 1
score_4_df=score_2_df.sub(score_1_df) # Option 2
score_4_df

pandas-math-element-wise-subtraction

Multiplication with Scalar value

                    

students_score_df[['Math','Physics','Computer']]=students_score_df[['Math','Physics','Computer']]*-3
students_score_df

pandas-scalar-multiply

 

Element-wise Multiplication

Multiply two dataframes element-wise
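
A minimal sketch of element-wise multiplication, assuming it follows the same operator/method pattern as the other element-wise sections. The frames `score_a` and `score_b` are hypothetical stand-ins, not the `score_1_df` and `score_2_df` used elsewhere in this post.

```python
import pandas as pd

# Hypothetical stand-in dataframes with matching labels
score_a = pd.DataFrame({"Math": [2.0, 3.0], "Physics": [4.0, 5.0]})
score_b = pd.DataFrame({"Math": [10.0, 10.0], "Physics": [0.5, 2.0]})

product_operator = score_a * score_b    # Option 1
product_method = score_a.mul(score_b)   # Option 2

print(product_method)
```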

pandas-math-element-wise-multiply

Division with Scalar value

                    

students_score_df[['Math','Physics','Computer']]=students_score_df[['Math','Physics','Computer']]/3
students_score_df

pandas-scalar-division

Element-wise Division

Divide two dataframes element-wise

                    

# score_1_df / score_2_df # Option 1
score_1_df.div(score_2_df) # Option 2

pandas-math-element-wise-divide
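
A zero can appear in the divisor (for example, a score of 85 minus 85). With float columns, pandas does not raise an error in that case; it returns infinity or NaN. The sketch below uses hypothetical `numerator` and `denominator` frames to show this, and one possible way to turn the infinities into NaN afterwards.

```python
import numpy as np
import pandas as pd

# Hypothetical frames; the denominator contains zeros
numerator = pd.DataFrame({"Computer": [-5.0, 0.0, 10.0]})
denominator = pd.DataFrame({"Computer": [0.0, 0.0, 2.0]})

# Nonzero / 0 gives +/-inf, 0 / 0 gives NaN, no exception is raised
quotient = numerator.div(denominator)

# Optionally treat infinite results as missing values
cleaned = quotient.replace([np.inf, -np.inf], np.nan)

print(quotient)
print(cleaned)
```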

Pandas power function

Using ** to raise each element to a specified power

                    

score_1_df**2 # Raise each element to power 2

pandas-math-power-2star

Using the pow() function

The pow() function raises each element of the dataframe to the power given by other, element-wise (binary operator pow). It is equivalent to the ** operator, but its fill_value parameter lets you substitute a value for missing data in either input.

                    

score_1_df=score_1_df.pow(2)
score_1_df

pandas-math-power-pow
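
The missing-value handling of `pow()` comes from its `fill_value` argument. A small sketch of the difference from `**`, using hypothetical `base` and `exponent` frames that each contain one NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical frames, each with one missing value
base = pd.DataFrame({"Math": [2.0, np.nan, 4.0]})
exponent = pd.DataFrame({"Math": [3.0, 2.0, np.nan]})

# The ** operator propagates NaN from either side
plain = base ** exponent

# fill_value=1 replaces a value missing on one side with 1 before computing
filled = base.pow(exponent, fill_value=1)

print(plain)
print(filled)
```

With `fill_value=1`, the second row becomes 1 raised to 2 and the third becomes 4 raised to 1. If a value were missing in both frames at the same position, the result would still be NaN.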

Element-wise power along specified axis

When the exponent is a series, the axis parameter of pow() controls whether the series is matched against the row index or the column labels.
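
A minimal sketch of `pow()` with a series along each axis. The names `scores`, `col_exponents` and `row_exponents` are hypothetical and chosen only for illustration.

```python
import pandas as pd

scores = pd.DataFrame({"Math": [2.0, 3.0], "Physics": [4.0, 5.0]})

# One exponent per column: series labels match the column names
col_exponents = pd.Series({"Math": 2, "Physics": 3})
by_column = scores.pow(col_exponents, axis="columns")

# One exponent per row: series labels match the row index
row_exponents = pd.Series([1, 2], index=scores.index)
by_row = scores.pow(row_exponents, axis="index")

print(by_column)  # Math squared, Physics cubed
print(by_row)     # first row to the power 1, second row squared
```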

Logarithm to base 2

We can use NumPy's log functions to compute logarithms of the values in a dataframe.

                    

score_1_df['Log2_Computer']=np.log2(score_1_df['Computer'])
score_1_df

pandas-math-log-base-2

Logarithm to base 10

                    

score_1_df['Log10_Computer']=np.log10(score_1_df['Computer'])
score_1_df

pandas-math-log-base-10

Natural logarithm

                    

score_1_df['Natural_Log_Computer']=np.log(score_1_df['Computer'])
score_1_df

pandas-math-natural-log
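
NumPy provides `log2`, `log10` and `log` directly. For any other base, one option is the change-of-base identity, log_b(x) = ln(x) / ln(b). The sketch below applies it with base 5 to a hypothetical `computer` series.

```python
import numpy as np
import pandas as pd

# Hypothetical values that are exact powers of 5
computer = pd.Series([25.0, 125.0, 625.0], name="Computer")

# Change of base: log_5(x) = ln(x) / ln(5)
log5_computer = np.log(computer) / np.log(5)

print(log5_computer.round(6))  # approximately 2, 3 and 4
```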

Pandas aggregate agg function

score_1_df

pandas-math-agg-func

Aggregate by mean function
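
A minimal sketch of aggregating by mean, assuming a small stand-in dataframe (`demo_scores` is hypothetical, not the `score_1_df` shown above):

```python
import pandas as pd

demo_scores = pd.DataFrame({"Math": [36.0, 324.0], "Physics": [400.0, 100.0]})

# Passing the function name as a string returns one mean per column
column_means = demo_scores.agg("mean")

print(column_means)
```

For a single aggregation like this, the result matches calling `demo_scores.mean()` directly.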

pandas-math-mean-agg-func

Aggregate values along a specified axis

                    

score_1_df.agg('mean',axis=1) # axis=1 aggregates across the columns, giving one mean per row

pandas-math-agg-func-along-axis-1

Nesting multiple aggregations

                    

score_1_df.agg(['min','max','sum','mean','std','var'])

pandas-math-nest-agg-func

Using different agg() functions on each column

                    

score_1_df.agg({'Math':['sum','min','max'],'Log2_Computer':['mean'],
'Log10_Computer':['std'],'Natural_Log_Computer':['var']})

pandas-math-different-agg-func

Subtract one year from the date

                    

students_score_df['Reg_Date_Less_1_Yr']=students_score_df['Reg_Date']-pd.DateOffset(years=1)
students_score_df

pandas-math-subtract-year-from-date

Subtract one month from the date

                    

students_score_df['Reg_Date_Less_1_Mn']=students_score_df['Reg_Date']-pd.DateOffset(months=1)
students_score_df

pandas-math-subtract-month-from-date

Subtract one day from the date

                    

students_score_df['Reg_Date_Less_1_Day']=students_score_df['Reg_Date']-pd.DateOffset(days=1)
students_score_df

pandas-math-subtract-day-from-date
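
`pd.DateOffset` works in calendar units, so subtracting one month is not the same as subtracting a fixed number of days. The sketch below compares the two on a hypothetical `dates` series that includes a month-end date.

```python
import pandas as pd

# Hypothetical dates, written day-first like Reg_Date
dates = pd.Series(pd.to_datetime(["31/03/2021", "15/01/2021"], format="%d/%m/%Y"))

# Calendar arithmetic: 31 March minus one month is clipped to 28 February
minus_month = dates - pd.DateOffset(months=1)

# Fixed-length arithmetic: exactly 30 days earlier
minus_30_days = dates - pd.Timedelta(days=30)

print(minus_month)
print(minus_30_days)
```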

For the complete code, check the Jupyter notebook here.

Conclusion

In this post we have looked at common mathematical functions and operations in Pandas. Every analysis and insight has a mathematical underpinning, so the maths skills to manipulate and analyse data are key. In the next post we will learn about Pandas statistical functions and how they are applied to data. To learn how to work with dates and times in Pandas, check our previous post here.
