Dates play an important role in our life and so in our analyses. Most data analyses are based on a specific timeline. Working with dates in any analytical use-case is a fundamental task that each data practitioner has encountered with. Python and Pandas provides us with adequate tools to work with and manipulate date and time functions. This enables us to easily create powerful and insightful time series analyses and predictions for business cases. In this post we will look at Dates in Pandas and the most common pandas and Python date and time functions and how to use them to analyse our data. You can download the dataset for this post here.
Pandas Date Functions
Load Data
clv_df=pd.read_csv('clv.csv')
clv_df['text_with_date']='Some text '+clv_df['InvoiceDate'] +'Other text'
clv_df.head()
Check if the date columns has datetime type
clv_df.dtypes
Convert a String Columns to Datetime object
# clv_df['valid_invoice_date_object']=pd.to_datetime(clv_df['InvoiceDate'])
clv_df['valid_invoice_date_object']=pd.to_datetime(clv_df['InvoiceDate'],format='%d/%m/%Y') # add format
clv_df['date_only']=pd.to_datetime(clv_df['text_with_date'],format='Some text %d/%m/%YOther text') # add format
clv_df.head()
Check data types again
clv_df.dtypes
Extract Year from date
clv_df['valid_invoice_date_object'].dt.year.head()
Extract Month from date
clv_df['valid_invoice_date_object'].dt.month.head()
Extract Day from date
clv_df['valid_invoice_date_object'].dt.day.head()
Extract Day of Year from date
clv_df['valid_invoice_date_object'].dt.dayofyear.head()
Extract Week of Year from date
clv_df['valid_invoice_date_object'].dt.isocalendar().week.head()
Extract Day of Year from date
# clv_df['valid_invoice_date_object'].dt.dayofweek.head()
clv_df['valid_invoice_date_object'].dt.isocalendar().day.head()
Name of Day
def week_day(x):
return pd.Timestamp(x).day_name()
clv_df['valid_invoice_date_object'].apply(lambda x: week_day(x)).head()
Extract Quarter of Year from date
clv_df['valid_invoice_date_object'].dt.quarter.head()
Extract Number of Days in a month
clv_df['valid_invoice_date_object'].dt.days_in_month.head()
Check for leap year
clv_df['valid_invoice_date_object'].dt.is_leap_year.head()
Extract Hour from date
First let us create a dataframe with date time column
dates=pd.date_range('1/12/2022', periods = 100, freq ='H')
date_df=pd.DataFrame(dates,columns=['DatetimeColumn'])
date_df.head()
Now extract hour from datetime column
date_df['DatetimeColumn'].dt.hour.head()
Extract Minute from date
date_df['DatetimeColumn'].dt.minute.head()
Extract Second from date
date_df['DatetimeColumn'].dt.second.head()
Python Date Functions
Python Basic Date Functions
from datetime import date
python_date=date.today()
print(python_date)
python_date.day
python_date.month
python_date.year
Python Basic Time Functions
from datetime import time
python_time=time(15,12,23,56)
print(python_time)
python_time.hour
python_time.minute
python_time.second
Python Basic Datetime Functions
from datetime import datetime
python_datetime=datetime.now()
print(python_datetime)
python_datetime.date()
python_datetime.time()
python_datetime.strftime('%A') # Add formart to extract more features. Refer to Python documentation
# for more details https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes
Python Show Full Year Calendar
import calendar
print(calendar.calendar(2022))
One Month Calendar
print(calendar.month(2022,5))
For complete code check the jupyter notebook here.
Conclusion
In this post we have looked at various functions for date and time manipulation form pandas and Python. Datetime data is important for time-series analysis and forecasting. Acquiring the skills to work with dates and time is important for every data scientist and analytical practitioner. In the next post we will look at various Mathematical functions that pandas provides us with and how to apply them. To learn about string manipulation in pandas check our previous post here.