7. Mastering Time Series in Python: DateTime Indexing, Resampling, and Grouping

Introduction

Time series data (data collected over time at regular or irregular intervals) is fundamental in fields such as finance, IoT, sales, and weather monitoring. Managing and analyzing it efficiently helps you spot trends, reveal seasonality, and build forecasts. Python's pandas library offers robust tools for working with time-indexed data.

This tutorial focuses on three critical concepts for time series analysis:

  1. DateTime Indexing – making time the core of your DataFrame.

  2. Resampling and Frequency Conversion – converting between different time intervals.

  3. Time-based Grouping – summarizing and aggregating time periods.

Each section comes with two distinct examples that apply these techniques to small datasets such as temperature logs, sales records, and website visits. Whether you’re new to time series or refreshing your skills, these examples will help you manipulate and analyze temporal data with confidence.

Let’s dive into the world of time-driven data with Pandas!


1. DateTime Indexing

Example 1: Temperature Readings

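import pandas as pd

# Four daily temperature readings, indexed by date
temps = [20, 21, 19, 22]
dates = pd.date_range('2025-01-01', periods=4, freq='D')
df1 = pd.DataFrame({'temperature': temps}, index=dates)

# Select a row by date with .loc; plain df1['2025-01-02'] would look up a column
print(df1.loc['2025-01-02'])

Output:

temperature    21
Name: 2025-01-02 00:00:00, dtype: int64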
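
In practice, timestamps often arrive as a column of strings rather than as a ready-made index. A minimal sketch of promoting such a column to a DatetimeIndex, assuming a hypothetical `raw` DataFrame with a `timestamp` column (both names are illustrative):

```python
import pandas as pd

# Hypothetical raw data: timestamps stored as plain strings, out of order.
raw = pd.DataFrame({
    'timestamp': ['2025-01-03', '2025-01-01', '2025-01-02'],
    'temperature': [19, 20, 21],
})

# Parse the strings, move them into the index, and sort chronologically.
raw['timestamp'] = pd.to_datetime(raw['timestamp'])
indexed = raw.set_index('timestamp').sort_index()

print(type(indexed.index).__name__)
print(indexed.loc['2025-01-02', 'temperature'])
```

Sorting the index is not strictly required for lookups by a single date, but it keeps date-range slices predictable.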

Example 2: Sales Data

sales_data = {'sales': [200, 340, 290, 310]}
sales_dates = pd.to_datetime(['2025-03-01', '2025-03-03', '2025-03-05', '2025-03-07'])
df2 = pd.DataFrame(sales_data, index=sales_dates)

# Partial string indexing: .loc with 'YYYY-MM' selects every row in that month
print(df2.loc['2025-03'])

Output:

            sales
2025-03-01    200
2025-03-03    340
2025-03-05    290
2025-03-07    310
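
A DatetimeIndex also accepts label slices, which is handy for pulling out an arbitrary date window. A small sketch that reuses the sales figures from this example; the `window` name is illustrative:

```python
import pandas as pd

sales_dates = pd.to_datetime(['2025-03-01', '2025-03-03', '2025-03-05', '2025-03-07'])
df2 = pd.DataFrame({'sales': [200, 340, 290, 310]}, index=sales_dates)

# Label slices on a sorted DatetimeIndex include both endpoints,
# and the endpoints do not need to be present in the index.
window = df2.loc['2025-03-02':'2025-03-05']
print(window)
```

Here the slice should keep the rows for March 3 and March 5, since both fall inside the inclusive range.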

2. Resampling and Frequency Conversion

Example 1: Downsampling (Daily to Monthly)

daily_data = pd.Series([10, 12, 15, 20, 25, 28, 30],
                       index=pd.date_range('2025-03-01', periods=7, freq='D'))

# 'ME' is the month-end alias in pandas 2.2+; on older versions use 'M'
monthly_avg = daily_data.resample('ME').mean()
print(monthly_avg)

Output:

2025-03-31    20.0
Freq: ME, dtype: float64
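
Downsampling is not limited to a single statistic or to monthly bins. As a sketch under the same daily data, the following resamples to weekly bins and computes several aggregates at once; the `weekly` name is illustrative:

```python
import pandas as pd

daily_data = pd.Series([10, 12, 15, 20, 25, 28, 30],
                       index=pd.date_range('2025-03-01', periods=7, freq='D'))

# 'W' bins end on Sunday by default and are labeled with that end date.
# Passing a list to agg() returns one column per statistic.
weekly = daily_data.resample('W').agg(['sum', 'mean', 'max'])
print(weekly)
```

Because 2025-03-01 is a Saturday, the first bin holds only two days (labeled 2025-03-02) and the remaining five days fall into the bin labeled 2025-03-09.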

Example 2: Upsampling (Monthly to Daily with Forward Fill)

monthly_sales = pd.Series([300, 450],
                          index=pd.to_datetime(['2025-01-01', '2025-02-01']))

# Upsample to daily frequency; each new day repeats the last known value
daily_filled = monthly_sales.resample('D').ffill()
print(daily_filled.head(10))

Output:

2025-01-01    300
2025-01-02    300
2025-01-03    300
2025-01-04    300
2025-01-05    300
2025-01-06    300
2025-01-07    300
2025-01-08    300
2025-01-09    300
2025-01-10    300
Freq: D, dtype: int64
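
Forward fill holds each value flat until the next observation. When a gradual change between observations is a more reasonable assumption, linear interpolation is one alternative. A minimal sketch on the same two monthly points; the `daily_linear` name is illustrative:

```python
import pandas as pd

monthly_sales = pd.Series([300, 450],
                          index=pd.to_datetime(['2025-01-01', '2025-02-01']))

# asfreq() inserts the missing days as NaN; interpolate() then fills
# them on a straight line between the known values.
daily_linear = monthly_sales.resample('D').asfreq().interpolate(method='linear')

print(daily_linear.head(3))
print(daily_linear.tail(1))
```

Which fill method is appropriate depends on what the numbers mean: a monthly total spread across days calls for different handling than a level observed at the start of each month.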

3. Time-based Grouping

Example 1: Grouping by Month

data = {'visits': [100, 150, 120, 170]}
dates = pd.to_datetime(['2025-01-10', '2025-01-25', '2025-02-01', '2025-02-18'])
df = pd.DataFrame(data, index=dates)

# Group rows by the month number of their timestamp and total the visits
monthly_summary = df.groupby(df.index.month).sum()
print(monthly_summary)

Output:

   visits
1     250
2     290
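
Grouping by month number works when all the data falls in one year, but it merges the same month from different years. A sketch of the difference, using hypothetical visit counts that span 2024 and 2025 and grouping by a monthly period instead:

```python
import pandas as pd

# Hypothetical visits spanning two different years.
dates = pd.to_datetime(['2024-01-10', '2024-02-05', '2025-01-25', '2025-02-18'])
df = pd.DataFrame({'visits': [100, 150, 120, 170]}, index=dates)

# Month number alone merges January 2024 with January 2025.
by_month_number = df.groupby(df.index.month).sum()

# A monthly Period keeps each calendar month separate.
by_year_month = df.groupby(df.index.to_period('M')).sum()

print(by_month_number)
print(by_year_month)
```

The first result has two rows (one per month number), while the second has four (one per year-month pair).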

Example 2: Grouping by Day of the Week

sales = {'sales': [120, 130, 115, 140, 150, 170, 160]}
dates = pd.date_range('2025-03-01', periods=7, freq='D')  # 2025-03-01 is a Saturday
df = pd.DataFrame(sales, index=dates)

# Group by weekday name; groupby sorts the resulting labels alphabetically
weekly_trend = df.groupby(df.index.day_name()).mean()
print(weekly_trend)

Output:

           sales
Friday     160.0
Monday     115.0
Saturday   120.0
Sunday     130.0
Thursday   170.0
Tuesday    140.0
Wednesday  150.0
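
The grouped result lists weekdays alphabetically because `groupby` sorts its keys. A minimal sketch of restoring calendar order with `reindex`; the `weekday_order` list is an illustrative helper, not part of the example above:

```python
import pandas as pd

dates = pd.date_range('2025-03-01', periods=7, freq='D')
df = pd.DataFrame({'sales': [120, 130, 115, 140, 150, 170, 160]}, index=dates)

weekday_order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
                 'Friday', 'Saturday', 'Sunday']

# reindex() rearranges the rows to follow the given label order.
weekly_trend = df.groupby(df.index.day_name()).mean().reindex(weekday_order)
print(weekly_trend)
```

Any weekday missing from the data would appear as NaN after `reindex`, which makes gaps easy to spot.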

Summary

Time series data is at the core of forecasting and temporal analysis. With Pandas, you can unlock powerful tools like:

  • DateTime indexing to filter data by date or period.

  • Resampling to convert between different frequencies (daily to monthly, monthly to daily, etc.).

  • Grouping by time units (like month or weekday) for summarized insights.

You now have clear examples and outputs that demonstrate how to apply these techniques in real-life scenarios.
