Skip to content

Working with Dates and Times (to_datetime, dt accessor)

Why datetime is a core analytics skill

Dates show up everywhere:

  • Orders, sessions, support tickets
  • Time-based KPIs (DAU/WAU/MAU)
  • Trend analysis and seasonality

In real datasets, datetime values often start as strings, so you must parse them correctly.

Example dataset

Datetime example
import pandas as pd
 
df = pd.DataFrame({
    "order_id": [101, 102, 103, 104],
    "order_time": [
        "2025-01-01 10:15:00",
        "2025-01-01 10:45:00",
        "2025-01-02 09:10:00",
        "2025-01-03 18:30:00",
    ],
    "amount": [250, 180, 90, 300],
})
 
print(df.dtypes)
Datetime example
import pandas as pd
 
df = pd.DataFrame({
    "order_id": [101, 102, 103, 104],
    "order_time": [
        "2025-01-01 10:15:00",
        "2025-01-01 10:45:00",
        "2025-01-02 09:10:00",
        "2025-01-03 18:30:00",
    ],
    "amount": [250, 180, 90, 300],
})
 
print(df.dtypes)

Parse strings into datetime: to_datetimeto_datetime

Parse datetime
df["order_time"] = pd.to_datetime(df["order_time"], errors="coerce")
print(df.dtypes)
Parse datetime
df["order_time"] = pd.to_datetime(df["order_time"], errors="coerce")
print(df.dtypes)
  • errors="coerce"errors="coerce" converts invalid timestamps to NaTNaT (missing datetime).

Extract parts of datetime with .dt.dt

Datetime parts
df["date"] = df["order_time"].dt.date
 
df["year"] = df["order_time"].dt.year
df["month"] = df["order_time"].dt.month
df["day"] = df["order_time"].dt.day
df["hour"] = df["order_time"].dt.hour
 
df["weekday"] = df["order_time"].dt.day_name()
 
print(df)
Datetime parts
df["date"] = df["order_time"].dt.date
 
df["year"] = df["order_time"].dt.year
df["month"] = df["order_time"].dt.month
df["day"] = df["order_time"].dt.day
df["hour"] = df["order_time"].dt.hour
 
df["weekday"] = df["order_time"].dt.day_name()
 
print(df)

Group by date (daily totals)

Daily totals
daily = (
    df.groupby(df["order_time"].dt.date)
    .agg(total_amount=("amount", "sum"), orders=("order_id", "count"))
)
 
print(daily)
Daily totals
daily = (
    df.groupby(df["order_time"].dt.date)
    .agg(total_amount=("amount", "sum"), orders=("order_id", "count"))
)
 
print(daily)

Timezone quick note

If you work with timezones:

  • Store timestamps in UTC when possible
  • Convert for display/reporting
Timezone example
# df["order_time"] = df["order_time"].dt.tz_localize("UTC")
# df["order_time_ist"] = df["order_time"].dt.tz_convert("Asia/Kolkata")
Timezone example
# df["order_time"] = df["order_time"].dt.tz_localize("UTC")
# df["order_time_ist"] = df["order_time"].dt.tz_convert("Asia/Kolkata")

Common pitfalls

  • Don’t compare datetime strings if you can parse them.
  • Mixed formats can create parsing issues β€” use errors="coerce"errors="coerce" and investigate invalid rows.
  • Missing datetimes are stored as NaTNaT (not NaNNaN).

If this helped you, consider buying me a coffee β˜•

Buy me a coffee

Was this page helpful?

Let us know how we did