Working with Dates and Times (to_datetime, dt accessor)
Why datetime is a core analytics skill
Dates show up everywhere:
- Orders, sessions, support tickets
- Time-based KPIs (DAU/WAU/MAU)
- Trend analysis and seasonality
In real datasets, datetime values often start as strings, so you must parse them correctly.
Example dataset
Datetime example
import pandas as pd
df = pd.DataFrame({
"order_id": [101, 102, 103, 104],
"order_time": [
"2025-01-01 10:15:00",
"2025-01-01 10:45:00",
"2025-01-02 09:10:00",
"2025-01-03 18:30:00",
],
"amount": [250, 180, 90, 300],
})
print(df.dtypes)Datetime example
import pandas as pd
df = pd.DataFrame({
"order_id": [101, 102, 103, 104],
"order_time": [
"2025-01-01 10:15:00",
"2025-01-01 10:45:00",
"2025-01-02 09:10:00",
"2025-01-03 18:30:00",
],
"amount": [250, 180, 90, 300],
})
print(df.dtypes)Parse strings into datetime: to_datetimeto_datetime
Parse datetime
df["order_time"] = pd.to_datetime(df["order_time"], errors="coerce")
print(df.dtypes)Parse datetime
df["order_time"] = pd.to_datetime(df["order_time"], errors="coerce")
print(df.dtypes)errors="coerce"errors="coerce"converts invalid timestamps toNaTNaT(missing datetime).
Extract parts of datetime with .dt.dt
Datetime parts
df["date"] = df["order_time"].dt.date
df["year"] = df["order_time"].dt.year
df["month"] = df["order_time"].dt.month
df["day"] = df["order_time"].dt.day
df["hour"] = df["order_time"].dt.hour
df["weekday"] = df["order_time"].dt.day_name()
print(df)Datetime parts
df["date"] = df["order_time"].dt.date
df["year"] = df["order_time"].dt.year
df["month"] = df["order_time"].dt.month
df["day"] = df["order_time"].dt.day
df["hour"] = df["order_time"].dt.hour
df["weekday"] = df["order_time"].dt.day_name()
print(df)Group by date (daily totals)
Daily totals
daily = (
df.groupby(df["order_time"].dt.date)
.agg(total_amount=("amount", "sum"), orders=("order_id", "count"))
)
print(daily)Daily totals
daily = (
df.groupby(df["order_time"].dt.date)
.agg(total_amount=("amount", "sum"), orders=("order_id", "count"))
)
print(daily)Timezone quick note
If you work with timezones:
- Store timestamps in UTC when possible
- Convert for display/reporting
Timezone example
# df["order_time"] = df["order_time"].dt.tz_localize("UTC")
# df["order_time_ist"] = df["order_time"].dt.tz_convert("Asia/Kolkata")Timezone example
# df["order_time"] = df["order_time"].dt.tz_localize("UTC")
# df["order_time_ist"] = df["order_time"].dt.tz_convert("Asia/Kolkata")Common pitfalls
- Donβt compare datetime strings if you can parse them.
- Mixed formats can create parsing issues β use
errors="coerce"errors="coerce"and investigate invalid rows. - Missing datetimes are stored as
NaTNaT(notNaNNaN).
If this helped you, consider buying me a coffee β
Buy me a coffeeWas this page helpful?
Let us know how we did
