Skip to content

Reshaping Data (pivot, pivot_table, melt)

Long vs wide (why reshaping exists)

  • Long format: one row per observation (better for analysis/plotting)
  • Wide format: values spread across columns (good for reporting)

Example: long format

Long format
import pandas as pd
 
df = pd.DataFrame({
    "date": ["2025-01-01", "2025-01-01", "2025-01-02", "2025-01-02"],
    "city": ["pune", "delhi", "pune", "delhi"],
    "sales": [100, 120, 90, 150],
})
 
print(df)
Long format
import pandas as pd
 
df = pd.DataFrame({
    "date": ["2025-01-01", "2025-01-01", "2025-01-02", "2025-01-02"],
    "city": ["pune", "delhi", "pune", "delhi"],
    "sales": [100, 120, 90, 150],
})
 
print(df)

pivotpivot: reshape when data is unique

pivot
wide = df.pivot(index="date", columns="city", values="sales")
print(wide)
pivot
wide = df.pivot(index="date", columns="city", values="sales")
print(wide)

If the combination of (indexindex, columnscolumns) isn’t unique, pivotpivot throws an error.

pivot_tablepivot_table: reshape with aggregation

pivot_table
wide = df.pivot_table(index="date", columns="city", values="sales", aggfunc="sum")
print(wide)
pivot_table
wide = df.pivot_table(index="date", columns="city", values="sales", aggfunc="sum")
print(wide)

meltmelt: wide → long

melt
wide = pd.DataFrame({
    "date": ["2025-01-01", "2025-01-02"],
    "pune": [100, 90],
    "delhi": [120, 150],
})
 
long = wide.melt(id_vars=["date"], var_name="city", value_name="sales")
print(long)
melt
wide = pd.DataFrame({
    "date": ["2025-01-01", "2025-01-02"],
    "pune": [100, 90],
    "delhi": [120, 150],
})
 
long = wide.melt(id_vars=["date"], var_name="city", value_name="sales")
print(long)

Practical notes

  • Many ML/EDA workflows prefer long format.
  • Pivoted tables often have a multi-index for columns—use reset_index()reset_index() or wide.columnswide.columns renaming if needed.

If this helped you, consider buying me a coffee ☕

Buy me a coffee

Was this page helpful?

Let us know how we did