E-commerce Sales Analysis
Goal
Given an e-commerce orders dataset, answer:
- Overall revenue and order trends
- Top products / categories
- Average order value (AOV)
- Revenue by segment (city, category)
Example columns
order_idorder_id,order_dateorder_date,customer_idcustomer_idproductproduct,categorycategory,qtyqty,pricepricecitycity,statusstatus
Step 1: Load and basic prep
Load
import pandas as pd
df = pd.read_csv("data/orders.csv")
df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")
df["revenue"] = df["qty"] * df["price"]
print(df.shape)
print(df.head())Load
import pandas as pd
df = pd.read_csv("data/orders.csv")
df["order_date"] = pd.to_datetime(df["order_date"], errors="coerce")
df["revenue"] = df["qty"] * df["price"]
print(df.shape)
print(df.head())Step 2: KPIs
KPIs
orders = df["order_id"].nunique()
customers = df["customer_id"].nunique()
revenue = df["revenue"].sum()
aov = revenue / orders
print("orders:", orders)
print("customers:", customers)
print("revenue:", revenue)
print("aov:", aov)KPIs
orders = df["order_id"].nunique()
customers = df["customer_id"].nunique()
revenue = df["revenue"].sum()
aov = revenue / orders
print("orders:", orders)
print("customers:", customers)
print("revenue:", revenue)
print("aov:", aov)Step 3: Revenue trend
Monthly revenue
import matplotlib.pyplot as plt
monthly = (
df.dropna(subset=["order_date"])
.groupby(df["order_date"].dt.to_period("M"))
.agg(revenue=("revenue", "sum"), orders=("order_id", "nunique"))
.reset_index()
)
monthly["order_date"] = monthly["order_date"].astype(str)
plt.figure(figsize=(10, 4))
plt.plot(monthly["order_date"], monthly["revenue"], marker="o")
plt.title("Monthly revenue")
plt.xticks(rotation=30)
plt.tight_layout()
plt.show()Monthly revenue
import matplotlib.pyplot as plt
monthly = (
df.dropna(subset=["order_date"])
.groupby(df["order_date"].dt.to_period("M"))
.agg(revenue=("revenue", "sum"), orders=("order_id", "nunique"))
.reset_index()
)
monthly["order_date"] = monthly["order_date"].astype(str)
plt.figure(figsize=(10, 4))
plt.plot(monthly["order_date"], monthly["revenue"], marker="o")
plt.title("Monthly revenue")
plt.xticks(rotation=30)
plt.tight_layout()
plt.show()Step 4: Top products
Top products
import seaborn as sns
import matplotlib.pyplot as plt
top = (
df.groupby("product")
.agg(revenue=("revenue", "sum"))
.sort_values("revenue", ascending=False)
.head(10)
.reset_index()
)
plt.figure(figsize=(10, 4))
sns.barplot(data=top, x="revenue", y="product")
plt.title("Top 10 products by revenue")
plt.tight_layout()
plt.show()Top products
import seaborn as sns
import matplotlib.pyplot as plt
top = (
df.groupby("product")
.agg(revenue=("revenue", "sum"))
.sort_values("revenue", ascending=False)
.head(10)
.reset_index()
)
plt.figure(figsize=(10, 4))
sns.barplot(data=top, x="revenue", y="product")
plt.title("Top 10 products by revenue")
plt.tight_layout()
plt.show()Deliverable
Summarize:
- Revenue trend (up/down)
- Main revenue drivers (top products/categories)
- Any anomalies (returns, cancellations)
If this helped you, consider buying me a coffee ☕
Buy me a coffeeWas this page helpful?
Let us know how we did
