Uber Ride Data Analysis

Goal

Given Uber ride/trip records:

  • Find peak hours and weekdays
  • Visualize trip volume over time
  • Identify hotspots (if location data exists)

Step 1: Load and parse timestamps

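Load trips
import pandas as pd

# Read the raw trip records.
df = pd.read_csv("data/uber.csv")

# Parse timestamps; values that cannot be parsed become NaT instead of raising.
df["timestamp"] = pd.to_datetime(df["timestamp"], errors="coerce")
print(df.head())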
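
Because `errors="coerce"` silently turns bad values into `NaT`, it can help to count how many rows were affected before moving on. The sketch below uses a small made-up frame in place of `data/uber.csv`; the names `sample`, `n_bad`, and `clean` are illustrative only.

```python
import pandas as pd

# Made-up sample standing in for data/uber.csv; one value is deliberately malformed.
sample = pd.DataFrame(
    {"timestamp": ["2014-04-01 00:11:00", "not a date", "2014-04-01 17:45:00"]}
)
sample["timestamp"] = pd.to_datetime(sample["timestamp"], errors="coerce")

# Count the values that could not be parsed (they are now NaT).
n_bad = int(sample["timestamp"].isna().sum())
print(f"{n_bad} of {len(sample)} timestamps failed to parse")

# Keep only rows with a valid timestamp for the later steps.
clean = sample.dropna(subset=["timestamp"]).reset_index(drop=True)
```

If a large share of rows fails to parse, the timestamp format in the file is probably worth inspecting before trusting any of the counts that follow.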

Step 2: Extract time features

Time features
# Hour of day (0-23) and weekday name; rows with a NaT timestamp yield NaN in both.
df["hour"] = df["timestamp"].dt.hour
df["weekday"] = df["timestamp"].dt.day_name()
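
The `weekday` column can be used to find the peak weekday directly. This is a minimal sketch on made-up timestamps; `order`, `weekday_counts`, and `peak_weekday` are names introduced here for illustration. Reindexing puts the counts in calendar order rather than the frequency order that `value_counts()` returns.

```python
import pandas as pd

# Made-up timestamps for illustration only.
sample = pd.DataFrame({"timestamp": pd.to_datetime([
    "2014-04-07 08:00",  # Monday
    "2014-04-07 18:30",  # Monday
    "2014-04-09 09:15",  # Wednesday
    "2014-04-12 23:50",  # Saturday
])})
sample["weekday"] = sample["timestamp"].dt.day_name()

# Calendar order, so weekdays with zero trips still appear in the result.
order = ["Monday", "Tuesday", "Wednesday", "Thursday",
         "Friday", "Saturday", "Sunday"]
weekday_counts = sample["weekday"].value_counts().reindex(order, fill_value=0)

# Weekday with the highest trip count.
peak_weekday = weekday_counts.idxmax()
print(weekday_counts)
print("Peak weekday:", peak_weekday)
```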

Step 3: Plot trips by hour

Trips by hour
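import seaborn as sns
import matplotlib.pyplot as plt

# Drop rows without a parsed timestamp so hours plot as integers, not floats.
hourly = df.dropna(subset=["hour"]).astype({"hour": int})

plt.figure(figsize=(8, 4))
sns.countplot(data=hourly, x="hour")
plt.title("Trips by hour")
plt.tight_layout()
plt.show()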
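
The chart shows the shape of the day; to report peak hours as numbers, the same counts can be computed without plotting. A small sketch, assuming an `hour` column like the one built in Step 2 (the values here are made up, and `hour_counts` and `top_hours` are illustrative names):

```python
import pandas as pd

# Made-up hours standing in for df["hour"].
sample = pd.DataFrame({"hour": [8, 8, 17, 17, 17, 18, 23]})

# Trips per hour, ordered by hour of day.
hour_counts = sample["hour"].value_counts().sort_index()

# The three busiest hours, largest first.
top_hours = hour_counts.nlargest(3)
print(top_hours)
```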

Step 4: Trend over dates

Trips by day
import matplotlib.pyplot as plt
 
# Count trips per calendar day, ignoring rows whose timestamp failed to parse.
valid = df.dropna(subset=["timestamp"])
daily = valid.groupby(valid["timestamp"].dt.date).size()
 
plt.figure(figsize=(10, 4))
plt.plot(daily.index, daily.values)
plt.title("Trips over time")
plt.xticks(rotation=20)
plt.tight_layout()
plt.show()
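
Daily counts usually carry a strong weekly cycle, which can hide slower trend changes. One common way to smooth it is a 7-day rolling mean. The sketch below uses made-up daily counts and an illustrative name, `weekly_avg`; it assumes one row per calendar day with no gaps.

```python
import pandas as pd

# Made-up daily trip counts for illustration; real values come from the data.
dates = pd.date_range("2014-04-01", periods=14, freq="D")
daily = pd.Series(
    [10, 12, 11, 15, 20, 25, 18, 11, 13, 12, 16, 22, 27, 19], index=dates
)

# 7-day rolling mean; the first six entries are NaN because the window is not full.
weekly_avg = daily.rolling(window=7).mean()
print(weekly_avg.dropna().round(1))
```

Note that an integer window counts rows, not calendar days, so days with zero trips that are missing from the grouped result would need to be filled in first for the window to span exactly one week.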

Deliverable

  • Peak hours
  • Peak weekdays
  • Trend changes (seasonality/events)
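
The goal list also mentions hotspots, which the steps above do not cover. If the data includes pickup coordinates, one simple approach is to round them onto a coarse grid and count trips per cell. This is only a sketch: the column names `lat` and `lon` are assumptions and may not match the actual file, and the coordinates below are made up.

```python
import pandas as pd

# Hypothetical pickup coordinates; the column names "lat" and "lon" are assumptions.
sample = pd.DataFrame({
    "lat": [40.7581, 40.7579, 40.7612, 40.6413, 40.7583],
    "lon": [-73.9855, -73.9851, -73.9776, -73.7781, -73.9852],
})

# Round to 2 decimal places (cells of roughly 1 km) and count trips per cell.
cells = sample.assign(
    lat_bin=sample["lat"].round(2),
    lon_bin=sample["lon"].round(2),
)
hotspots = (
    cells.groupby(["lat_bin", "lon_bin"]).size().sort_values(ascending=False)
)
print(hotspots.head())
```

The number of decimals controls the cell size; fewer decimals give larger, coarser cells.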
