Global Terrorism Database Analysis
Goal
Use GTD-like incident data to:
- Visualize incidents over time
- Compare regions
- Identify major trend changes
Important note
This is a sensitive dataset. Focus on high-level aggregated analysis and avoid sensationalizing.
Step 1: Load
Load
import pandas as pd
df = pd.read_csv("data/gtd.csv")
print(df.head())
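Before aggregating, it can help to confirm that the columns used in the later steps are present and that the year parses as a number. This is a minimal sketch, assuming the file has `year` and `region` columns as the steps in this guide do. The `check_columns` helper and the small inline sample are hypothetical, for illustration only.

```python
import io
import pandas as pd

# Hypothetical sample standing in for data/gtd.csv (made-up rows, illustration only).
sample_csv = io.StringIO(
    "year,region\n"
    "2001,Region A\n"
    "2001,Region B\n"
    "2002,Region A\n"
    "unknown,Region B\n"
)

REQUIRED = ["year", "region"]  # assumed column names, matching the later steps


def check_columns(frame, required=REQUIRED):
    """Raise a clear error if any column needed later is missing."""
    missing = [c for c in required if c not in frame.columns]
    if missing:
        raise KeyError(f"Missing expected columns: {missing}")
    return frame


df_sample = check_columns(pd.read_csv(sample_csv))

# Coerce year to a number; unparseable values become NaN instead of raising.
df_sample["year"] = pd.to_numeric(df_sample["year"], errors="coerce")
bad_years = int(df_sample["year"].isna().sum())
print(df_sample.dtypes)
print(bad_years, "row(s) with an unparseable year")
```

Rows with an unparseable year are worth counting and reporting rather than silently dropping, since they affect every per-year total.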
Step 2: Incidents by year
Incidents per year
import matplotlib.pyplot as plt
# Count rows per year (assumes one row per incident)
yearly = df.groupby("year").size()
plt.figure(figsize=(10, 4))
plt.plot(yearly.index, yearly.values)
plt.title("Incidents by year")
plt.xlabel("Year")
plt.ylabel("Count")
plt.tight_layout()
plt.show()
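The line chart shows the overall shape, but the goal of identifying major trend changes is easier to meet with numbers. One possible approach, sketched here under assumptions, is to compute the year-over-year change and a rolling mean, then flag years whose relative change passes a threshold. The counts below are made up, and the 50% threshold and 3-year window are arbitrary choices, not values from the dataset.

```python
import pandas as pd

# Hypothetical yearly counts (made-up numbers), standing in for
# df.groupby("year").size() from the step above.
yearly_sample = pd.Series(
    [100, 110, 105, 200, 210, 120],
    index=pd.Index([2000, 2001, 2002, 2003, 2004, 2005], name="year"),
)

# Absolute and relative change from the previous year.
change = yearly_sample.diff()
pct_change = change / yearly_sample.shift(1) * 100

# A 3-year centered rolling mean smooths single-year noise.
smoothed = yearly_sample.rolling(window=3, center=True).mean()

# Flag years whose relative change exceeds an assumed, arbitrary threshold.
THRESHOLD_PCT = 50
flagged = pct_change[pct_change.abs() > THRESHOLD_PCT]

summary = pd.DataFrame(
    {
        "count": yearly_sample,
        "change": change,
        "pct_change": pct_change.round(1),
        "rolling_mean": smoothed.round(1),
    }
)
print(summary)
print("Years flagged:", list(flagged.index))
```

A flagged year is a prompt to look closer, not a finding on its own: a jump can reflect a change in data collection as well as a change in events.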
Step 3: Compare regions
Top regions
import seaborn as sns
import matplotlib.pyplot as plt
# Keep the ten regions with the most incidents as a two-column table
regions = df["region"].value_counts().head(10).reset_index()
regions.columns = ["region", "count"]
plt.figure(figsize=(10, 4))
sns.barplot(data=regions, x="count", y="region")
plt.title("Top regions by incident count")
plt.tight_layout()
plt.show()
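Total counts hide how the regional mix shifts over time. As a rough sketch, a year-by-region table and each region's yearly share can be built with `pd.crosstab`. The rows below are made up, and the column names `year` and `region` are the same assumptions used in the steps above.

```python
import pandas as pd

# Hypothetical incident rows (made-up), one row per incident.
df_sample = pd.DataFrame(
    {
        "year": [2001, 2001, 2001, 2002, 2002, 2002, 2002],
        "region": [
            "Region A", "Region A", "Region B",
            "Region A", "Region B", "Region B", "Region B",
        ],
    }
)

# Rows are years, columns are regions, cells are incident counts.
by_year_region = pd.crosstab(df_sample["year"], df_sample["region"])

# Each region's share of that year's incidents, in percent.
share = by_year_region.div(by_year_region.sum(axis=1), axis=0) * 100

print(by_year_region)
print(share.round(1))
```

Shares are useful alongside raw counts because a region can hold a steady count while its share falls as totals rise elsewhere.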
Deliverable
- Overall trend
- Top regions by incidents
- Notes about missingness/reporting bias
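For the notes on missingness, a short per-column summary and a check for years with no rows at all can serve as a starting point. This is a sketch on made-up data, assuming the same `year` and `region` columns. A year with no rows may point to a reporting gap rather than an absence of incidents.

```python
import pandas as pd

# Hypothetical rows (made-up) with some missing regions and a missing year.
df_sample = pd.DataFrame(
    {
        "year": [2001, 2001, 2002, 2004],
        "region": ["Region A", None, "Region B", None],
    }
)

# Count and percentage of missing values per column.
missing = pd.DataFrame(
    {
        "missing": df_sample.isna().sum(),
        "pct_missing": (df_sample.isna().mean() * 100).round(1),
    }
)
print(missing)

# Years inside the covered range that have no rows at all.
years = df_sample["year"].dropna().astype(int).tolist()
all_years = range(min(years), max(years) + 1)
gaps = sorted(set(all_years) - set(years))
print("Years with no rows:", gaps)
```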
