Global Terrorism Database Analysis

Goal

Use GTD-like incident data to:

Visualize incidents over time
Compare regions
Identify major trend changes

Important note

This is a sensitive dataset. Keep the analysis at a high, aggregated level and avoid sensationalizing the results.

Step 1: Load

Load

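import pandas as pd

# Load the incident records; the path is relative to the working directory.
df = pd.read_csv("data/gtd.csv")
# Preview the first rows to confirm the file parsed as expected.
print(df.head())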
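
The later steps rely on two columns, "year" and "region". A minimal sketch of an early check for them follows. The helper names check_columns and load_incidents are hypothetical, and the column names are simply the ones this walkthrough uses, so adjust them to your file.

```python
import pandas as pd

# Assumed column names: the ones used in the later steps of this walkthrough.
REQUIRED_COLUMNS = ["year", "region"]


def check_columns(df: pd.DataFrame, required=REQUIRED_COLUMNS) -> list:
    """Return the required column names that are absent from df."""
    return [col for col in required if col not in df.columns]


def load_incidents(path) -> pd.DataFrame:
    """Read the CSV and fail early if an expected column is missing."""
    # low_memory=False reads the file in one pass, which avoids
    # mixed-dtype warnings on wide files.
    df = pd.read_csv(path, low_memory=False)
    missing = check_columns(df)
    if missing:
        raise ValueError(f"Missing expected columns: {missing}")
    return df
```

Usage would be df = load_incidents("data/gtd.csv"). Some GTD exports name these fields differently (for example iyear and region_txt); if yours does, rename them with df.rename(columns=...) before running the check.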

Step 2: Incidents by year

Incidents per year

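import matplotlib.pyplot as plt

# Count rows per year; assumes one row per incident and a "year" column.
yearly = df.groupby("year").size()

# Plot the yearly counts as a line chart.
plt.figure(figsize=(10, 4))
plt.plot(yearly.index, yearly.values)
plt.title("Incidents by year")
plt.xlabel("Year")
plt.ylabel("Incident count")
plt.tight_layout()
plt.show()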
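
The goal list mentions identifying major trend changes, which the line chart only shows by eye. One possible way to make that explicit is sketched below: a rolling mean to smooth the series and a year-over-year percentage change to flag large jumps. The function names, the 5-year window, and the 50% threshold are illustrative assumptions, not part of the original analysis.

```python
import pandas as pd


def yearly_changes(df: pd.DataFrame, year_col: str = "year", window: int = 5) -> pd.DataFrame:
    """Summarize incident counts per year with a rolling mean and pct change."""
    counts = df.groupby(year_col).size().sort_index()
    out = pd.DataFrame({"count": counts})
    # Rolling mean over the previous `window` rows (fewer at the start).
    out["rolling_mean"] = counts.rolling(window, min_periods=1).mean()
    # Change relative to the previous row. Years with no rows are absent
    # from the index, so a gap in the data is compared across the gap.
    out["pct_change"] = counts.pct_change()
    return out


def flag_large_changes(changes: pd.DataFrame, threshold: float = 0.5) -> pd.DataFrame:
    """Keep the years whose absolute pct change meets the threshold."""
    return changes[changes["pct_change"].abs() >= threshold]
```

Usage would be flag_large_changes(yearly_changes(df)). A flagged year is a prompt to look closer, not a finding on its own: a year with few or no rows may reflect a change in data collection rather than a change in incidents.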

Step 3: Compare regions

Top regions

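import seaborn as sns
import matplotlib.pyplot as plt

# Count incidents per region and keep the ten largest; assumes a "region" column.
regions = df["region"].value_counts().head(10).reset_index()
regions.columns = ["region", "count"]

# Horizontal bars keep long region names readable.
plt.figure(figsize=(10, 4))
sns.barplot(data=regions, x="count", y="region")
plt.title("Top regions by incident count")
plt.xlabel("Incident count")
plt.ylabel("Region")
plt.tight_layout()
plt.show()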
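
Total counts hide how each region changes over time. As a sketch of one way to combine the two views, the helper below builds a year-by-region count table for the most frequent regions. The name region_year_table and the default of five regions are assumptions made for illustration.

```python
import pandas as pd


def region_year_table(
    df: pd.DataFrame,
    year_col: str = "year",
    region_col: str = "region",
    top_n: int = 5,
) -> pd.DataFrame:
    """Return incident counts with years as rows and top regions as columns."""
    # Pick the regions with the most rows overall.
    top = df[region_col].value_counts().head(top_n).index
    subset = df[df[region_col].isin(top)]
    # crosstab counts rows for every (year, region) pair, filling gaps with 0.
    return pd.crosstab(subset[year_col], subset[region_col])
```

Calling region_year_table(df).plot(figsize=(10, 4)) draws one line per region with the pandas plotting wrapper around matplotlib. Limiting the table to a few regions keeps the chart legible.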

Deliverable

Overall trend
Top regions by incidents
Notes about missingness/reporting bias
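
For the missingness notes, a per-column summary is a reasonable starting point. The sketch below is one way to produce it; the function name missingness_report is hypothetical.

```python
import pandas as pd


def missingness_report(df: pd.DataFrame, columns=None) -> pd.DataFrame:
    """Return the count and share of missing values per column."""
    cols = list(columns) if columns is not None else list(df.columns)
    is_missing = df[cols].isna()
    report = pd.DataFrame({
        "missing": is_missing.sum(),  # number of missing cells
        "share": is_missing.mean(),   # fraction of rows that are missing
    })
    # Show the most incomplete columns first.
    return report.sort_values("share", ascending=False)
```

Usage would be print(missingness_report(df, ["year", "region"])). This only measures empty cells. Reporting bias, such as incidents that were never recorded, cannot be read from the table and needs a written note based on the dataset's documentation.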
