Spotify Song Popularity Analysis

Goal

Given Spotify song features, answer:

  • What does the popularity distribution look like?
  • Which audio features correlate with popularity?
  • Do genres differ in average popularity?

Step 1: Load

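Load Spotify data
import pandas as pd

# Read the dataset; the path is relative to the current working directory.
df = pd.read_csv("data/spotify.csv")
print(df.head())  # preview the first five rows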

Step 2: Popularity distribution

Popularity distribution
import seaborn as sns
import matplotlib.pyplot as plt

# Histogram of popularity with a kernel density estimate overlaid.
plt.figure(figsize=(7, 4))
sns.histplot(df["popularity"], bins=30, kde=True)
plt.title("Popularity distribution")
plt.tight_layout()
plt.show()
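
A few summary numbers can back up what the histogram shows. The following is a minimal sketch, not part of the original walkthrough: `summarize_popularity` is a hypothetical helper name, and the choice of statistics (including the share of tracks with a popularity of exactly zero) is an assumption about what is worth reporting.

```python
import pandas as pd


def summarize_popularity(popularity: pd.Series) -> pd.Series:
    """Return a few statistics that complement the histogram.

    Hypothetical helper; the selection of statistics is an assumption.
    """
    s = popularity.dropna()  # ignore missing scores
    return pd.Series(
        {
            "count": s.size,
            "mean": s.mean(),
            "median": s.median(),
            "skew": s.skew(),  # > 0 means a longer right tail
            "share_zero": (s == 0).mean(),  # fraction of tracks scored 0
        }
    )
```

With the data loaded in Step 1, it could be called as `print(summarize_popularity(df["popularity"]))`. A mean well away from the median, or a large `share_zero`, suggests the distribution is skewed or has a spike that the averages alone would hide.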

Step 3: Correlations

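Correlation
import seaborn as sns
import matplotlib.pyplot as plt

# Keep only numeric columns so that corr() sees no text fields.
num = df.select_dtypes(include="number")

# Pairwise Pearson correlations, with the color scale centered on zero.
plt.figure(figsize=(8, 6))
sns.heatmap(num.corr(), cmap="coolwarm", center=0)
plt.title("Correlation heatmap")
plt.tight_layout()
plt.show()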
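
The heatmap shows every pair of columns, but the question here is specifically about popularity. One possible way to turn it into a ranked list is sketched below. `rank_correlations` is a hypothetical helper, and comparing Pearson with Spearman (rank-based) correlation is an assumption about how to probe for outlier effects, not something the original steps prescribe.

```python
import pandas as pd


def rank_correlations(df: pd.DataFrame, target: str = "popularity") -> pd.DataFrame:
    """Rank numeric features by the strength of their correlation with `target`.

    Hypothetical helper. Reports Pearson and Spearman side by side, plus
    the absolute gap between the two.
    """
    num = df.select_dtypes(include="number")
    pearson = num.corr(method="pearson")[target].drop(target)
    spearman = num.corr(method="spearman")[target].drop(target)

    out = pd.DataFrame({"pearson": pearson, "spearman": spearman})
    out["gap"] = (out["pearson"] - out["spearman"]).abs()

    # Strongest Pearson correlation first, regardless of sign.
    order = out["pearson"].abs().sort_values(ascending=False).index
    return out.loc[order]
```

Calling `print(rank_correlations(df).head(10))` would list the ten strongest candidates. A large `gap` hints that a few extreme values or a nonlinear relationship may be shaping the Pearson figure, so those features are worth checking with a scatter plot before drawing conclusions.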

Deliverable

  • The features most strongly correlated with popularity
  • An assessment of whether those correlations are meaningful or driven by outliers
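
The goal also asks whether genres differ in average popularity, which none of the steps above covers. A minimal sketch follows, with several assumptions: the dataset has a genre column named `genre` (the name is a guess, and some exports use a different one), `popularity_by_genre` is a hypothetical helper, and the `min_tracks` cutoff of 30 is an arbitrary choice to keep tiny genres from dominating the ranking.

```python
import pandas as pd


def popularity_by_genre(
    df: pd.DataFrame,
    genre_col: str = "genre",  # assumed column name; adjust to the dataset
    target: str = "popularity",
    min_tracks: int = 30,  # arbitrary cutoff for small genres
) -> pd.DataFrame:
    """Per-genre popularity statistics, sorted by mean (highest first)."""
    stats = df.groupby(genre_col)[target].agg(["count", "mean", "median", "std"])
    stats = stats[stats["count"] >= min_tracks]  # drop genres with few tracks
    return stats.sort_values("mean", ascending=False)
```

For example, `print(popularity_by_genre(df).head(10))` would show the ten genres with the highest mean popularity. Reading `mean` alongside `std` and `count` helps judge whether a difference between two genres is large relative to the spread within each.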
