Handling Missing Data (isna, fillna, dropna)
Why missing data is normal
Missing values happen because of:
- Optional form fields (e.g., phone number)
- Data entry errors
- Failed joins/merges
- Incomplete logs
Pandas typically represents missing values as NaN (and sometimes None).
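As a small sketch of how the two markers behave (the Series names and values here are illustrative, not part of the original example): a None placed in a numeric column is stored as NaN, and isna() flags both None and NaN as missing.

```python
import numpy as np
import pandas as pd

# Illustrative data: None in a numeric column is stored as NaN (float64).
numbers = pd.Series([1, None, 3])

# In a text column the marker may stay None or become NaN depending on
# the pandas version and dtype; isna() treats both as missing.
labels = pd.Series(["a", None, np.nan])

numeric_missing = numbers.isna().tolist()
label_missing = labels.isna().tolist()

print(numbers.dtype)
print(numeric_missing)
print(label_missing)
```

Because the numeric column is upcast to float to hold NaN, integer columns with gaps usually show up as floats after loading.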
Setup example with missing values
Missing values example
import pandas as pd
import numpy as np

df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meera", None],
    "age": [23, np.nan, 26, 31],
    "city": ["Pune", "Delhi", None, "Pune"],
    "score": [88, 91, np.nan, 95],
})
print(df)
Detect missing values
isna() / isnull()
isna
print(df.isna())
Count missing per column
Missing counts
print(df.isna().sum())
Remove missing values: dropna()
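By default, dropna() returns a new DataFrame and removes every row that contains at least one missing value. As a hedged overview of two other options, the sketch below rebuilds the sample data from the setup example and uses the how and thresh parameters of DataFrame.dropna. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as the setup example.
df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meera", None],
    "age": [23, np.nan, 26, 31],
    "city": ["Pune", "Delhi", None, "Pune"],
    "score": [88, 91, np.nan, 95],
})

# how="all" drops a row only when every column in it is missing.
no_empty_rows = df.dropna(how="all")

# thresh=3 keeps rows that have at least 3 non-missing values.
mostly_complete = df.dropna(thresh=3)

# dropna() returns a new DataFrame; df itself is unchanged.
print(len(df), len(no_empty_rows), len(mostly_complete))
```

In this data no row is entirely empty, so how="all" keeps all four rows, while thresh=3 removes only the row that is missing both city and score.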
Drop rows with any missing values
Drop rows with any NA
clean = df.dropna()
print(clean)
Drop rows where a specific column is missing
Drop rows where score is missing
clean = df.dropna(subset=["score"])
print(clean)
Fill missing values: fillna()
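fillna() also accepts a dict that maps column names to fill values, so several columns can be filled in one call. The sketch below assumes the sample data from the setup example. The fill values are illustrative choices, not recommendations.

```python
import numpy as np
import pandas as pd

# Same sample data as the setup example.
df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meera", None],
    "age": [23, np.nan, 26, 31],
    "city": ["Pune", "Delhi", None, "Pune"],
    "score": [88, 91, np.nan, 95],
})

# Each key is a column name; columns not listed (score) are left untouched.
filled = df.fillna({
    "name": "Unknown",
    "city": "Unknown",
    "age": df["age"].mean(),
})

# Count what is still missing after the fill.
remaining = filled.isna().sum()
print(remaining)
```

Leaving score out of the dict keeps its gap visible, which is useful when a column needs a separate decision.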
Fill with a constant
Fill missing city
filled = df.copy()
filled["city"] = filled["city"].fillna("Unknown")
print(filled)
Fill numeric missing values with mean/median
Fill numeric with median
filled = df.copy()
filled["score"] = filled["score"].fillna(filled["score"].median())
print(filled)
Forward fill / backward fill
Useful for time series or repeated categories.
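To illustrate the time-series case, here is a minimal sketch using a hypothetical series of daily readings (the dates and values are invented for illustration). It compares ffill(), which carries the last known value forward, with bfill(), which pulls the next known value backward.

```python
import numpy as np
import pandas as pd

# Hypothetical daily readings with gaps (values invented for illustration).
readings = pd.Series(
    [20.0, np.nan, np.nan, 23.0, np.nan],
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
)

# ffill() carries the last known value forward into each gap.
forward = readings.ffill()

# bfill() pulls the next known value backward; the trailing gap has no
# later value to copy, so it stays missing.
backward = readings.bfill()

print(pd.DataFrame({"raw": readings, "ffill": forward, "bfill": backward}))
```

Both methods depend on row order, so sort the data by time (or by the relevant key) before filling.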
Forward fill
filled = df.copy()
filled["city"] = filled["city"].ffill()
print(filled)
Important: be explicit
Before deciding how to handle missing data, ask:
- Is missingness random or meaningful?
- Should missing values be removed or imputed?
- Will filling change the meaning of the data?
In analytics, documenting missing-data decisions is part of good practice.
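One possible way to make such decisions visible in the data itself is to record which values were missing before imputing them. The sketch below assumes the sample data from the setup example. The score_was_missing column and the report table are hypothetical names chosen for illustration.

```python
import numpy as np
import pandas as pd

# Same sample data as the setup example.
df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meera", None],
    "age": [23, np.nan, 26, 31],
    "city": ["Pune", "Delhi", None, "Pune"],
    "score": [88, 91, np.nan, 95],
})

# Flag the originally missing scores before filling, so the fact that a
# value was not observed is preserved alongside the imputed number.
filled = df.copy()
filled["score_was_missing"] = filled["score"].isna()
filled["score"] = filled["score"].fillna(filled["score"].median())

# A simple record of the decision: missing counts before and after.
report = pd.DataFrame({
    "missing_before": df.isna().sum(),
    "missing_after": filled[df.columns].isna().sum(),
})
print(report)
```

The indicator column lets later analysis check whether rows with imputed scores behave differently from rows with observed scores.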
