Why Handle Missing Values?
Missing values, represented in pandas as NaN (Not a Number), can cause errors in data analysis and skew results such as sums, means, and counts. Pandas provides methods to detect, fill, or remove missing values in a Series.
1. Detecting Missing Values
We can check for missing values using isna() or its alias isnull(). Both return a boolean Series of the same length, with True wherever a value is missing.
Example 1: Detecting Missing Values
import pandas as pd
import numpy as np
# Creating a Series with missing values
data = pd.Series([10, np.nan, 20, None, 30])
# Checking for missing values
print(data.isna()) # OR data.isnull()
Output:
0 False
1 True
2 False
3 True
4 False
dtype: bool
🔹 True indicates a missing value. Note that None is converted to NaN when stored in a numeric Series, so both count as missing.
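Because the result of isna() is a boolean Series, it can also be used to count or locate the missing entries. The following is a minimal sketch using the same example data; the variable names (missing_count, missing_labels, present) are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

data = pd.Series([10, np.nan, 20, None, 30])

# True counts as 1 when summed, so this is the number of missing entries
missing_count = int(data.isna().sum())

# Index labels at which the value is missing
missing_labels = data[data.isna()].index.tolist()

# notna() is the inverse of isna(); use it to keep only the present values
present = data[data.notna()]

print(missing_count)
print(missing_labels)
print(present.tolist())
```

Checking the count first is a cheap way to decide whether filling or dropping is the better choice for a given Series.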
2. Filling Missing Values
We can replace missing values with a specific value using fillna(), or propagate neighboring values with a forward fill or a backward fill.
Example 2: Filling Missing Values with a Fixed Value
import pandas as pd
import numpy as np
# Creating a Series with missing values
data = pd.Series([10, np.nan, 20, None, 30])
# Filling NaN with 0
filled_data = data.fillna(0)
print(filled_data)
Output:
0 10.0
1 0.0
2 20.0
3 0.0
4 30.0
dtype: float64
🔹 fillna(0) returns a new Series in which every NaN is replaced with 0; the original data is left unchanged.
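The fill value does not have to be a constant. As one possible variation (not part of the examples above), the sketch below fills the gaps with the mean of the non-missing values. Whether the mean is an appropriate replacement depends on the data, so treat this as an illustration rather than a recommendation.

```python
import numpy as np
import pandas as pd

data = pd.Series([10, np.nan, 20, None, 30])

# mean() skips NaN by default, so this is the mean of 10, 20 and 30
mean_value = data.mean()

# Replace every missing entry with that mean
mean_filled = data.fillna(mean_value)

print(mean_value)
print(mean_filled.tolist())
```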
Example 3: Forward Fill (Fill with Previous Value)
# Forward fill (fills NaN with the previous value)
# Note: fillna(method='ffill') is deprecated since pandas 2.1; use ffill() instead
forward_filled = data.ffill()
print(forward_filled)
Output:
0 10.0
1 10.0
2 20.0
3 20.0
4 30.0
dtype: float64
🔹 ffill() (forward fill) replaces each NaN with the most recent non-null value before it. A NaN at the start of the Series has no previous value and stays NaN.
Example 4: Backward Fill (Fill with Next Value)
# Backward fill (fills NaN with the next value)
# Note: fillna(method='bfill') is deprecated since pandas 2.1; use bfill() instead
backward_filled = data.bfill()
print(backward_filled)
Output:
0 10.0
1 20.0
2 20.0
3 30.0
4 30.0
dtype: float64
🔹 bfill() (backward fill) replaces each NaN with the next non-null value after it. A NaN at the end of the Series has no next value and stays NaN.
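Forward and backward fill each leave a gap at one edge of the Series when the first or last value is missing. The sketch below uses a different, made-up Series (named edges here for illustration) to show this, and chains the two methods as one possible way to cover both edges.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values at both ends
edges = pd.Series([np.nan, 10, np.nan, 30, np.nan])

forward = edges.ffill()    # the leading NaN has nothing before it
backward = edges.bfill()   # the trailing NaN has nothing after it

# Forward fill first, then backward fill whatever is still missing
both = edges.ffill().bfill()

print(forward.tolist())
print(backward.tolist())
print(both.tolist())
```

The order of chaining matters for interior gaps: ffill().bfill() prefers the previous value, while bfill().ffill() prefers the next one.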
3. Dropping Missing Values
We can remove missing values using dropna().
Example 5: Removing Missing Values
# Dropping NaN values
cleaned_data = data.dropna()
print(cleaned_data)
Output:
0 10.0
2 20.0
4 30.0
dtype: float64
🔹 dropna() returns a new Series without the missing values. The remaining entries keep their original index labels (0, 2, 4 here).
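Since dropna() keeps the original index labels, the result can have gaps in its index. If a continuous 0-based index is wanted, one option is to follow it with reset_index(drop=True). This is a minimal sketch; the name reindexed is chosen here for illustration.

```python
import numpy as np
import pandas as pd

data = pd.Series([10, np.nan, 20, None, 30])

cleaned_data = data.dropna()

# drop=True discards the old labels instead of keeping them as a column
reindexed = cleaned_data.reset_index(drop=True)

print(cleaned_data.index.tolist())
print(reindexed.index.tolist())
```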
Key Takeaways:
✅ isna() (alias: isnull()) detects missing values.
✅ fillna() replaces missing values with a fixed value; ffill() and bfill() fill them from the previous or next valid value.
✅ dropna() removes missing values from the Series.