Pandas

4. Pandas Series: Handling Missing Values

Why Handle Missing Values?

Missing values, which pandas represents as NaN (Not a Number), can cause errors in data analysis and distort results. Pandas provides methods to detect, fill, or remove missing values in a Series.


1. Detecting Missing Values

We can check for missing values using isna() or its alias isnull(). Both return a boolean Series that is True wherever a value is missing.

Example 1: Detecting Missing Values

import pandas as pd  
import numpy as np  

# Creating a Series with missing values  
data = pd.Series([10, np.nan, 20, None, 30])  

# Checking for missing values  
print(data.isna())  # OR data.isnull()

Output:

0    False  
1     True  
2    False  
3     True  
4    False  
dtype: bool  

🔹 True indicates a missing value (NaN). The None at index 3 is also flagged, because pandas converts None to NaN in a numeric Series.
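
Because the result of isna() is a boolean Series, it can also be used to count or locate the missing entries. The following is a minimal sketch on the same sample data; the variable names (missing_count, present, missing_labels) are illustrative only.

```python
import numpy as np
import pandas as pd

data = pd.Series([10, np.nan, 20, None, 30])

# True counts as 1 when summed, so this counts the missing entries
missing_count = int(data.isna().sum())

# notna() is the inverse mask; use it to keep only the non-missing entries
present = data[data.notna()]

# Index labels at which values are missing
missing_labels = data.index[data.isna()].tolist()

print(missing_count)   # 2
print(missing_labels)  # [1, 3]
```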


2. Filling Missing Values

We can replace missing values with a specific value using fillna(), or propagate neighboring values with forward fill (ffill) or backward fill (bfill).

Example 2: Filling Missing Values with a Fixed Value

import pandas as pd  
import numpy as np  

# Creating a Series with missing values  
data = pd.Series([10, np.nan, 20, None, 30])  

# Filling NaN with 0  
filled_data = data.fillna(0)  

print(filled_data)

Output:

0    10.0  
1     0.0  
2    20.0  
3     0.0  
4    30.0  
dtype: float64  

🔹 fillna(0) replaces NaN values with 0.
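
The fixed value does not have to be a constant such as 0. As one possible variation (an assumption about what suits your data, not a general recommendation), the sketch below fills the gaps with the mean of the non-missing values and shows that fillna() returns a new Series rather than changing the original.

```python
import numpy as np
import pandas as pd

data = pd.Series([10, np.nan, 20, None, 30])

# mean() skips NaN by default, so this is the mean of 10, 20 and 30
mean_value = data.mean()

# fillna() returns a new Series; `data` still contains its NaN values
filled_with_mean = data.fillna(mean_value)

print(filled_with_mean.tolist())  # [10.0, 20.0, 20.0, 20.0, 30.0]
print(int(data.isna().sum()))     # 2
```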

Example 3: Forward Fill (Fill with Previous Value)

# Forward fill (fills NaN with previous value), reusing `data` from Example 2
# Note: fillna(method='ffill') is deprecated since pandas 2.1; use ffill()
forward_filled = data.ffill()

print(forward_filled)

Output:

0    10.0  
1    10.0  
2    20.0  
3    20.0  
4    30.0  
dtype: float64  

🔹 ffill (forward fill) replaces NaN with the previous non-null value.

Example 4: Backward Fill (Fill with Next Value)

# Backward fill (fills NaN with next value), reusing `data` from Example 2
# Note: fillna(method='bfill') is deprecated since pandas 2.1; use bfill()
backward_filled = data.bfill()

print(backward_filled)

Output:

0    10.0  
1    20.0  
2    20.0  
3    30.0  
4    30.0  
dtype: float64  

🔹 bfill (backward fill) replaces NaN with the next non-null value.
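
Forward and backward fill can only copy a value that exists on the relevant side, so a NaN at the very start (for forward fill) or at the very end (for backward fill) stays missing. The sketch below uses a made-up Series named gappy to show this, along with the limit parameter, which caps how many consecutive NaN values are filled.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a leading NaN, a two-value gap, and a trailing NaN
gappy = pd.Series([np.nan, 1.0, np.nan, np.nan, 4.0, np.nan])

forward = gappy.ffill()         # index 0 stays NaN: nothing before it
backward = gappy.bfill()        # index 5 stays NaN: nothing after it
limited = gappy.ffill(limit=1)  # fills only the first NaN of each gap
both = gappy.ffill().bfill()    # forward fill, then backward fill the rest

print(forward.tolist())
print(backward.tolist())
print(limited.tolist())
print(both.tolist())  # [1.0, 1.0, 1.0, 1.0, 4.0, 4.0]
```

Chaining ffill() and bfill() is one way to fill every gap, but whether copied values are acceptable depends on the data.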


3. Dropping Missing Values

We can remove missing values using dropna().

Example 5: Removing Missing Values

# Dropping NaN values
cleaned_data = data.dropna()

print(cleaned_data)

Output:

0    10.0  
2    20.0  
4    30.0  
dtype: float64  

🔹 dropna() removes all missing values, keeping only valid data. The remaining values keep their original index labels (0, 2, 4).
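
Since dropna() leaves gaps in the index, it is sometimes convenient to renumber the result. The sketch below shows one way to do that with reset_index(drop=True); the names cleaned and renumbered are illustrative only.

```python
import numpy as np
import pandas as pd

data = pd.Series([10, np.nan, 20, None, 30])

# dropna() returns a new Series; `data` keeps all five entries
cleaned = data.dropna()

# drop=True discards the old labels instead of keeping them as a column
renumbered = cleaned.reset_index(drop=True)

print(cleaned.index.tolist())     # [0, 2, 4]
print(renumbered.index.tolist())  # [0, 1, 2]
print(len(data))                  # 5
```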


Key Takeaways:

isna() detects missing values.
fillna() replaces missing values with a fixed value; ffill() and bfill() fill them from the previous or next valid value.
dropna() removes missing values from the Series.
