Chapter 7: Bias–Variance Tradeoff
One of the most important, and most frequently misunderstood, concepts in Machine Learning is the Bias–Variance Tradeoff.
It largely determines whether your model generalizes well to new, unseen data or performs poorly on it.
In this chapter, we explore Underfitting, Overfitting, Bias, Variance, and Regularization techniques.
1. What is the Bias–Variance Tradeoff?
When building ML models, two types of errors affect performance:
- Bias – Error due to overly simple assumptions
- Variance – Error due to too much sensitivity to training data
A good machine learning model should maintain a balance between bias and variance.
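This balance can also be made quantitative: expected test error decomposes into bias² + variance + irreducible noise. A minimal Monte Carlo sketch (assumed setup: quadratic ground truth with Gaussian noise, a straight-line model) estimates the first two terms at a single test point:

```python
import numpy as np

rng = np.random.default_rng(0)
true_f = lambda x: x ** 2   # assumed ground truth
x_test = 3.0

# Refit a straight line on many fresh training sets and record its prediction
preds = []
for _ in range(500):
    X = rng.uniform(0, 5, 30)
    y = true_f(X) + rng.normal(0.0, 1.0, 30)
    slope, intercept = np.polyfit(X, y, 1)   # degree-1 fit = linear model
    preds.append(slope * x_test + intercept)

preds = np.array(preds)
bias_sq = (preds.mean() - true_f(x_test)) ** 2  # systematic error
variance = preds.var()                          # sensitivity to the sample
print(f"bias^2 ~ {bias_sq:.2f}, variance ~ {variance:.2f}")
```

For this simple model the squared bias dominates: the straight line is systematically wrong at x = 3 no matter which training set it sees.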
2. Underfitting (High Bias)
Underfitting happens when the model is too simple to capture the patterns in the data.
✔ Symptoms
- Poor accuracy on training data
- Poor accuracy on test data
- Model is too simple
✔ Causes
- Too few features
- Too much regularization
- Using too simple a model (e.g., Linear Regression on non-linear data)
✔ Example
from sklearn.linear_model import LinearRegression
import numpy as np
X = np.array([[1],[2],[3],[4],[5]])
y = np.array([1,4,9,16,25]) # quadratic relationship
model = LinearRegression()
model.fit(X, y)
print(model.predict([[6]])) # prints ≈ [29.], but the true value is 36 → underfitting
A straight line cannot learn a curved pattern → high bias.
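One way to fix this particular case is to raise model capacity: adding a squared feature lets the same linear model fit the curve exactly. A sketch reusing the data above:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[1], [2], [3], [4], [5]])
y = np.array([1, 4, 9, 16, 25])  # y = x**2

poly = PolynomialFeatures(degree=2)  # adds an x**2 column
model = LinearRegression().fit(poly.fit_transform(X), y)
pred = model.predict(poly.transform([[6]]))
print(pred)  # ≈ [36.]
```

The model is still linear in its parameters; only the features changed, yet the bias disappears because the hypothesis space now contains the true function.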
3. Overfitting (High Variance)
Overfitting occurs when the model memorizes the training data instead of learning general patterns.
✔ Symptoms
- High accuracy on training data
- Low accuracy on test data
- Model is too complex
✔ Causes
- Too many features
- Deep decision trees
- Small training set
- Noise in data
✔ Example
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
# small, noisy dataset: easy for a deep tree to memorize
X, y = make_classification(n_samples=100, n_features=20, flip_y=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = DecisionTreeClassifier(max_depth=None) # no depth limit → overfits
model.fit(X_train, y_train)
print(model.score(X_train, y_train)) # very high (near 1.0)
print(model.score(X_test, y_test)) # much lower
The model memorizes every detail of the training data: high variance.
4. Visual Understanding
✔ High Bias (Underfitting)
- Model is too simple
- High training error
- High test error
✔ High Variance (Overfitting)
- Model is too complex
- Low training error
- High test error
✔ Balanced Model
- Moderate complexity
- Low training error
- Low test error
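These three regimes can be observed directly by sweeping model complexity and comparing train vs. test scores. A sketch using polynomial degree as the complexity knob, on an assumed noisy sine dataset:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (60, 1))
y = np.sin(2 * np.pi * X[:, 0]) + rng.normal(0, 0.3, 60)  # noisy sine
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scores = {}
for degree in (1, 4, 15):  # too simple, balanced, too complex
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression()).fit(X_tr, y_tr)
    scores[degree] = (model.score(X_tr, y_tr), model.score(X_te, y_te))
    print(degree, scores[degree])
```

Degree 1 scores poorly on both splits (high bias); degree 15 scores best on the training split while its test score tends to fall behind (high variance); a moderate degree sits in between.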
5. Techniques to Reduce Bias (Underfitting)
- Add more features
- Use more complex models (Random Forest, SVM)
- Reduce regularization strength
- Train longer (for neural networks)
6. Techniques to Reduce Variance (Overfitting)
- Collect more data
- Use cross-validation
- Simplify the model
- Use Regularization (L1, L2)
- Use Dropout (for neural networks)
- Pruning in Decision Trees
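Cross-validation from the list above can expose overfitting in a few lines: compare the accuracy on the data the model was fit on against the mean held-out-fold accuracy. A sketch on assumed synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# hypothetical noisy dataset
X, y = make_classification(n_samples=200, n_features=20, flip_y=0.2, random_state=0)

deep = DecisionTreeClassifier(random_state=0).fit(X, y)
train_acc = deep.score(X, y)                       # accuracy on the data it was fit on
cv_acc = cross_val_score(deep, X, y, cv=5).mean()  # mean held-out accuracy
print(train_acc, cv_acc)  # a large gap signals overfitting
```

Note that `cross_val_score` refits a fresh clone of the model on each fold, so `cv_acc` is an honest estimate of generalization even though `deep` was already fitted.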
7. Regularization Techniques
Regularization helps reduce overfitting by penalizing large weights in the model.
✔ L2 Regularization (Ridge)
Adds a penalty proportional to the square of the coefficients.
from sklearn.linear_model import Ridge
model = Ridge(alpha=1.0)  # alpha controls penalty strength
model.fit(X_train, y_train)  # assumes X_train, y_train from an earlier split
✔ L1 Regularization (Lasso)
Adds a penalty proportional to the absolute value of the coefficients.
It can shrink some weights to zero → feature selection.
from sklearn.linear_model import Lasso
model = Lasso(alpha=0.1)
model.fit(X_train, y_train)  # assumes X_train, y_train from an earlier split
✔ Elastic Net (L1 + L2)
from sklearn.linear_model import ElasticNet
model = ElasticNet(alpha=0.1, l1_ratio=0.5)  # l1_ratio blends the L1 and L2 penalties
model.fit(X_train, y_train)  # assumes X_train, y_train from an earlier split
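The feature-selection effect of L1 is easy to verify: on data where only a few features matter, Lasso shrinks the rest exactly to zero. A sketch with an assumed synthetic regression setup:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# 20 features, but only 3 actually influence y (assumed synthetic setup)
X, y = make_regression(n_samples=100, n_features=20, n_informative=3,
                       noise=1.0, random_state=0)

model = Lasso(alpha=1.0).fit(X, y)
n_zero = int(np.sum(model.coef_ == 0.0))
print(n_zero, "of 20 coefficients shrunk exactly to zero")
```

Ridge, by contrast, shrinks coefficients toward zero but almost never exactly to zero, which is why Lasso is the one described as performing feature selection.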
8. Practical Tips
- Use cross-validation to detect overfitting
- Plot learning curves to visualize bias/variance
- Try simpler models before complex ones
- Use regularization with linear models
- Limit depth of decision trees
- Use ensemble models (Random Forest) for stability
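Two of these tips, limiting tree depth and watching the train/test gap, combine into a short sketch on assumed synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=100, n_features=20, flip_y=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

scores = {}
for depth in (None, 3):  # unlimited vs. capped depth
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_train, y_train)
    scores[depth] = (tree.score(X_train, y_train), tree.score(X_test, y_test))
    print(depth, scores[depth])
```

The unlimited tree reaches near-perfect training accuracy; the capped tree gives up some training accuracy in exchange for a smaller train/test gap.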
9. Summary Table
| Problem Type | Bias | Variance | Solution |
|---|---|---|---|
| Underfitting | High | Low | Increase model complexity |
| Overfitting | Low | High | Apply regularization / simplify model |
| Balanced | Low | Low | Ideal model |
Conclusion
The Bias–Variance Tradeoff is central to building high-quality machine learning models.
Understanding how model complexity influences bias and variance helps you create models that generalize well.
Assignments
Assignment 1 – Explain Bias vs Variance
Define in your own words what “bias” and “variance” mean in ML context. Give one example each (for bias and for variance) using simple hypothetical data.
Hint: Bias → oversimplified model assumptions → underfitting. Variance → model too sensitive to data → overfitting.
Assignment 2 – Underfitting Example
Create or imagine a dataset where target depends on a non-linear relationship. Then pick a linear model (e.g. simple linear regression) — explain why this model will underfit, and what consequences for train & test performance would be.
Hint: Underfitting arises when model is too simple to capture true patterns → high error on both train & test.
Assignment 3 – Overfitting Example
Design a situation (realistic or hypothetical) where a model might overfit. Describe the data characteristics, model complexity, and what you expect train vs test performance to look like.
Hint: Overfitting often occurs with complex models + small or noisy data, leading to very good training performance but poor generalization.
Assignment 4 – Visualize Bias–Variance Tradeoff
Draw (on paper or using a plotting tool) a “bias vs variance” diagram or graph — show zones of underfitting, optimal, and overfitting. Explain in your own words what each zone represents.
Hint: Use the classic U-shaped error curve (error vs model complexity) and indicate where bias dominates (underfitting) vs variance dominates (overfitting).
Assignment 5 – Regularization Techniques Write-up
List and describe at least two regularization techniques used to reduce overfitting. Explain in which scenarios each would be useful.
Hint: Consider methods like L1 (Lasso), L2 (Ridge), early-stopping, model complexity reduction.
Assignment 6 – Impact of Regularization on Model Complexity
Imagine you have a regression model with many features. Explain how applying regularization might change the model’s coefficients and its generalization ability.
Hint: Regularization adds penalty to complexity → may shrink coefficient magnitudes, avoid overfitting, improve test performance while possibly increasing training error.
Assignment 7 – Compare Two Models: With vs Without Regularization
Take any dataset (real or hypothetical). Describe two modelling scenarios: (a) without regularization, (b) with regularization. Predict and explain how model performance (train & test) might differ.
Hint: Without regularization → risk of overfitting (low train error, high test error). With regularization → possibly higher train error but better generalization.
Assignment 8 – When to Accept Bias vs Variance Tradeoff
Suppose you have a critical ML application (like disease diagnosis). Would you prefer a model with slightly higher bias or slightly higher variance? Explain your reasoning.
Hint: Think about costs of false positives vs false negatives, and generalization risk on unseen/new data. Balance generalization vs complexity carefully.
Assignment 9 – Effect of Data Size on Overfitting/Underfitting
Write a short essay: how does the size of training data influence whether a model underfits or overfits? What changes when you increase the amount of data vs keeping it small?
Hint: More data reduces risk of overfitting by reducing variance and giving better representation; small datasets amplify noise → risk of overfitting or underfitting depending on model complexity.
Assignment 10 – Real-World Case Study: Overfitting Consequences
Think of a real-world application (spam detection, stock price prediction, medical diagnosis, etc.). Describe how overfitting might mislead results if not properly handled, and suggest steps to avoid it.
Hint: Overfitting may cause model to memorize noise or anomalies — leading to wrong predictions on new data. Use regularization, cross-validation, simpler models or more data to avoid this.
