Chapter 3 – Plot Types and Data Representation in Matplotlib
Welcome to Chapter 3 of our Matplotlib tutorial series. By now, you understand how figures and axes work and how to use subplots effectively. In this chapter, we’ll focus on the heart of data visualization — the different types of plots Matplotlib provides and how to represent various data patterns using them.
Matplotlib supports dozens of chart types, but beginners should master the most commonly used ones first. These include:
- Line Plot
- Scatter Plot
- Bar and Barh (Horizontal Bar) Plot
- Histogram
- Pie Chart
- Box Plot
- Area Plot
Each plot serves a specific purpose and tells a story about your data. Let’s explore them one by one with practical examples and customization options.
✅ 1. Line Plot
The line plot is the most fundamental and frequently used type of chart. It is ideal for visualizing trends, continuous data, or time series — for example, sales across months or temperature over time.
Creating a simple line plot is straightforward:
import matplotlib.pyplot as plt
x = [1, 2, 3, 4, 5]
y = [10, 15, 20, 25, 30]
plt.plot(x, y, color='blue', marker='o')
plt.title("Simple Line Plot")
plt.xlabel("X-Axis")
plt.ylabel("Y-Axis")
plt.show()
Here’s what’s happening:
- marker='o' adds circle markers at each data point.
- color='blue' sets the line color.
- plt.title(), plt.xlabel(), and plt.ylabel() give context to the graph.
You can add multiple lines in one plot to compare datasets:
x = [1, 2, 3, 4, 5]
plt.plot(x, [10,20,25,30,35], label='Product A')
plt.plot(x, [8,18,22,28,32], label='Product B')
plt.legend()
plt.show()
When comparing time-based data or trends, keep each series' color consistent from one plot to the next and include a legend so readers can tell the lines apart.
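One possible way to keep colors consistent is to store them in a dictionary keyed by series name and look them up each time a line is drawn. The sketch below reuses the Product A and Product B values from above; the palette dictionary, the chosen colors, and the axis labels are assumptions made for illustration, not part of the original example.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
series = {
    "Product A": [10, 20, 25, 30, 35],
    "Product B": [8, 18, 22, 28, 32],
}

# Hypothetical color assignment: one fixed color per series name,
# so the same product gets the same color in every chart that uses it.
palette = {"Product A": "tab:blue", "Product B": "tab:orange"}

fig, ax = plt.subplots()
for name, values in series.items():
    ax.plot(x, values, color=palette[name], marker="o", label=name)

ax.set_xlabel("X-Axis")   # assumed label
ax.set_ylabel("Y-Axis")   # assumed label
ax.legend()
```

Call plt.show() afterwards to display the figure. Reusing the same palette dictionary in later charts means a reader never has to relearn which color belongs to which product.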
✅ 2. Scatter Plot
A scatter plot shows the relationship between two variables by displaying points on a Cartesian plane. It’s great for identifying patterns, clusters, or correlations in data.
import matplotlib.pyplot as plt
x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,100,86,103,87,94,78,77,85,86]
plt.scatter(x, y, color='purple')
plt.title("Scatter Plot Example")
plt.xlabel("X-Values")
plt.ylabel("Y-Values")
plt.show()
Each point represents a single observation. The pattern of dots can indicate correlations — positive, negative, or no correlation.
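A visual impression of correlation can be backed up with a number. As a rough sketch, NumPy's corrcoef function computes the Pearson correlation coefficient for the same x and y lists; using NumPy here is an addition to the original example, which only plots the points.

```python
import numpy as np

x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6]
y = [99, 86, 87, 88, 100, 86, 103, 87, 94, 78, 77, 85, 86]

# np.corrcoef returns a 2x2 correlation matrix;
# the off-diagonal entry is the Pearson coefficient between x and y.
r = np.corrcoef(x, y)[0, 1]
print(f"Pearson r = {r:.2f}")
```

A value near +1 suggests a positive correlation, a value near -1 a negative one, and a value near 0 little linear relationship. For this data the coefficient is negative, which matches the downward drift of the points in the plot.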
You can also vary the size and color of points to add extra dimensions of information:
sizes = [20,50,100,200,500,1000,60,80,120,300,400,500,600]
colors = [10,20,30,40,50,60,70,80,90,100,110,120,130]
plt.scatter(x, y, s=sizes, c=colors, cmap='viridis', alpha=0.6, edgecolors='black')
plt.colorbar(label='Value')
plt.show()
Here, the size (s) and color (c) add depth — useful when dealing with multi-variable data. The alpha parameter controls transparency, making overlapping points visible.
✅ 3. Bar Plot
Bar charts are used to compare quantities among categories — like sales per region or students per class. Bars can be vertical or horizontal.
Vertical Bars:
categories = ['A', 'B', 'C', 'D', 'E']
values = [10, 24, 36, 40, 5]
plt.bar(categories, values, color='orange')
plt.title("Vertical Bar Chart")
plt.xlabel("Category")
plt.ylabel("Value")
plt.show()
Horizontal Bars:
plt.barh(categories, values, color='teal')
plt.title("Horizontal Bar Chart")
plt.xlabel("Value")
plt.ylabel("Category")
plt.show()
Bar charts make category comparison intuitive. To highlight data, you can use different colors or annotate values on top of bars:
bars = plt.bar(categories, values, color='skyblue')
for bar in bars:
    yval = bar.get_height()
    plt.text(bar.get_x() + bar.get_width()/2, yval + 1, yval, ha='center')
plt.show()
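If you are on Matplotlib 3.4 or newer, the same annotation can likely be written more compactly with the Axes.bar_label method, which positions one label per bar for you. The sketch below is an alternative to the manual loop above, not a replacement for it; the padding value is an arbitrary choice.

```python
import matplotlib.pyplot as plt

categories = ['A', 'B', 'C', 'D', 'E']
values = [10, 24, 36, 40, 5]

fig, ax = plt.subplots()
bars = ax.bar(categories, values, color='skyblue')

# bar_label (available from Matplotlib 3.4) writes each bar's value above it.
# padding is the gap in points between the bar and its label (assumed value).
labels = ax.bar_label(bars, padding=3)
```

Call plt.show() afterwards to display the chart. The manual loop remains useful when you need full control over label position or are working with an older Matplotlib release.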
✅ 4. Histogram
A histogram displays the distribution of numerical data by dividing it into bins. It helps reveal patterns like skewness, normal distribution, or outliers.
import numpy as np
data = np.random.randn(1000)
plt.hist(data, bins=30, color='steelblue', edgecolor='black')
plt.title("Histogram Example")
plt.xlabel("Value Range")
plt.ylabel("Frequency")
plt.show()
Each bar’s height represents how many data points fall within that bin’s range. Use bins to control the granularity — fewer bins simplify the pattern, while more bins reveal finer details.
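To see the effect of the bin count, you can draw the same data twice with different values of bins. This is a minimal sketch: the fixed seed, the bin counts of 10 and 50, and the side-by-side layout are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Fixed seed so the figure is reproducible (an assumption for this sketch)
rng = np.random.default_rng(0)
data = rng.standard_normal(1000)

fig, (ax_coarse, ax_fine) = plt.subplots(1, 2, figsize=(10, 4))

# hist returns the per-bin counts, the bin edges, and the drawn bar patches
counts_coarse, edges_coarse, _ = ax_coarse.hist(
    data, bins=10, color='steelblue', edgecolor='black')
ax_coarse.set_title("10 bins")

counts_fine, edges_fine, _ = ax_fine.hist(
    data, bins=50, color='steelblue', edgecolor='black')
ax_fine.set_title("50 bins")

# Every sample lands in exactly one bin, so both totals equal len(data)
print(int(counts_coarse.sum()), int(counts_fine.sum()))
```

Call plt.show() afterwards to display both panels. Note that n bins always produce n + 1 bin edges, and the counts sum to the number of samples regardless of how finely the range is divided.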
Histograms are especially valuable in data analysis and machine learning for understanding how features are distributed before modeling.
✅ 5. Pie Chart
Pie charts display parts of a whole. They are circular charts divided into slices, where each slice represents a category’s percentage contribution.
labels = ['Apples', 'Bananas', 'Cherries', 'Dates']
sizes = [30, 25, 2]()
