Matplotlib Tutorial – Chapter 3: Plot Types and Data Representation

Welcome to Chapter 3 of our Matplotlib tutorial series. By now, you understand how figures and axes work and how to use subplots effectively. In this chapter, we’ll focus on the heart of data visualization — the different types of plots Matplotlib provides and how to represent various data patterns using them.

Matplotlib supports dozens of chart types, but beginners should master the most commonly used ones first. These include:

  • Line Plot
  • Scatter Plot
  • Bar and Barh (Horizontal Bar) Plot
  • Histogram
  • Pie Chart
  • Box Plot
  • Area Plot

Each plot serves a specific purpose and tells a story about your data. Let’s explore them one by one with practical examples and customization options.

✅ 1. Line Plot

The line plot is the most fundamental and frequently used type of chart. It is ideal for visualizing trends, continuous data, or time series — for example, sales across months or temperature over time.

Creating a simple line plot is straightforward:

import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [10, 15, 20, 25, 30]

plt.plot(x, y, color='blue', marker='o')
plt.title("Simple Line Plot")
plt.xlabel("X-Axis")
plt.ylabel("Y-Axis")
plt.show()

Here’s what’s happening:

  • marker='o' adds circle markers at each data point.
  • color='blue' sets the line color.
  • plt.title(), plt.xlabel(), and plt.ylabel() give context to the graph.

You can add multiple lines in one plot to compare datasets:

x = [1, 2, 3, 4, 5]
plt.plot(x, [10,20,25,30,35], label='Product A')
plt.plot(x, [8,18,22,28,32], label='Product B')
plt.legend()
plt.show()

When comparing time-based data or trends across several charts, give each series the same color every time it appears and label it in the legend, so readers never have to re-learn which line is which.
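One way to keep colors consistent is to define them once and look them up wherever a series is drawn. The sketch below is only an illustration: the SERIES_COLORS mapping, the "returns" numbers, and the output file name are hypothetical and not part of the original example. It selects the non-interactive Agg backend so it also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless-safe backend; remove this line to open a window
import matplotlib.pyplot as plt

# Hypothetical fixed color per series, reused in every chart
SERIES_COLORS = {"Product A": "tab:blue", "Product B": "tab:orange"}

x = [1, 2, 3, 4, 5]
sales = {"Product A": [10, 20, 25, 30, 35], "Product B": [8, 18, 22, 28, 32]}
# Hypothetical second metric, made up purely for illustration
returns = {"Product A": [1, 2, 2, 3, 3], "Product B": [2, 2, 3, 3, 4]}

fig, (ax_sales, ax_returns) = plt.subplots(1, 2, figsize=(9, 3.5))
for name, color in SERIES_COLORS.items():
    # The same color lookup is used on both axes
    ax_sales.plot(x, sales[name], color=color, marker="o", label=name)
    ax_returns.plot(x, returns[name], color=color, marker="o", label=name)

ax_sales.set_title("Sales")
ax_returns.set_title("Returns")
ax_sales.legend()  # one legend is enough because the colors match
fig.savefig("consistent_colors.png")
```

Because both panels read from the same mapping, changing a color in one place updates every chart.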

✅ 2. Scatter Plot

A scatter plot shows the relationship between two variables by displaying points on a Cartesian plane. It’s great for identifying patterns, clusters, or correlations in data.

import matplotlib.pyplot as plt

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,100,86,103,87,94,78,77,85,86]

plt.scatter(x, y, color='purple')
plt.title("Scatter Plot Example")
plt.xlabel("X-Values")
plt.ylabel("Y-Values")
plt.show()

Each point represents a single observation. The pattern of dots can indicate correlations — positive, negative, or no correlation.
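To back up the visual impression with a number, you can compute the Pearson correlation coefficient for the same data. This is a minimal sketch using NumPy's corrcoef; the 0.3 cut-off used to call a correlation "weak" is an arbitrary assumption for illustration, not a standard threshold.

```python
import numpy as np

x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6]
y = [99, 86, 87, 88, 100, 86, 103, 87, 94, 78, 77, 85, 86]

# np.corrcoef returns a 2x2 correlation matrix; the off-diagonal entry is r
r = np.corrcoef(x, y)[0, 1]

# Assumed, arbitrary cut-off for describing the relationship in words
if abs(r) < 0.3:
    direction = "little or no linear correlation"
elif r > 0:
    direction = "positive correlation"
else:
    direction = "negative correlation"

print(f"Pearson r = {r:.2f} ({direction})")
```

A value near +1 or -1 means the points lie close to a straight line, while a value near 0 means there is no clear linear trend.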

You can also vary the size and color of points to add extra dimensions of information:

sizes = [20,50,100,200,500,1000,60,80,120,300,400,500,600]
colors = [10,20,30,40,50,60,70,80,90,100,110,120,130]

plt.scatter(x, y, s=sizes, c=colors, cmap='viridis', alpha=0.6, edgecolors='black')
plt.colorbar(label='Value')
plt.show()

Here, the size (s) and color (c) add depth — useful when dealing with multi-variable data. The alpha parameter controls transparency, making overlapping points visible.
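Note that s is interpreted as the marker area in points squared, not as a radius, so raw data values rarely make good sizes directly. The helper below is a hypothetical sketch (scale_to_marker_area and the population values are invented for illustration) that linearly rescales any variable into a readable range of areas.

```python
import numpy as np

def scale_to_marker_area(values, min_area=20.0, max_area=600.0):
    """Linearly map raw values onto a range of marker areas (points**2)."""
    values = np.asarray(values, dtype=float)
    span = values.max() - values.min()
    if span == 0:
        # All values are equal: use one mid-sized marker for every point
        return np.full(values.shape, (min_area + max_area) / 2)
    return min_area + (values - values.min()) / span * (max_area - min_area)

# Hypothetical third variable, made up for illustration
population = [1200, 5400, 800, 15000, 9800]
areas = scale_to_marker_area(population)
print(areas)
```

The resulting array can be passed straight to plt.scatter(x, y, s=areas), provided it has one entry per point.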

✅ 3. Bar Plot

Bar charts are used to compare quantities among categories — like sales per region or students per class. Bars can be vertical or horizontal.

Vertical Bars:

categories = ['A', 'B', 'C', 'D', 'E']
values = [10, 24, 36, 40, 5]

plt.bar(categories, values, color='orange')
plt.title("Vertical Bar Chart")
plt.xlabel("Category")
plt.ylabel("Value")
plt.show()

Horizontal Bars:

plt.barh(categories, values, color='teal')
plt.title("Horizontal Bar Chart")
plt.xlabel("Value")
plt.ylabel("Category")
plt.show()

Bar charts make category comparison intuitive. To highlight data, you can use different colors or annotate values on top of bars:

bars = plt.bar(categories, values, color='skyblue')
for bar in bars:
    yval = bar.get_height()
    # Center the label horizontally over the bar, 1 unit above its top
    plt.text(bar.get_x() + bar.get_width()/2, yval + 1, str(yval), ha='center')
plt.show()
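Newer Matplotlib releases (3.4 and later) also provide a bar_label helper that places these labels for you. The following is a sketch assuming such a version is installed; the padding value, the extra headroom on the y-axis, and the output file name are illustrative choices. It uses the Agg backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless-safe backend; remove this line to open a window
import matplotlib.pyplot as plt

categories = ['A', 'B', 'C', 'D', 'E']
values = [10, 24, 36, 40, 5]

fig, ax = plt.subplots()
bars = ax.bar(categories, values, color='skyblue')

# bar_label annotates each bar with its height; padding is in points
labels = ax.bar_label(bars, padding=3)

# Leave some headroom so the tallest label is not clipped
ax.set_ylim(0, max(values) * 1.15)
fig.savefig("bar_labels.png")
```

The manual loop remains useful when you need full control over each label's position or text.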

✅ 4. Histogram

A histogram displays the distribution of numerical data by dividing it into bins. It helps reveal patterns like skewness, normal distribution, or outliers.

import numpy as np
data = np.random.randn(1000)

plt.hist(data, bins=30, color='steelblue', edgecolor='black')
plt.title("Histogram Example")
plt.xlabel("Value Range")
plt.ylabel("Frequency")
plt.show()

Each bar’s height represents how many data points fall within that bin’s range. Use bins to control the granularity — fewer bins simplify the pattern, while more bins reveal finer details.
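To see the effect of the bin count numerically, you can bin the same data with NumPy's histogram function, which plt.hist uses internally. This is a small sketch; the seed and the three bin counts are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded so repeated runs give the same data
data = rng.standard_normal(1000)

for bins in (5, 30, 100):
    # counts[i] is the number of points between edges[i] and edges[i + 1]
    counts, edges = np.histogram(data, bins=bins)
    width = edges[1] - edges[0]
    print(f"bins={bins:>3}  bin width={width:.3f}  "
          f"tallest bin={counts.max()}  total={counts.sum()}")
```

The total stays at 1000 in every case. Only the bin width changes, and with it how many points land in each bar.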

Histograms are especially valuable in data analysis and machine learning for understanding how features are distributed before modeling.

✅ 5. Pie Chart

Pie charts display parts of a whole. They are circular charts divided into slices, where each slice represents a category’s percentage contribution.

labels = ['Apples', 'Bananas', 'Cherries', 'Dates']
sizes = [30, 25, 2]()
