4.1 Data Manipulation with Pandas – Sorting Data

📊 Data Manipulation with Pandas: Sorting Data

🔍 Introduction

Pandas is a Python library for data manipulation and analysis, and sorting is one of the most common operations when working with it. Ordering rows, whether numerically or alphabetically, makes a dataset easier to read and analyze. Pandas provides two methods for this: sort_values() sorts by the values in one or more columns, and sort_index() sorts by index labels. Both support ascending and descending order, and both accept an na_position argument that controls whether missing values are placed first or last.
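As a quick illustration of the difference between the two methods, here is a minimal sketch. The scores DataFrame and its letter index labels are made up for this illustration and are not part of the examples that follow.

```python
import pandas as pd

# Hypothetical data: the index labels are deliberately out of order
scores = pd.DataFrame(
    {'Score': [85, 90, 78]},
    index=['b', 'a', 'c'],
)

# sort_values() orders rows by the values in a column
by_value = scores.sort_values(by='Score')

# sort_index() orders rows by their index labels instead
by_index = scores.sort_index()

print(by_value)
print(by_index)
```

The same rows come back in a different order each time: by_value follows the Score column, while by_index follows the labels a, b, c.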

In this tutorial, we will explore how to sort data in Pandas with two practical examples:

  1. 📌 Sorting a DataFrame based on a single column.
  2. 📌 Sorting a DataFrame based on multiple columns.

📌 Example 1: Sorting a DataFrame by a Single Column

Let’s consider a simple dataset of students and their scores:

import pandas as pd

# Creating a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Score': [85, 90, 78, 92]}
df = pd.DataFrame(data)

# Sorting by Score in ascending order
df_sorted = df.sort_values(by='Score')
print(df_sorted)

✅ Output:

     Name  Score
2  Charlie     78
0    Alice     85
1      Bob     90
3    David     92

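Here, the sort_values() method sorts the DataFrame by the ‘Score’ column. Ascending order is the default; pass ascending=False to sort in descending order. The method returns a new DataFrame and leaves df unchanged, and each row keeps its original index label, which is why the index column reads 2, 0, 1, 3.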
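The descending variant can be sketched as follows, reusing the same student data. The variable name df_desc is chosen only for this illustration.

```python
import pandas as pd

# Same student data as in Example 1
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'],
        'Score': [85, 90, 78, 92]}
df = pd.DataFrame(data)

# ascending=False puts the highest score first
df_desc = df.sort_values(by='Score', ascending=False)
print(df_desc)

# sort_values() returns a new DataFrame; df keeps its original row order
print(df)
```

If you would rather have the sorted rows renumbered from 0, calling reset_index(drop=True) on the result is one way to do it.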

📌 Example 2: Sorting by Multiple Columns

Now, let’s sort the same DataFrame by two columns: Score (descending) and Name (ascending).

# Sorting by Score (descending) and Name (ascending)
df_sorted = df.sort_values(by=['Score', 'Name'], ascending=[False, True])
print(df_sorted)

✅ Output:

      Name  Score
3    David     92
1      Bob     90
0    Alice     85
2  Charlie     78

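Here, the DataFrame is first sorted by the ‘Score’ column in descending order. Rows with the same score would then be ordered alphabetically by ‘Name’. In this dataset every score is unique, so the ‘Name’ key never comes into play and the result matches a plain descending sort by ‘Score’.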
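To see the second sort key actually break a tie, here is a small sketch with repeated scores. The tied DataFrame and the names in it are hypothetical and differ from the dataset used above.

```python
import pandas as pd

# Hypothetical data with tied scores (not the dataset used above)
tied = pd.DataFrame({
    'Name': ['Dana', 'Bob', 'Alice', 'Carl'],
    'Score': [90, 85, 90, 85],
})

# Score descending first; ties on Score are broken by Name ascending
tied_sorted = tied.sort_values(by=['Score', 'Name'], ascending=[False, True])
print(tied_sorted)
```

Within the group of students scoring 90, Alice comes before Dana; within the group scoring 85, Bob comes before Carl.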

📌 Summary

🔹 Sorting is a fundamental data manipulation technique that enhances data readability and usability.
🔹 Pandas provides the sort_values() method to sort data by one or multiple columns efficiently.
🔹 You can control the sorting order and manage missing values as needed.
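Missing values are mentioned above but not demonstrated, so here is a minimal sketch of the na_position argument of sort_values(). The df_missing DataFrame is made up for this illustration.

```python
import pandas as pd

# Hypothetical data: Bob's score is missing (None is stored as NaN)
df_missing = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Score': [85, None, 78],
})

# Default (na_position='last'): rows with a missing sort key go to the end
nan_last = df_missing.sort_values(by='Score')
print(nan_last)

# na_position='first' moves those rows to the top instead
nan_first = df_missing.sort_values(by='Score', na_position='first')
print(nan_first)
```

In both cases the non-missing scores stay in ascending order; only the position of the row with the missing score changes.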

Mastering sorting techniques in Pandas will significantly improve your data analysis workflow, making it easier to extract insights and trends from structured data. 🚀

 
