📖 Introduction
Pandas is a powerful Python library for data analysis and manipulation. When working with DataFrames, adding and removing columns is a common task. This tutorial will guide you through various ways to add new columns and remove existing ones using Pandas.
🗂️ 1. Creating a Sample DataFrame
Before we dive into adding and removing columns, let’s create a sample DataFrame to work with.
import pandas as pd
# Creating a sample DataFrame
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
    'Age': [25, 30, 35, 40, 28],
    'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix']
}
df = pd.DataFrame(data)
print(df)
✅ Output:
      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
3    David   40      Houston
4      Eve   28      Phoenix
Now, let’s explore how to add and remove columns in this DataFrame.
➕ 2. Adding Columns
There are multiple ways to add new columns to a Pandas DataFrame.
📌 Adding a Column with a Fixed Value
You can add a new column by assigning a constant value to it.
# Adding a new column with a fixed value
df['Country'] = 'USA'
print(df)
✅ Output:
      Name  Age         City  Country
0    Alice   25     New York      USA
1      Bob   30  Los Angeles      USA
2  Charlie   35      Chicago      USA
3    David   40      Houston      USA
4      Eve   28      Phoenix      USA
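To see what the assignment does under the hood, here is a minimal sketch on a smaller stand-in DataFrame. It contrasts a scalar, which is repeated for every row, with a list, which must supply exactly one value per row. The Score and Bad columns are hypothetical and exist only for illustration.

```python
import pandas as pd

# Small stand-in for the tutorial's DataFrame
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie']})

# A scalar is broadcast: every row receives the same value
df['Country'] = 'USA'

# A list must contain exactly one value per row
df['Score'] = [90, 85, 70]  # hypothetical column, for illustration only

# A list of the wrong length is rejected with a ValueError
length_error = None
try:
    df['Bad'] = [1, 2]  # only 2 values for 3 rows
except ValueError as exc:
    length_error = exc
    print('Assignment failed:', exc)

print(df)
```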
📌 Adding a Column Based on Another Column
You can create a new column using existing data.
# Adding a column that calculates years to retirement
df['Years_to_Retire'] = 65 - df['Age']
print(df)
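If you would rather not modify the original DataFrame, one alternative is assign(), which returns a new DataFrame with the extra column. The sketch below uses a smaller stand-in DataFrame, and the name with_retirement is made up for this example.

```python
import pandas as pd

# Small stand-in for the tutorial's DataFrame
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
})

# assign() returns a new DataFrame; df itself is left unchanged
with_retirement = df.assign(Years_to_Retire=65 - df['Age'])

print(with_retirement)
print('Years_to_Retire' in df.columns)
```

This can be handy when chaining several transformations, since each step produces a new DataFrame.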
📌 Adding a Column Using apply()
The apply() method calls a function on each value of a column and collects the results into a new column.
# Adding a column based on a function
def categorize_age(age):
    return 'Young' if age < 30 else 'Old'
df['Age_Group'] = df['Age'].apply(categorize_age)
print(df)
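For a simple two-way condition like this one, a vectorized alternative is numpy.where, which evaluates the condition on the whole column at once. The sketch below is one possible approach, not the only one. It compares both versions on the Age values from the sample data, and the names via_apply and via_where are made up for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Age': [25, 30, 35, 40, 28]})

def categorize_age(age):
    return 'Young' if age < 30 else 'Old'

# Element-by-element: the function is called once per value
via_apply = df['Age'].apply(categorize_age)

# Vectorized: the comparison runs on the whole column at once
via_where = np.where(df['Age'] < 30, 'Young', 'Old')

print(list(via_apply))
print(list(via_where))
```

Both produce the same labels. apply() remains the more flexible choice when the logic does not reduce to a simple condition.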
📌 Adding a Column Using insert()
The insert() method allows adding a column at a specific position.
# Adding a column at index 1
df.insert(1, 'Gender', ['F', 'M', 'M', 'M', 'F'])
print(df)
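Two details of insert() are easy to miss: it modifies the DataFrame in place and returns None, and by default it raises a ValueError if the column name already exists. A minimal sketch on a smaller stand-in DataFrame:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# insert() works in place and returns None, so do not reassign its result
result = df.insert(1, 'Gender', ['F', 'M'])
print(result)
print(list(df.columns))

# Inserting a column name that already exists raises ValueError by default
duplicate_rejected = False
try:
    df.insert(0, 'Gender', ['F', 'M'])
except ValueError as exc:
    duplicate_rejected = True
    print('Insert failed:', exc)
```

Writing df = df.insert(...) would therefore replace df with None.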
❌ 3. Removing Columns
Pandas offers several ways to remove columns.
📌 Removing a Column Using drop()
The drop() method returns a new DataFrame without the specified column, so assign the result back to df to keep the change.
# Removing a single column
df = df.drop(columns=['Country'])
print(df)
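The sketch below illustrates two behaviors of drop() on a smaller stand-in DataFrame: the original is untouched unless you reassign the result, and dropping a missing column raises a KeyError unless you pass errors='ignore'. The Salary column is hypothetical and deliberately absent from the data.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Country': ['USA', 'USA']})

# drop() returns a new DataFrame; df keeps its columns
trimmed = df.drop(columns=['Country'])
print(list(df.columns))
print(list(trimmed.columns))

# Dropping a column that does not exist raises KeyError by default
missing_raised = False
try:
    df.drop(columns=['Salary'])  # 'Salary' is hypothetical and not in df
except KeyError as exc:
    missing_raised = True
    print('Drop failed:', exc)

# errors='ignore' skips labels that are not present
safe = df.drop(columns=['Salary'], errors='ignore')
print(list(safe.columns))
```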
📌 Removing Multiple Columns
You can pass a list of column names to drop().
# Removing multiple columns
df = df.drop(columns=['Age_Group', 'Years_to_Retire'])
print(df)
📌 Removing a Column Using del
The del keyword removes a column in place, so no reassignment is needed.
# Removing a column using del
del df['Gender']
print(df)
📌 Removing a Column Using pop()
The pop() method removes a column and returns it.
# Removing a column using pop
age_column = df.pop('Age')
print(df)
print("Removed column:", age_column)
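Because pop() hands back the removed column as a Series, you can keep working with it or reattach it later. A minimal sketch on a smaller stand-in DataFrame, where the names age_column and mean_age are made up for this example:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# pop() removes the column in place and returns it as a Series
age_column = df.pop('Age')
print(type(age_column).__name__)
print(list(df.columns))

# The returned Series is still usable on its own
mean_age = age_column.mean()
print(mean_age)

# Reattaching it appends the column at the end
df['Age'] = age_column
print(list(df.columns))
```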
🎯 Conclusion
Adding and removing columns in a Pandas DataFrame is essential for data manipulation. You can add columns using assignment, insert(), and apply(). Removing columns can be done using drop(), del, or pop(). Mastering these techniques will help you efficiently manage your datasets in Python.
