📖 Introduction
Pandas is a powerful Python library for data analysis and manipulation. When working with DataFrames, adding and removing columns is a common task. This tutorial will guide you through various ways to add new columns and remove existing ones using Pandas.
🗂️ 1. Creating a Sample DataFrame
Before we dive into adding and removing columns, let’s create a sample DataFrame to work with.
import pandas as pd
# Creating a sample DataFrame
data = {
'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
'Age': [25, 30, 35, 40, 28],
'City': ['New York', 'Los Angeles', 'Chicago', 'Houston', 'Phoenix']
}
df = pd.DataFrame(data)
print(df)
✅ Output:
      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
3    David   40      Houston
4      Eve   28      Phoenix
Now, let’s explore how to add and remove columns in this DataFrame.
➕ 2. Adding Columns
There are multiple ways to add new columns to a Pandas DataFrame.
📌 Adding a Column with a Fixed Value
You can add a new column by assigning a constant value to it.
# Adding a new column with a fixed value
df['Country'] = 'USA'
print(df)
✅ Output:
      Name  Age         City  Country
0    Alice   25     New York      USA
1      Bob   30  Los Angeles      USA
2  Charlie   35      Chicago      USA
3    David   40      Houston      USA
4      Eve   28      Phoenix      USA
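Assigning a single value works because Pandas broadcasts it to every row. A list, by contrast, needs exactly one value per row. The sketch below illustrates both cases on a small stand-in DataFrame; the `Score` and `Too_Short` columns are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]})

# A scalar is broadcast to every row
df['Country'] = 'USA'

# A list must supply exactly one value per row
df['Score'] = [88, 92, 79]  # hypothetical column, not part of the tutorial data

# A list of the wrong length raises ValueError
try:
    df['Too_Short'] = [1, 2]
    length_error = None
except ValueError as exc:
    length_error = exc

print(df)
print("Error:", length_error)
```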
📌 Adding a Column Based on Another Column
You can create a new column using existing data.
# Adding a column that calculates years to retirement
df['Years_to_Retire'] = 65 - df['Age']
print(df)
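If you would rather not modify the original DataFrame, `assign()` is one alternative: it returns a new DataFrame with the extra column. This is a minimal sketch on a two-row stand-in DataFrame; the variable name `with_retirement` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# assign() returns a new DataFrame; df itself is left unchanged
with_retirement = df.assign(Years_to_Retire=65 - df['Age'])

print(with_retirement)
print(df.columns.tolist())  # still only the original columns
```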
📌 Adding a Column Using apply()
You can use apply() to generate a column dynamically.
# Adding a column based on a function
def categorize_age(age):
    return 'Young' if age < 30 else 'Old'
df['Age_Group'] = df['Age'].apply(categorize_age)
print(df)
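For a simple two-way condition like this one, a vectorized expression can replace the per-row function call. The sketch below uses NumPy's `where()` as one possible alternative, not as the only way to do it; it assumes NumPy is available (it is installed alongside Pandas).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Age': [25, 30, 35, 40, 28]})

# Evaluate the condition for the whole column at once:
# 'Young' where Age < 30, 'Old' everywhere else
df['Age_Group'] = np.where(df['Age'] < 30, 'Young', 'Old')

print(df)
```

On large DataFrames this is usually faster than apply(), because the comparison runs on the whole column rather than one value at a time.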
📌 Adding a Column Using insert()
The insert() method allows adding a column at a specific position.
# Adding a column at index 1
df.insert(1, 'Gender', ['F', 'M', 'M', 'M', 'F'])
print(df)
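Two details of insert() are easy to miss: it modifies the DataFrame in place (returning None), and by default it refuses to add a column whose name already exists. A small sketch on a two-row stand-in DataFrame:

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# insert() changes df in place and returns None
result = df.insert(1, 'Gender', ['F', 'M'])

# Inserting a column name that already exists raises ValueError by default
try:
    df.insert(0, 'Gender', ['F', 'M'])
    duplicate_error = None
except ValueError as exc:
    duplicate_error = exc

print(df.columns.tolist())
print("Error:", duplicate_error)
```

Because of the in-place behavior, writing `df = df.insert(...)` would replace your DataFrame with None.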
❌ 3. Removing Columns
You can remove columns using different methods.
📌 Removing a Column Using drop()
The drop() method allows you to delete a column.
# Removing a single column
df = df.drop(columns=['Country'])
print(df)
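The example above reassigns the result to `df` because drop() returns a new DataFrame by default and leaves the original untouched. The sketch below makes that visible on a small stand-in DataFrame; the name `trimmed` is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Country': ['USA', 'USA']})

# drop() returns a new DataFrame without the column
trimmed = df.drop(columns=['Country'])

print(trimmed.columns.tolist())  # 'Country' is gone here
print(df.columns.tolist())       # the original still has it
```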
📌 Removing Multiple Columns
You can pass a list of column names to drop().
# Removing multiple columns
df = df.drop(columns=['Age_Group', 'Years_to_Retire'])
print(df)
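When dropping several columns, one missing name is enough to raise a KeyError. If some columns may or may not be present, `errors='ignore'` skips the missing ones. In this sketch, `Salary` is a hypothetical column that does not exist in the DataFrame.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})

# By default, a missing column name raises KeyError
try:
    df.drop(columns=['Age', 'Salary'])
    missing_error = None
except KeyError as exc:
    missing_error = exc

# errors='ignore' drops the columns that exist and skips the rest
cleaned = df.drop(columns=['Age', 'Salary'], errors='ignore')

print(cleaned)
```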
📌 Removing a Column Using del
The del keyword can also be used to remove a column. It modifies the DataFrame in place.
# Removing a column using del
del df['Gender']
print(df)
📌 Removing a Column Using pop()
The pop() method removes a column from the DataFrame in place and returns it as a Series.
# Removing a column using pop
age_column = df.pop('Age')
print(df)
print("Removed column:", age_column)
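Because pop() hands back the removed column, it pairs naturally with insert() when you want to move a column rather than discard it. The following is a minimal sketch of that idea on a stand-in DataFrame, not the only way to reorder columns.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles'],
})

# pop() removes 'City' from df in place and returns it as a Series
city = df.pop('City')

# Re-insert the Series at position 0 to make it the first column
df.insert(0, 'City', city)

print(df)
```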
🎯 Conclusion
Adding and removing columns in a Pandas DataFrame is essential for data manipulation. You can add columns using assignment, insert(), and apply(). Removing columns can be done using drop(), del, or pop(). Mastering these techniques will help you efficiently manage your datasets in Python.