📖 Introduction
Pandas is a powerful data analysis and manipulation library for Python. One of its core structures is the DataFrame, which is a two-dimensional, tabular data structure similar to a spreadsheet or SQL table. In this tutorial, we will explore how to create a Pandas DataFrame from different data sources such as dictionaries, CSV files, Excel files, and JSON.
🗂️ 1. Creating a DataFrame from a Dictionary
A Python dictionary consists of key-value pairs. When a dictionary is passed to pd.DataFrame, each key becomes a column name and each value (here, a list) becomes that column's data. Here’s an example:
import pandas as pd
# Creating a dictionary
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'Los Angeles', 'Chicago']
}
# Creating DataFrame
df = pd.DataFrame(data)
print(df)
✅ Output:
      Name  Age         City
0    Alice   25     New York
1      Bob   30  Los Angeles
2  Charlie   35      Chicago
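The dictionary form expects every column list to have the same length. The following is a minimal sketch of what happens when they do not; the bad_data dictionary is a made-up input for illustration only.

```python
import pandas as pd

# Hypothetical input: 'Age' has one value fewer than 'Name'.
bad_data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30],
}

try:
    pd.DataFrame(bad_data)
    mismatch_detected = False
except ValueError as err:
    # pandas refuses to build a DataFrame from columns of unequal length
    mismatch_detected = True
    print(f"Could not build DataFrame: {err}")
```

Checking list lengths before building the DataFrame is usually easier than debugging this error afterwards.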
📄 2. Creating a DataFrame from a CSV File
CSV (Comma-Separated Values) files are commonly used to store tabular data. You can read a CSV file into a Pandas DataFrame using the read_csv function.
📌 Sample CSV Data (data.csv):
Name,Age,City
Alice,25,New York
Bob,30,Los Angeles
Charlie,35,Chicago
# Reading a CSV file
df = pd.read_csv('data.csv')
print(df.head()) # Display first five rows
Ensure that the data.csv file is present in the working directory, or provide the full file path.
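To experiment without creating a file first, read_csv also accepts a file-like object. The sketch below holds the sample data in a string; the csv_text variable is an illustrative name and is not one of the tutorial's files.

```python
import io

import pandas as pd

# The same sample rows as data.csv, kept in memory for a quick test.
csv_text = """Name,Age,City
Alice,25,New York
Bob,30,Los Angeles
Charlie,35,Chicago
"""

# io.StringIO wraps the string so read_csv can treat it like an open file.
df = pd.read_csv(io.StringIO(csv_text))
print(df.dtypes)  # shows the column types pandas inferred
```

The first line of the text is used as the header row, just as it is when reading from disk.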
📊 3. Creating a DataFrame from an Excel File
Excel files are widely used in data analysis. Pandas provides read_excel to read Excel files into a DataFrame.
📌 Sample Excel Data (data.xlsx):
Name | Age | City |
---|---|---|
Alice | 25 | New York |
Bob | 30 | Los Angeles |
Charlie | 35 | Chicago |
# Reading an Excel file
df = pd.read_excel('data.xlsx', sheet_name='Sheet1')
print(df.head())
Reading Excel files requires an extra engine package: openpyxl for .xlsx files, or xlrd for legacy .xls files. Install whichever you need, for example: pip install openpyxl xlrd.
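One way to check whether these optional packages are available before calling read_excel is to ask Python's import machinery. This is only a sketch; the is_installed helper is a hypothetical name introduced here and is not part of Pandas.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the named top-level package can be found in this environment."""
    return importlib.util.find_spec(package_name) is not None


# Report the status of the two Excel engines mentioned above.
for engine in ('openpyxl', 'xlrd'):
    status = 'installed' if is_installed(engine) else 'missing'
    print(f"{engine}: {status}")
```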
🔗 4. Creating a DataFrame from a JSON File
JSON (JavaScript Object Notation) is a lightweight data format that is widely used in web applications. You can create a DataFrame from a JSON file using read_json.
📌 Sample JSON Data (data.json):
[
{"Name": "Alice", "Age": 25, "City": "New York"},
{"Name": "Bob", "Age": 30, "City": "Los Angeles"},
{"Name": "Charlie", "Age": 35, "City": "Chicago"}
]
# Reading a JSON file
df = pd.read_json('data.json')
print(df.head())
The file must contain valid JSON. An array of objects like the sample above produces one row per object, with the object keys used as column names.
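The same array-of-objects layout can be tried without a file by wrapping a JSON string in a file-like object. This is a small sketch; json_text is an illustrative variable name.

```python
import io

import pandas as pd

# The sample records from data.json, kept in memory for a quick test.
json_text = """[
  {"Name": "Alice", "Age": 25, "City": "New York"},
  {"Name": "Bob", "Age": 30, "City": "Los Angeles"},
  {"Name": "Charlie", "Age": 35, "City": "Chicago"}
]"""

# Each object in the array becomes one row of the DataFrame.
df = pd.read_json(io.StringIO(json_text))
print(df.shape)
```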
🎯 Conclusion
Pandas provides multiple ways to create a DataFrame from different data sources, making it a flexible tool for data analysis. Whether working with dictionaries, CSV, Excel, or JSON files, Pandas makes data manipulation easy and efficient. By mastering these techniques, you can seamlessly work with various data formats in Python.