Pandas DataFrame Attributes: A Comprehensive Guide with Examples
- Authors: Chris Fitzgerald
Pandas is a popular Python library for data manipulation and analysis. One of its key data structures is the DataFrame, which you can think of as an in-memory 2D table (like a spreadsheet), with labeled axes (rows and columns). DataFrames are incredibly versatile and efficient for data tasks, making them a staple in data science and software engineering projects.
For our examples, let's assume we're dealing with a dataset of Magic: The Gathering cards. Each card has various attributes such as Name, Type, Rarity, and ManaCost.
import pandas as pd
# Sample Magic: The Gathering DataFrame
data = {
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1]
}
mtg_df = pd.DataFrame(data)
Now, let's explore the various DataFrame attributes.
Common DataFrame Attributes
df.shape
1. Returns a tuple representing the dimensions of the DataFrame.
print(mtg_df.shape) # Output: (3, 4)
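Because shape is an ordinary tuple, it can be unpacked directly. The following is a minimal sketch that rebuilds the sample DataFrame so it runs on its own; the names n_rows and n_cols are chosen here for illustration.

```python
import pandas as pd

mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

# shape is a plain tuple: (number of rows, number of columns)
n_rows, n_cols = mtg_df.shape
print(f"{n_rows} cards, {n_cols} attributes")  # 3 cards, 4 attributes

# len() of a DataFrame counts rows, so it matches shape[0]
print(len(mtg_df) == n_rows)  # True
```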
df.index
2. The index (row labels) of the DataFrame.
print(mtg_df.index) # Output: RangeIndex(start=0, stop=3, step=1)
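The default RangeIndex is only one possibility. As a small sketch, assuming we want to look cards up by name, set_index can promote a column to the row labels; the variable by_name is introduced here for illustration only.

```python
import pandas as pd

mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

# set_index returns a new DataFrame whose row labels are the card names
by_name = mtg_df.set_index('Name')
print(by_name.index.tolist())  # ['Black Lotus', 'Time Walk', 'Ancestral Recall']
print(by_name.index.name)      # Name

# The original DataFrame keeps its default integer labels
print(mtg_df.index.tolist())   # [0, 1, 2]
```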
df.columns
3. The column labels of the DataFrame.
print(mtg_df.columns) # Output: Index(['Name', 'Type', 'Rarity', 'ManaCost'], dtype='object')
df.dtypes
4. Data types of each column.
print(mtg_df.dtypes)
# Output:
# Name object
# Type object
# Rarity object
# ManaCost int64
# dtype: object
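One practical use of column dtypes is picking out columns by kind. The sketch below uses select_dtypes to find the numeric columns of the sample data; numeric_cols is a name introduced here for illustration.

```python
import pandas as pd

mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

# dtypes is a Series indexed by column name, so one column's dtype
# can be looked up by label
mana_dtype = mtg_df.dtypes['ManaCost']
print(pd.api.types.is_integer_dtype(mana_dtype))  # True

# select_dtypes filters columns by dtype; 'number' matches numeric columns
numeric_cols = mtg_df.select_dtypes(include='number').columns.tolist()
print(numeric_cols)  # ['ManaCost']
```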
df.size
5. Total number of elements.
print(mtg_df.size) # Output: 12
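The size is simply rows times columns, which the short check below confirms against shape. It also shows that a single column (a Series) reports its own length as its size.

```python
import pandas as pd

mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

rows, cols = mtg_df.shape
print(mtg_df.size == rows * cols)  # True

# For a single column, size is just the number of rows
print(mtg_df['Name'].size)  # 3
```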
df.values
6. NumPy representation of the DataFrame. For new code, the pandas documentation recommends the to_numpy() method instead.
print(mtg_df.values)
# Output:
# [['Black Lotus' 'Artifact' 'Rare' 0]
# ['Time Walk' 'Sorcery' 'Rare' 2]
# ['Ancestral Recall' 'Instant' 'Rare' 1]]
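Because the sample columns mix text and integers, the combined array has to fall back to the generic object dtype. A brief sketch using to_numpy() shows the difference between converting the whole DataFrame and converting a single numeric column; all_cells and costs are illustrative names.

```python
import pandas as pd

mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

# Mixed text and integer columns share no common numeric dtype,
# so the whole-frame array is of dtype object
all_cells = mtg_df.to_numpy()
print(all_cells.shape)  # (3, 4)
print(all_cells.dtype)  # object

# A single numeric column keeps its integer dtype
costs = mtg_df['ManaCost'].to_numpy()
print(costs.sum())  # 3
```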
df.T
7. Transposes rows and columns.
print(mtg_df.T)
# Output:
#                     0          1                 2
# Name      Black Lotus  Time Walk  Ancestral Recall
# Type         Artifact    Sorcery           Instant
# Rarity           Rare       Rare              Rare
# ManaCost            0          2                 1
df.empty
8. Boolean value indicating whether the DataFrame is empty, that is, whether any of its axes has length 0.
print(mtg_df.empty) # Output: False
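A common use of empty is checking whether a filter matched anything. The sketch below also illustrates a frequent surprise: a DataFrame that holds only missing values still has rows, so it is not empty. The variable names and the ManaCost threshold of 5 are chosen here for illustration.

```python
import pandas as pd

mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

# A DataFrame with no rows and no columns is empty
no_cards = pd.DataFrame()
print(no_cards.empty)  # True

# A filter that matches nothing keeps the columns but has zero rows
expensive = mtg_df[mtg_df['ManaCost'] > 5]
print(expensive.shape)  # (0, 4)
print(expensive.empty)  # True

# A row of missing values is still a row
all_missing = pd.DataFrame({'ManaCost': [float('nan')]})
print(all_missing.empty)  # False
```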
df.ndim
9. Number of dimensions. For a DataFrame, this will always be 2.
print(mtg_df.ndim) # Output: 2
df.memory_usage()
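10. Memory usage of each column, in bytes. Unlike the other entries in this list, memory_usage() is a method rather than an attribute, so it must be called with parentheses. The exact figure for Index can differ between pandas versions.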
print(mtg_df.memory_usage())
# Output:
# Index 128
# Name 24
# Type 24
# Rarity 24
# ManaCost 24
# dtype: int64
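The default figures can understate text columns, because for object-dtype columns they count only the references to the strings, not the strings themselves. The sketch below compares the default with deep=True and shows the index=False option. Exact byte counts depend on the pandas version and string storage, so none are printed here.

```python
import pandas as pd

mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

shallow = mtg_df.memory_usage()        # default: deep=False
deep = mtg_df.memory_usage(deep=True)  # also inspects the stored strings

# deep=True never reports less than the default for a column
print(deep['Name'] >= shallow['Name'])  # True

# index=False leaves the row index out of the report
no_index = mtg_df.memory_usage(index=False)
print('Index' in no_index.index)  # False

# Summing the Series gives a total for the whole DataFrame, in bytes
total_bytes = deep.sum()
```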
df.at, df.iat
11. Used for quick access to a single element: at looks it up by row and column label, iat by integer position.
print(mtg_df.at[0, 'Name']) # Output: Black Lotus
print(mtg_df.iat[0, 0]) # Output: Black Lotus
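With the default integer index, labels and positions coincide, which hides the difference between the two accessors. As a sketch, assuming the Name column is moved into the index (by_name is an illustrative variable), the same cell is reached by label with at and by position with iat.

```python
import pandas as pd

mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

# Row labels are now card names; remaining columns are Type, Rarity, ManaCost
by_name = mtg_df.set_index('Name')

# at: row label and column label
print(by_name.at['Time Walk', 'ManaCost'])  # 2

# iat: row position 1 and column position 2 address the same cell
print(by_name.iat[1, 2])  # 2
```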
df.axes
12. Returns a list representing the axes of the DataFrame: the row index first, then the column labels.
print(mtg_df.axes)
# Output: [RangeIndex(start=0, stop=3, step=1), Index(['Name', 'Type', 'Rarity', 'ManaCost'], dtype='object')]
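The two entries of axes are the same objects exposed by index and columns. The short check below unpacks the list and compares them; row_axis and col_axis are names introduced here for illustration.

```python
import pandas as pd

mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

# axes is a two-element list: row labels first, column labels second
row_axis, col_axis = mtg_df.axes
print(row_axis.equals(mtg_df.index))    # True
print(col_axis.equals(mtg_df.columns))  # True

# One axis per dimension
print(len(mtg_df.axes) == mtg_df.ndim)  # True
```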
These are some of the most commonly used DataFrame attributes. Pandas offers many more attributes and methods for exploring and manipulating data, which is why it is a standard tool in data science and software engineering projects that involve data analysis.