Pandas DataFrame Attributes: A Comprehensive Guide with Examples

Pandas is a popular Python library for data manipulation and analysis. One of its key data structures is the DataFrame, which you can think of as an in-memory 2D table (like a spreadsheet), with labeled axes (rows and columns). DataFrames are incredibly versatile and efficient for data tasks, making them a staple in data science and software engineering projects.

For our examples, let's assume we're dealing with a dataset of Magic: The Gathering cards. Each card has various attributes such as Name, Type, Rarity, and ManaCost.

import pandas as pd

# Sample Magic: The Gathering DataFrame
data = {
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1]
}

mtg_df = pd.DataFrame(data)

Now, let's explore the various DataFrame attributes.

Common DataFrame Attributes

1. `df.shape`

Returns a tuple `(rows, columns)` giving the dimensions of the DataFrame.

print(mtg_df.shape)  # Output: (3, 4)
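Because `shape` is an ordinary tuple, it can be unpacked into separate row and column counts. A minimal sketch, rebuilding the sample data so the snippet runs on its own (the names `n_rows` and `n_cols` are illustrative, not part of pandas):

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet is self-contained
mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

# shape is (number of rows, number of columns)
n_rows, n_cols = mtg_df.shape
print(f"{n_rows} cards, {n_cols} attributes")  # Output: 3 cards, 4 attributes
```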

2. `df.index`

The index (row labels) of the DataFrame.

print(mtg_df.index)  # Output: RangeIndex(start=0, stop=3, step=1)
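The default index is just a range of integers, but any column can serve as the row labels instead. The sketch below assumes we want to look cards up by name; `by_name` is an illustrative variable, and `set_index` returns a new DataFrame rather than changing `mtg_df`.

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet is self-contained
mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

# Use the Name column as the row labels (returns a new DataFrame)
by_name = mtg_df.set_index('Name')

print(by_name.index.tolist())  # Output: ['Black Lotus', 'Time Walk', 'Ancestral Recall']
print(mtg_df.index.tolist())   # Output: [0, 1, 2]  (original is unchanged)
```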

3. `df.columns`

The column labels of the DataFrame.

print(mtg_df.columns)  # Output: Index(['Name', 'Type', 'Rarity', 'ManaCost'], dtype='object')
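A common use of `columns` is checking whether a column exists before touching it, or getting the labels as a plain Python list. A small sketch follows; `'Power'` is a hypothetical column name chosen only because it is absent from the sample data.

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet is self-contained
mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

# Membership tests work directly on the column Index
print('Rarity' in mtg_df.columns)  # Output: True
print('Power' in mtg_df.columns)   # Output: False ('Power' is a hypothetical column)

# Convert the labels to a regular list
column_names = mtg_df.columns.tolist()
print(column_names)  # Output: ['Name', 'Type', 'Rarity', 'ManaCost']
```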

4. `df.dtypes`

The data type of each column, returned as a Series indexed by column name.

print(mtg_df.dtypes)
# Output:
# Name        object
# Type        object
# Rarity      object
# ManaCost     int64
# dtype: object
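Knowing the dtypes makes it possible to pick out columns by kind. As a rough sketch, `select_dtypes` can keep only the numeric columns, and `pd.api.types.is_integer_dtype` can check a single column (`numeric_cols` is an illustrative name):

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet is self-contained
mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

# Keep only the columns whose dtype is numeric
numeric_cols = mtg_df.select_dtypes(include='number').columns.tolist()
print(numeric_cols)  # Output: ['ManaCost']

# Check one column's dtype without comparing strings
print(pd.api.types.is_integer_dtype(mtg_df['ManaCost']))  # Output: True
```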

5. `df.size`

Total number of elements, that is, the number of rows multiplied by the number of columns.

print(mtg_df.size)  # Output: 12
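It is easy to confuse `size` with `len()`. The sketch below shows the difference on the sample data: `size` counts every cell, while `len()` counts rows only.

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet is self-contained
mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

rows, cols = mtg_df.shape

# size counts cells; len() counts rows
print(mtg_df.size == rows * cols)  # Output: True
print(len(mtg_df))                 # Output: 3
```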

6. `df.values`

NumPy representation of the DataFrame. The pandas documentation recommends the `to_numpy()` method over `values` for new code.

print(mtg_df.values)
# Output:
# [['Black Lotus' 'Artifact' 'Rare' 0]
#  ['Time Walk' 'Sorcery' 'Rare' 2]
#  ['Ancestral Recall' 'Instant' 'Rare' 1]]
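For comparison, here is a sketch of the same conversion using `to_numpy()`. Converting a single numeric column is often more useful than converting the whole mixed-type frame, because the result keeps a numeric dtype (`arr` and `costs` are illustrative names).

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet is self-contained
mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

# Whole frame as a 2D NumPy array
arr = mtg_df.to_numpy()
print(type(arr).__name__)  # Output: ndarray
print(arr.shape)           # Output: (3, 4)

# A single numeric column as a 1D array
costs = mtg_df['ManaCost'].to_numpy()
print(costs.sum())  # Output: 3
```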

7. `df.T`

Transposes rows and columns.

print(mtg_df.T)
# Output:
#                     0          1                 2
# Name      Black Lotus  Time Walk  Ancestral Recall
# Type         Artifact    Sorcery           Instant
# Rarity           Rare       Rare              Rare
# ManaCost            0          2                 1
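One way to see what the transpose did is to inspect the shape and labels of the result. In this sketch the old column names become the row labels and the old row labels become the column names (`transposed` is an illustrative variable):

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet is self-contained
mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

transposed = mtg_df.T

print(transposed.shape)             # Output: (4, 3)
print(transposed.index.tolist())    # Output: ['Name', 'Type', 'Rarity', 'ManaCost']
print(transposed.columns.tolist())  # Output: [0, 1, 2]
```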

8. `df.empty`

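Boolean value indicating whether the DataFrame is empty, meaning at least one of its axes has length 0. A DataFrame that contains only NaN values is not considered empty.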

print(mtg_df.empty)  # Output: False
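`empty` is most useful after a filter that might match nothing. A short sketch, assuming we search for a rarity that does not appear in the sample data (`commons` is an illustrative name):

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet is self-contained
mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

# No card in the sample has Rarity 'Common', so the filter matches no rows
commons = mtg_df[mtg_df['Rarity'] == 'Common']

print(commons.empty)  # Output: True
print(commons.shape)  # Output: (0, 4)  (columns are kept, rows are gone)
```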

9. `df.ndim`

Number of dimensions. For a DataFrame, this will always be 2.

print(mtg_df.ndim)  # Output: 2

10. `df.memory_usage()`

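Memory usage of each column, in bytes. Unlike the other entries in this list, `memory_usage()` is a method rather than an attribute, so it is called with parentheses.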

print(mtg_df.memory_usage())
# Output (exact byte counts can vary with the pandas version and platform):
# Index       128
# Name         24
# Type         24
# Rarity       24
# ManaCost     24
# dtype: int64
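The default report can undercount text columns, because for object columns it measures only the references, not the strings they point to. The sketch below uses the documented `deep` and `index` parameters to compare; printed byte counts are omitted on purpose since they depend on the environment.

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet is self-contained
mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

shallow = mtg_df.memory_usage()               # default report
deep = mtg_df.memory_usage(deep=True)         # also inspects the stored strings
no_index = mtg_df.memory_usage(index=False)   # leave out the 'Index' entry

print(deep.sum())               # total bytes for the whole frame (value varies)
print(no_index.index.tolist())  # Output: ['Name', 'Type', 'Rarity', 'ManaCost']
```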

11. `df.at`, `df.iat`

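`at` looks up a single value by row and column label, while `iat` does the same by integer position. Both are meant for fast access to one scalar.

print(mtg_df.at[0, 'Name'])  # Output: Black Lotus
print(mtg_df.iat[0, 0])  # Output: Black Lotus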
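The difference between the two accessors is easier to see when the row labels are not integers. This sketch assumes the cards are indexed by name (`by_name` is an illustrative variable): `at` takes the labels, and `iat` takes positions counted from zero.

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet is self-contained
mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

# Index by card name; remaining columns are Type, Rarity, ManaCost
by_name = mtg_df.set_index('Name')

# Label-based lookup: row 'Time Walk', column 'ManaCost'
print(by_name.at['Time Walk', 'ManaCost'])  # Output: 2

# Position-based lookup: second row, third column (the same cell)
print(by_name.iat[1, 2])  # Output: 2
```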

12. `df.axes`

Returns a list containing the row axis labels followed by the column axis labels.

print(mtg_df.axes)
# Output: [RangeIndex(start=0, stop=3, step=1), Index(['Name', 'Type', 'Rarity', 'ManaCost'], dtype='object')]
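Since `axes` is a two-item list, it can be unpacked, and its entries match `index` and `columns`. A brief sketch (`row_axis` and `col_axis` are illustrative names):

```python
import pandas as pd

# Same sample data as above, rebuilt so this snippet is self-contained
mtg_df = pd.DataFrame({
    'Name': ['Black Lotus', 'Time Walk', 'Ancestral Recall'],
    'Type': ['Artifact', 'Sorcery', 'Instant'],
    'Rarity': ['Rare', 'Rare', 'Rare'],
    'ManaCost': [0, 2, 1],
})

# First entry is the row axis, second is the column axis
row_axis, col_axis = mtg_df.axes

print(row_axis.equals(mtg_df.index))    # Output: True
print(col_axis.equals(mtg_df.columns))  # Output: True
print(len(mtg_df.axes) == mtg_df.ndim)  # Output: True (one axis per dimension)
```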

These are some of the most commonly used DataFrame attributes. Pandas offers many more attributes and methods for manipulating and exploring data, which makes it a valuable tool for data science and software engineering projects that involve data analysis.
