Filtering rows based on a column condition in a Pandas DataFrame

Last update on December 21 2024 07:43:29 (UTC/GMT +8 hours)

Filter rows based on a condition in a specific column in a Pandas DataFrame.

Sample Solution:

Python Code:

import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Teodosija', 'Sutton', 'Taneli', 'Ravshan', 'Ross', 'Alice', 'Bob', 'Charlie', 'David', 'Emily'],
        'Age': [26, 32, 25, 31, 28, 22, 35, 30, 40, 28],
        'Salary': [50000, 60000, 45000, 70000, 55000, 60000, 70000, 55000, 75000, 65000]}

df = pd.DataFrame(data)

# Filter rows where Age is greater than 30
filtered_rows = df[df['Age'] > 30]

# Display the filtered rows
print(filtered_rows)

Output:

      Name  Age  Salary
1   Sutton   32   60000
3  Ravshan   31   70000
6      Bob   35   70000
8    David   40   75000

Explanation:

In the exercise above -

The code first creates a sample DataFrame (df) with the columns 'Name', 'Age', and 'Salary'.
The condition df['Age'] > 30 produces a boolean Series that is True for rows where the age is greater than 30 and False otherwise.
Boolean indexing, df[df['Age'] > 30], selects only the rows for which that Series is True.
The resulting DataFrame (filtered_rows) therefore contains only rows where the age is greater than 30, and those rows keep their original index labels (1, 3, 6, 8).
Finally, print(filtered_rows) displays the filtered rows on the console.
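
The two steps described above (building the boolean Series, then indexing with it) can be separated to make the intermediate result visible. The following is a minimal sketch, not part of the original solution: the variable names mask and filtered_loc are introduced here for illustration only, and the .loc form is shown as an equivalent spelling of the same selection.

```python
import pandas as pd

# Same sample data as in the solution above
df = pd.DataFrame({
    'Name': ['Teodosija', 'Sutton', 'Taneli', 'Ravshan', 'Ross',
             'Alice', 'Bob', 'Charlie', 'David', 'Emily'],
    'Age': [26, 32, 25, 31, 28, 22, 35, 30, 40, 28],
    'Salary': [50000, 60000, 45000, 70000, 55000,
               60000, 70000, 55000, 75000, 65000],
})

# Step 1: the comparison alone yields a boolean Series aligned with df's index
# ('mask' is an illustrative name, not used in the original code)
mask = df['Age'] > 30
print(mask.tolist())
# [False, True, False, True, False, False, True, False, True, False]

# Step 2: indexing with the mask keeps only the rows marked True
filtered_rows = df[mask]

# The same selection written with .loc ('filtered_loc' is also illustrative)
filtered_loc = df.loc[mask]

# Both spellings select the same rows
print(filtered_rows.equals(filtered_loc))
# True
```

Storing the mask in its own variable can help when the same condition is reused or needs to be inspected. Note that Charlie (Age 30) is excluded because the comparison is strictly greater than 30.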
