Filtering DataFrame rows by column values in Pandas using a NumPy array
Extract rows from a Pandas DataFrame where a specific column's values are in a given NumPy array.
Sample Solution:
Python Code:
import pandas as pd
import numpy as np
# Create a sample DataFrame
data = {'Name': ['Teodosija', 'Sutton', 'Taneli', 'David', 'Emily'],
        'Age': [25, 30, 22, 35, 28],
        'Salary': [50000, 60000, 45000, 70000, 55000]}
df = pd.DataFrame(data)
# Define a NumPy array with values to filter by
selected_age_values = np.array([25, 35])
# Extract rows where 'Age' column values are in the NumPy array
selected_rows = df[df['Age'].isin(selected_age_values)]
# Display the selected rows
print(selected_rows)
Output:
        Name  Age  Salary
0  Teodosija   25   50000
3      David   35   70000
Explanation:
In the exercise above:
- We first create a sample DataFrame (df) with the columns 'Name', 'Age', and 'Salary'.
- We define a NumPy array selected_age_values containing the values we want to filter by in the 'Age' column.
- The df['Age'].isin(selected_age_values) condition creates a boolean Series, and boolean indexing is used to extract rows where the condition is True.
- The resulting DataFrame (selected_rows) contains only rows where the 'Age' column values are in the specified NumPy array.
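To make the intermediate step visible, the sketch below rebuilds the same sample data and stores the boolean Series in its own variable before indexing. The names mask, other_rows, and numpy_mask are illustrative and not part of the original solution. The inverted filter and the np.isin variant are optional extras, shown on the assumption that the same mask may be needed in more than one form.

```python
import numpy as np
import pandas as pd

# Same sample data as in the solution above
df = pd.DataFrame({
    'Name': ['Teodosija', 'Sutton', 'Taneli', 'David', 'Emily'],
    'Age': [25, 30, 22, 35, 28],
    'Salary': [50000, 60000, 45000, 70000, 55000],
})
selected_age_values = np.array([25, 35])

# Step 1: the membership test yields one boolean per row
mask = df['Age'].isin(selected_age_values)
print(mask.tolist())  # [True, False, False, True, False]

# Step 2: boolean indexing keeps only the rows where mask is True
selected_rows = df[mask]

# Inverting the mask with ~ keeps the rows whose 'Age' is NOT in the array
other_rows = df[~mask]
print(other_rows['Name'].tolist())  # ['Sutton', 'Taneli', 'Emily']

# np.isin builds the same mask as a plain NumPy boolean array
numpy_mask = np.isin(df['Age'].to_numpy(), selected_age_values)
```

Keeping the mask in a variable is a small design choice: it can be inspected, inverted, or combined with other conditions without repeating the isin call.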