Pandas Data Series: Display most frequent value in a given series and replace everything else as ‘Other’ in the series
20. Replace with Most Frequent
Write a Pandas program to display the most frequent value in a given series and replace every other value with ‘Other’.
Sample Solution:
Python Code:
import pandas as pd
import numpy as np
num_series = pd.Series(np.random.randint(1, 5, [15]))
print("Original Series:")
print(num_series)
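# Despite the label, this prints the counts of every distinct value
print("Top 2 Freq:", num_series.value_counts())
# Cast to object so the string 'Other' can be stored alongside the integers
num_series = num_series.astype(object)
# Keep the most frequent value and replace everything else with 'Other'
num_series[~num_series.isin(num_series.value_counts().index[:1])] = 'Other'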
print(num_series)
Sample Output:
Original Series:
0     3
1     1
2     1
3     3
4     2
5     2
6     1
7     2
8     3
9     1
10    2
11    2
12    2
13    3
14    3
dtype: int64
Top 2 Freq: 2    6
3    5
1    4
dtype: int64
0     Other
1     Other
2     Other
3     Other
4         2
5         2
6     Other
7         2
8     Other
9     Other
10        2
11        2
12        2
13    Other
14    Other
dtype: object
Explanation:
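num_series = pd.Series(np.random.randint(1, 5, [15])): This line creates a Pandas Series object 'num_series' containing 15 random integers using the np.random.randint() method. The upper bound is exclusive, so the values range from 1 to 4. Because the data is random, each run produces a different series.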
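The bounds of np.random.randint() are easy to misread, so the following is a small check, offered as a sketch: the upper bound is exclusive, so the call can only produce 1, 2, 3, or 4. The seed value 0 is an arbitrary choice made here so that the run is repeatable; it is not part of the original solution.

```python
import numpy as np
import pandas as pd

# Seed chosen arbitrarily for repeatability (an assumption, not in the original)
np.random.seed(0)

num_series = pd.Series(np.random.randint(1, 5, [15]))

# The 'high' argument of randint is exclusive, so every value lies in 1..4
print("All values within 1..4:", num_series.between(1, 4).all())
print("Length:", len(num_series))
```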
num_series[~num_series.isin(num_series.value_counts().index[:1])] = 'Other': The .value_counts() method returns the distinct values sorted by frequency in descending order, so the slice .index[:1] holds the most frequent value. The .isin() method then builds a boolean mask that is True wherever 'num_series' equals that value. The tilde (~) operator negates the mask, so it selects every value that is not the most frequent one, and the assignment overwrites those values in place with the string 'Other'. Because the series now mixes integers and strings, its dtype becomes object. If two values tie for the highest count, only one of them is kept.
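As an alternative to assigning through a mask, the same result can be sketched with Series.where(), which returns a new Series and leaves the original untouched. The data below is copied from the sample output so the result can be checked by eye; the name top_value is introduced here for illustration only.

```python
import pandas as pd

# Fixed data copied from the sample output, so the result is checkable
num_series = pd.Series([3, 1, 1, 3, 2, 2, 1, 2, 3, 1, 2, 2, 2, 3, 3])

# value_counts() sorts by count in descending order, so the first index
# label is the most frequent value (here 2, which occurs 6 times)
top_value = num_series.value_counts().index[0]

# where() keeps entries where the condition is True and substitutes
# 'Other' elsewhere; it returns a new Series and does not modify num_series
result = num_series.where(num_series == top_value, 'Other')

print("Most frequent value:", top_value)
print(result)
```

This variant avoids writing a string into an integer Series in place, at the cost of holding a second Series in memory.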
For more practice, solve these related problems:
- Write a Pandas program to identify the most frequent value in a Series and replace all other values with 'Other'.
- Write a Pandas program to replace values in a Series with 'Other' if they do not belong to the top two frequent values.
- Write a Pandas program to display the most frequent element of a Series and mask remaining elements as NaN.
- Write a Pandas program to filter a Series to show only the top frequent values and replace others with a custom label.
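One possible starting point for the related problems is a small helper that keeps the n most frequent values and replaces the rest with a chosen label. This is a sketch under assumptions: keep_top_n is a hypothetical function written for this illustration, not part of pandas, and the data is the series from the sample output.

```python
import numpy as np
import pandas as pd

def keep_top_n(series, n=1, label="Other"):
    """Keep the n most frequent values of `series`; replace the rest with `label`.

    Hypothetical helper written for this illustration, not a pandas function.
    """
    # value_counts() sorts by count in descending order
    top_values = series.value_counts().index[:n]
    # where() keeps matching entries and substitutes `label` elsewhere
    return series.where(series.isin(top_values), label)

num_series = pd.Series([3, 1, 1, 3, 2, 2, 1, 2, 3, 1, 2, 2, 2, 3, 3])

# Keep the top two frequent values (2 and 3); replace the rest with 'Other'
top_two = keep_top_n(num_series, n=2)

# Keep only the most frequent value; mask everything else as NaN
masked = keep_top_n(num_series, n=1, label=np.nan)

print(top_two)
print(masked)
```

Masking with NaN converts the integer series to a floating-point dtype, since NaN is a float value.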