Pandas Data Series: Display the most frequent value in a given series and replace everything else with ‘Other’
Write a Pandas program to display the most frequent value in a given series and replace every other value with ‘Other’.
Sample Solution:
Python Code:
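import pandas as pd
import numpy as np
# 15 random integers from 1 to 4 (the upper bound 5 is exclusive)
num_series = pd.Series(np.random.randint(1, 5, [15]))
print("Original Series:")
print(num_series)
# value_counts() lists every distinct value, most frequent first
print("Top 2 Freq:", num_series.value_counts())
# Cast to object so the integer series can also hold the string 'Other'
num_series = num_series.astype(object)
# Keep the most frequent value and replace everything else with 'Other'
num_series[~num_series.isin(num_series.value_counts().index[:1])] = 'Other'
print(num_series)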
Sample Output:
Original Series:
0     3
1     1
2     1
3     3
4     2
5     2
6     1
7     2
8     3
9     1
10    2
11    2
12    2
13    3
14    3
dtype: int64
Top 2 Freq: 2    6
3    5
1    4
dtype: int64
0     Other
1     Other
2     Other
3     Other
4         2
5         2
6     Other
7         2
8     Other
9     Other
10        2
11        2
12        2
13    Other
14    Other
dtype: object
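Because the series is drawn at random, the values shown in the sample output will differ from run to run. The following is a minimal sketch of one way to make a run repeatable. It assumes a fixed seed (0 is an arbitrary choice, not part of the original solution) and also checks that the generated values stay within 1 to 4.

```python
import numpy as np
import pandas as pd

# Assumption: the seed value 0 is arbitrary and only makes runs repeatable.
np.random.seed(0)
num_series = pd.Series(np.random.randint(1, 5, [15]))

# The upper bound passed to randint is exclusive, so every value lies in 1..4.
print(num_series.between(1, 4).all())

# Re-seeding with the same value reproduces exactly the same series.
np.random.seed(0)
same_series = pd.Series(np.random.randint(1, 5, [15]))
print(num_series.equals(same_series))
```

Both checks should print True. Removing the seed calls restores the original behaviour of a fresh series on each run.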
Explanation:
num_series = pd.Series(np.random.randint(1, 5, [15])): This line creates a Pandas Series object 'num_series' containing 15 random integers from 1 to 4 using the np.random.randint() method. The upper bound 5 is exclusive, so the value 5 is never generated.
num_series[~num_series.isin(num_series.value_counts().index[:1])] = 'Other': The .value_counts() method returns the counts sorted from most to least frequent, so .index[:1] holds only the most frequent value. The .isin() method builds a boolean mask that is True wherever an element of 'num_series' equals that value, and the tilde (~) operator negates the mask so that it selects every other element. Assigning 'Other' through the negated mask replaces those elements in place. Because the series then mixes integers and strings, its dtype becomes object. If several values tie for the highest count, only the one listed first by .value_counts() is kept.
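The same replacement can also be written without modifying the original series. The sketch below is one possible variant, not the exercise's own solution: keep_most_frequent is a hypothetical helper name, and it uses Series.where(), which keeps elements where the condition is True and substitutes the given value elsewhere. The input values are taken from the sample output above.

```python
import pandas as pd

def keep_most_frequent(series, other="Other"):
    """Return a new series where all but the most frequent value become `other`.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    # value_counts() sorts by descending count, so the first label is the top value.
    top_value = series.value_counts().index[0]
    # where() keeps entries where the condition holds and fills in `other`
    # everywhere else, returning a new series and leaving the input untouched.
    return series.where(series == top_value, other)

sample = pd.Series([3, 1, 1, 3, 2, 2, 1, 2, 3, 1, 2, 2, 2, 3, 3])
result = keep_most_frequent(sample)
print(result)
```

Here 2 occurs six times, so it is kept and every other entry becomes 'Other', while 'sample' still holds the original integers. The helper assumes a non-empty series; on an empty one, index[0] raises an IndexError.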