Pandas: GroupBy with condition of two labels and ranges

Last update on May 02 2025 12:46:26 (UTC/GMT +8 hours)

29. Grouping by Salesman ID into Ranges

Write a Pandas program to split a given dataset using group by on specified column into two labels and ranges.

Split the group on 'salesman_id',
Ranges:
1) (5001...5006)
2) (5007..5012)

Test Data:

    salesman_id  sale_jan
0          5001    150.50
1          5002    270.65
2          5003     65.26
3          5004    110.50
4          5005    948.50
5          5006   2400.60
6          5007   1760.00
7          5008   2983.43
8          5009    480.40
9          5010   1250.45
10         5011     75.29
11         5012   1045.60

Sample Solution:

Python Code :

import pandas as pd
import numpy as np
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
df = pd.DataFrame({
'salesman_id': [5001,5002,5003,5004,5005,5006,5007,5008,5009,5010,5011,5012],
'sale_jan':[150.5, 270.65, 65.26, 110.5, 948.5, 2400.6, 1760, 2983.43, 480.4,  1250.45, 75.29,1045.6]})
print("Original Orders DataFrame:")
print(df)
result = df.groupby(pd.cut(df['salesman_id'], 
                  bins=[0,5006,np.inf],  
                  labels=['S1', 'S2']))['sale_jan'].sum().reset_index()
print("\nGroupBy with condition of  two labels and ranges:")
print(result)

Sample Output:

Original Orders DataFrame:
    salesman_id  sale_jan
0          5001    150.50
1          5002    270.65
2          5003     65.26
3          5004    110.50
4          5005    948.50
5          5006   2400.60
6          5007   1760.00
7          5008   2983.43
8          5009    480.40
9          5010   1250.45
10         5011     75.29
11         5012   1045.60

GroupBy with condition of  two labels and ranges:
  salesman_id  sale_jan
0          S1   3946.01
1          S2   7595.17

For more Practice: Solve these Related Problems:

Write a Pandas program to group a dataframe by salesman_id and then assign each salesman to a range label based on their id.
Write a Pandas program to split the dataframe into two groups based on whether salesman_id falls within (5001…5006) or (5007…5012).
Write a Pandas program to group the dataset by salesman_id and then create a new column indicating the range category.
Write a Pandas program to segment salesman_id values into two bins and then group by these bins to calculate summary statistics.

Go to:

Previous: Write a Pandas program to split a given dataset, group by one column and remove those groups if all the values of a specific columns are not available.
Next: Write a Pandas program to split the following dataset using group by on first column and aggregate over multiple lists on second column.

Python Code Editor:

Have another way to solve this solution? Contribute your code (and comments) through Disqus.