w3resource

Pandas: Groupby and aggregate over multiple lists


Write a Pandas program to split the following dataset using group by on first column and aggregate over multiple lists on second column.

Test Data:

  student_id         marks
0       S001  [88, 89, 90]
1       S001  [78, 81, 60]
2       S002  [84, 83, 91]
3       S002  [84, 88, 91]
4       S003  [90, 89, 92]
5       S003  [88, 59, 90]  

Sample Solution:

Python Code :

import pandas as pd
import numpy as np
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
df = pd.DataFrame({
    'student_id': ['S001','S001','S002','S002','S003','S003'],
    'marks': [[88,89,90],[78,81,60],[84,83,91],[84,88,91],[90,89,92],[88,59,90]]})
print("Original DataFrame:")
print(df)
print("\nGroupby and aggregate over multiple lists:")
result = df.set_index('student_id')['marks'].groupby('student_id').apply(list).apply(lambda x: np.mean(x,0))
print(result)

Sample Output:

Original DataFrame:
  student_id         marks
0       S001  [88, 89, 90]
1       S001  [78, 81, 60]
2       S002  [84, 83, 91]
3       S002  [84, 88, 91]
4       S003  [90, 89, 92]
5       S003  [88, 59, 90]

Groupby and aggregate over multiple lists:
student_id
S001    [83.0, 85.0, 75.0]
S002    [84.0, 85.5, 91.0]
S003    [89.0, 74.0, 91.0]
Name: marks, dtype: object

Python Code Editor:

Have another way to solve this solution? Contribute your code (and comments) through Disqus.

Previous: Write a Pandas program to split a given dataset using group by on specified column into two labels and ranges.
Next: Write a Pandas program to split the following dataset using group by on ‘salesman_id’ and find the first order date for each group.

What is the difficulty level of this exercise?

Test your Programming skills with w3resource's quiz.



Follow us on Facebook and Twitter for latest update.