w3resource

Pandas DataFrame: Shuffle a given DataFrame rows


Write a Pandas program to shuffle a given DataFrame rows.
Sample data:
Original DataFrame:
attempts name qualify score
0 1 Anastasia yes 12.5
1 3 Dima no 9.0
2 2 Katherine yes 16.5
3 3 James no NaN
4 2 Emily no 9.0
5 3 Michael yes 20.0
6 1 Matthew yes 14.5
7 1 Laura no NaN
8 2 Kevin no 8.0
9 1 Jonas yes 19.0
New DataFrame:
attempts name qualify score
5 3 Michael yes 20.0
0 1 Anastasia yes 12.5
9 1 Jonas yes 19.0
6 1 Matthew yes 14.5
7 1 Laura no NaN
1 3 Dima no 9.0
3 3 James no NaN
4 2 Emily no 9.0
8 2 Kevin no 8.0
2 2 Katherine yes 16.5

Sample Solution :

Python Code :

import pandas as pd
import numpy as np
exam_data = {'name': ['Anastasia', 'Dima', 'Katherine', 'James', 'Emily', 'Michael', 'Matthew', 'Laura', 'Kevin', 'Jonas'],
        'score': [12.5, 9, 16.5, np.nan, 9, 20, 14.5, np.nan, 8, 19],
        'attempts': [1, 3, 2, 3, 2, 3, 1, 1, 2, 1],
        'qualify': ['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes']}
df = pd.DataFrame(exam_data)
print("Original DataFrame:")
print(df)
df = df.sample(frac=1)
print("\nNew DataFrame:")
print(df)

Sample Output:

Original DataFrame:
   attempts       name qualify  score
0         1  Anastasia     yes   12.5
1         3       Dima      no    9.0
2         2  Katherine     yes   16.5
3         3      James      no    NaN
4         2      Emily      no    9.0
5         3    Michael     yes   20.0
6         1    Matthew     yes   14.5
7         1      Laura      no    NaN
8         2      Kevin      no    8.0
9         1      Jonas     yes   19.0

New DataFrame:
   attempts       name qualify  score
5         3    Michael     yes   20.0
0         1  Anastasia     yes   12.5
9         1      Jonas     yes   19.0
6         1    Matthew     yes   14.5
7         1      Laura      no    NaN
1         3       Dima      no    9.0
3         3      James      no    NaN
4         2      Emily      no    9.0
8         2      Kevin      no    8.0
2         2  Katherine     yes   16.5               

Explanation:

The said code first creates a Pandas DataFrame df from the dictionary ‘exam_data’. The DataFrame contains information about students' names, scores, number of attempts and whether they qualify or not.

df = df.sample(frac=1): This code shuffles the rows of the Pandas DataFrame df randomly using the sample method with frac=1, which means to sample all rows. It essentially reorders the rows of the DataFrame randomly.

The original DataFrame is ‘exam_data’. The DataFrame has 4 columns, namely name, score, attempts, and qualify. Each column has 10 elements. The sample method is used to shuffle the rows of this DataFrame in a random order.

Python-Pandas Code Editor:

Have another way to solve this solution? Contribute your code (and comments) through Disqus.

Previous: Write a Pandas program to combining two series into a DataFrame.
Next: Write a Pandas program to convert DataFrame column type from string to datetime.

What is the difficulty level of this exercise?

Test your Programming skills with w3resource's quiz.



Follow us on Facebook and Twitter for latest update.