Pandas: Select random number of rows, fraction of random rows
3. Random Sampling of Rows
Write a Pandas program to select random number of rows, fraction of random rows from World alcohol consumption dataset.
Test Data:
Year WHO region Country Beverage Types Display Value 0 1986 Western Pacific Viet Nam Wine 0.00 1 1986 Americas Uruguay Other 0.50 2 1985 Africa Cte d'Ivoire Wine 1.62 3 1986 Americas Colombia Beer 4.27 4 1987 Americas Saint Kitts and Nevis Beer 1.98
Sample Solution:
Python Code :
import pandas as pd
# World alcohol consumption data
w_a_con = pd.read_csv('world_alcohol.csv')
print("World alcohol consumption sample data:")
print(w_a_con.head())
print("\nSelect random number of rows:")
print(w_a_con.sample(5))
print("\nSelect fraction of randome rows:")
print(w_a_con.sample(frac=0.02))
Sample Output:
World alcohol consumption sample data:
Year WHO region ... Beverage Types Display Value
0 1986 Western Pacific ... Wine 0.00
1 1986 Americas ... Other 0.50
2 1985 Africa ... Wine 1.62
3 1986 Americas ... Beer 4.27
4 1987 Americas ... Beer 1.98
[5 rows x 5 columns]
Select random number of rows:
Year WHO region ... Beverage Types Display Value
65 1989 Eastern Mediterranean ... Beer 0.00
81 1985 Europe ... Wine 2.54
36 1987 Eastern Mediterranean ... Beer 0.07
0 1986 Western Pacific ... Wine 0.00
77 1985 Africa ... Spirits 0.01
[5 rows x 5 columns]
Select fraction of randome rows:
Year WHO region ... Beverage Types Display Value
80 1985 Africa ... Other 0.84
65 1989 Eastern Mediterranean ... Beer 0.00
[2 rows x 5 columns]
Click to download world_alcohol.csv
For more Practice: Solve these Related Problems:
- Write a Pandas program to randomly select a fraction (e.g., 30%) of rows from the dataset with a fixed random seed.
- Write a Pandas program to perform a random row sampling that returns exactly five rows and then reset the index.
- Write a Pandas program to randomly sample rows with replacement and then identify duplicate indices in the result.
- Write a Pandas program to randomly select a specified number of rows and compare the sampled subset’s shape with the original dataset.
Go to:
PREV : Row and Column Slicing.
NEXT :
Missing Value Handling.
Python Code Editor:
Have another way to solve this solution? Contribute your code (and comments) through Disqus.
What is the difficulty level of this exercise?
Test your Programming skills with w3resource's quiz.
