w3resource

Pandas: Split a given DataFrame into two random subsets


Write a Pandas program to split a given DataFrame into two random subsets.

Sample Solution:

Python Code:

import pandas as pd

# Build a small example DataFrame (all columns hold strings)
df = pd.DataFrame({
    'name': ['Alberto Franco', 'Gino Mcneill', 'Ryan Parkes', 'Eesha Hinton', 'Syed Wharton'],
    'date_of_birth': ['17/05/2002', '16/02/1999', '25/09/1998', '11/05/2002', '15/09/1997'],
    'age': ['18', '21', '22', '22', '23']
})

# Randomly pick 60% of the rows (3 of 5) for the first subset
df_1 = df.sample(frac=0.6)
# The second subset is every row whose index label was not sampled
df_2 = df.drop(df_1.index)

print("Original Dataframe and shape:")
print(df)
print(df.shape)
print("\nSubset-1 and shape:")
print(df_1)
print(df_1.shape)
print("\nSubset-2 and shape:")
print(df_2)
print(df_2.shape)

Sample Output (the rows in each subset vary between runs because the sample is random):

Original Dataframe and shape:
             name date_of_birth age
0  Alberto Franco    17/05/2002  18
1    Gino Mcneill    16/02/1999  21
2     Ryan Parkes    25/09/1998  22
3    Eesha Hinton    11/05/2002  22
4    Syed Wharton    15/09/1997  23
(5, 3)

Subset-1 and shape:
           name date_of_birth age
1  Gino Mcneill    16/02/1999  21
4  Syed Wharton    15/09/1997  23
2   Ryan Parkes    25/09/1998  22
(3, 3)

Subset-2 and shape:
             name date_of_birth age
0  Alberto Franco    17/05/2002  18
3    Eesha Hinton    11/05/2002  22
(2, 3)
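
The solution above returns a different split on every run, and df.drop(df_1.index) removes rows by index label, so it assumes the labels are unique. The following is a minimal sketch of one way to make the split repeatable, not part of the original solution: split_random is a hypothetical helper name, and the seed value 42 is an arbitrary assumption. It relies on the random_state parameter of DataFrame.sample and resets the index first so that every row has its own label.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['Alberto Franco', 'Gino Mcneill', 'Ryan Parkes', 'Eesha Hinton', 'Syed Wharton'],
    'age': ['18', '21', '22', '22', '23']
})

def split_random(frame, frac=0.6, seed=42):
    """Hypothetical helper: return two disjoint random subsets of frame."""
    # drop() works on index labels, so give every row a unique label first
    frame = frame.reset_index(drop=True)
    # A fixed random_state makes the same rows come back on every call
    first = frame.sample(frac=frac, random_state=seed)
    # Everything that was not sampled forms the second subset
    second = frame.drop(first.index)
    return first, second

df_1, df_2 = split_random(df)
print(df_1.shape, df_2.shape)  # (3, 2) (2, 2)
```

Calling split_random(df) again with the same seed should give the same two subsets; passing a different seed gives a different, but still repeatable, split.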
