Pandas: Divide a DataFrame in a given ratio

Last update on September 05 2025 12:40:02 (UTC/GMT +8 hours)

38. Divide DataFrame by Ratio

Write a Pandas program to divide a DataFrame in a given ratio.

Sample data:
Original DataFrame:
          0         1
0  0.316147 -0.767359
1 -0.813410 -2.522672
2  0.869615  1.194704
3 -0.892915 -0.055133
4 -0.341126  0.518266
5  1.857342  1.361229
6 -0.044353 -1.205002
7 -0.726346 -0.535147
8 -1.350726  0.563117
9  1.051666 -0.441533
70% of the said DataFrame:
          0         1
8 -1.350726  0.563117
2  0.869615  1.194704
5  1.857342  1.361229
6 -0.044353 -1.205002
3 -0.892915 -0.055133
1 -0.813410 -2.522672
0  0.316147 -0.767359
30% of the said DataFrame:
          0         1
4 -0.341126  0.518266
7 -0.726346 -0.535147
9  1.051666 -0.441533

Sample Solution:

Python Code:

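import pandas as pd
import numpy as np

# Build a 10x2 DataFrame of standard-normal random values.
df = pd.DataFrame(np.random.randn(10, 2))
print("Original DataFrame:")
print(df)

# Randomly sample 70% of the rows; random_state fixes which rows are picked.
part_70 = df.sample(frac=0.7, random_state=10)
# The remaining 30% are the rows whose index labels are not in part_70.
part_30 = df.drop(part_70.index)

print("\n70% of the said DataFrame:")
print(part_70)
print("\n30% of the said DataFrame:")
print(part_30)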

Sample Output:

Original DataFrame:
          0         1
0  0.316147 -0.767359
1 -0.813410 -2.522672
2  0.869615  1.194704
3 -0.892915 -0.055133
4 -0.341126  0.518266
5  1.857342  1.361229
6 -0.044353 -1.205002
7 -0.726346 -0.535147
8 -1.350726  0.563117
9  1.051666 -0.441533

70% of the said DataFrame:
          0         1
8 -1.350726  0.563117
2  0.869615  1.194704
5  1.857342  1.361229
6 -0.044353 -1.205002
3 -0.892915 -0.055133
1 -0.813410 -2.522672
0  0.316147 -0.767359

30% of the said DataFrame:
          0         1
4 -0.341126  0.518266
7 -0.726346 -0.535147
9  1.051666 -0.441533

Explanation:

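The above code first generates a Pandas DataFrame 'df' with 10 rows and 2 columns, filled with random values that np.random.randn draws from the standard normal distribution. Because the generator is not seeded, the values differ from run to run.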
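If the original data should also be identical on every run, one option is to seed the generator. The following is a minimal sketch and not part of the sample solution; the seed value 0 and the name rng are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Assumption: seed 0 is an arbitrary choice made for this sketch.
rng = np.random.default_rng(0)

# standard_normal draws from the standard normal distribution,
# the same distribution np.random.randn samples from.
df = pd.DataFrame(rng.standard_normal((10, 2)))
print(df.shape)  # (10, 2)
```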

part_70 = df.sample(frac=0.7,random_state=10): This code creates a new DataFrame 'part_70' by randomly sampling 70% of the rows from 'df' with the sample method. The 'frac' parameter specifies the fraction of rows to sample; with 10 rows and frac=0.7, that is 7 rows. The 'random_state' parameter seeds the sampling, so the same rows are selected, in the same order, each time the code is run with the same value.
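The effect of random_state can be checked directly by sampling twice with the same value. This is a small sketch; the integer data and the names first and second are illustrative assumptions, not part of the sample solution.

```python
import numpy as np
import pandas as pd

# Illustrative data: 10 rows, 2 columns of consecutive integers.
df = pd.DataFrame(np.arange(20).reshape(10, 2))

# Two independent calls with the same random_state.
first = df.sample(frac=0.7, random_state=10)
second = df.sample(frac=0.7, random_state=10)

# Both calls select the same index labels in the same order.
print(first.index.equals(second.index))  # True
print(len(first))                        # 7
```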

part_30 = df.drop(part_70.index): This code creates another DataFrame 'part_30' by dropping the rows in 'part_70' from 'df'. The drop method is called on 'df' with the index labels of the rows to remove, which are read from the index attribute of 'part_70'. The resulting DataFrame 'part_30' contains the remaining 30% of the rows from 'df'. Note that drop works on index labels, so this approach relies on the labels in 'df' being unique.
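A quick way to confirm that the two parts are disjoint and together cover the whole DataFrame is sketched below. The second half shows one possible position-based split for a DataFrame whose index contains repeated labels; the names dup, positions and cut, and the use of np.random.default_rng, are assumptions made for this sketch rather than part of the sample solution.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 2))
part_70 = df.sample(frac=0.7, random_state=10)
part_30 = df.drop(part_70.index)

# The two parts share no index label and together account for every row.
print(part_70.index.intersection(part_30.index).empty)  # True
print(len(part_70), len(part_30))                       # 7 3

# Hypothetical DataFrame with repeated index labels: dropping by label
# would remove every row carrying that label, so split by position instead.
dup = pd.DataFrame({"x": range(10)}, index=[0, 0, 1, 1, 2, 2, 3, 3, 4, 4])
positions = np.random.default_rng(10).permutation(len(dup))
cut = int(round(0.7 * len(dup)))
dup_70 = dup.iloc[positions[:cut]]
dup_30 = dup.iloc[positions[cut:]]
print(len(dup_70), len(dup_30))                         # 7 3
```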

For more Practice: Solve these Related Problems:

Write a Pandas program to split a DataFrame into two parts in a 70:30 ratio and then output the row counts for each part.
Write a Pandas program to randomly divide a DataFrame into two subsets based on a given ratio and then reset their indices.
Write a Pandas program to partition a DataFrame into training and testing sets using a specified ratio and then verify the split.
Write a Pandas program to divide a DataFrame by a given ratio and then export each partition to a separate CSV file.
