Performance comparison of cumulative sum calculation in Pandas
Pandas: Performance Optimization Exercise-14 with Solution
Write a Pandas program to compare the performance of calculating the cumulative sum of a column using the "cumsum" method vs. using a "for" loop.
Sample Solution:
Python Code:
# Import necessary libraries
import pandas as pd
import numpy as np
import time
# Create a sample DataFrame
num_rows = 1000000
df = pd.DataFrame({'value': np.random.randn(num_rows)})
# Measure time for cumsum method
# (perf_counter is a monotonic, high-resolution clock intended for timing)
start_time = time.perf_counter()
cumsum_result = df['value'].cumsum()
end_time = time.perf_counter()
cumsum_time = end_time - start_time
# Measure time for for loop method
start_time = time.perf_counter()
cumsum_for_loop = np.zeros(num_rows)
cumsum_for_loop[0] = df['value'].iloc[0]
for i in range(1, num_rows):
    # Each element is the previous running total plus the current value
    cumsum_for_loop[i] = cumsum_for_loop[i-1] + df['value'].iloc[i]
end_time = time.perf_counter()
for_loop_time = end_time - start_time
# Print the time taken for each method
print(f"Time taken using cumsum method: {cumsum_time:.6f} seconds")
print(f"Time taken using for loop: {for_loop_time:.6f} seconds")
Output:
Time taken using cumsum method: 0.006079 seconds
Time taken using for loop: 8.145854 seconds
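A timing comparison is only meaningful if both approaches compute the same values. The following is a minimal sketch, not part of the original solution, that checks this on a smaller frame. The helper name `loop_cumsum`, the fixed random seed, and the row count of 10,000 are assumptions chosen for illustration, and the loop reads from a NumPy array rather than through `.iloc`.

```python
import numpy as np
import pandas as pd


def loop_cumsum(series):
    """Hypothetical helper: cumulative sum computed with an explicit for loop."""
    values = series.to_numpy()
    result = np.zeros(len(values))
    running_total = 0.0
    for i, v in enumerate(values):
        running_total += v
        result[i] = running_total
    return result


# Assumption: a smaller frame than the original 1,000,000 rows, so the check is quick
rng = np.random.default_rng(0)
df = pd.DataFrame({'value': rng.standard_normal(10_000)})

vectorized = df['value'].cumsum().to_numpy()
looped = loop_cumsum(df['value'])

# allclose tolerates tiny floating-point differences between the two results
print(np.allclose(vectorized, looped))
```

Comparing with `np.allclose` rather than exact equality is a deliberate choice here, since floating-point sums are not guaranteed to match bit for bit across implementations.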
Explanation:
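- Import Libraries: Import pandas, numpy, and time.
- Create DataFrame: Generate a sample DataFrame with 1,000,000 rows of random values.
- Time Measurement for cumsum Method: Measure the time taken to calculate the cumulative sum using the cumsum method, which runs as a single vectorized operation in compiled code.
- Time Measurement for for Loop: Measure the time taken to calculate the cumulative sum using a for loop, which pays Python interpreter and .iloc indexing overhead on every row.
- Print Results: Print the time taken for each method. In the sample output, cumsum was roughly 1,300 times faster (0.006079 vs. 8.145854 seconds); exact timings vary by machine.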
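A single start/stop measurement can be noisy, and it does not show where the loop's time goes. The sketch below is one possible refinement, under stated assumptions: it uses the standard library's `timeit.repeat` and keeps the best of three runs, reduces the row count to 20,000 so the loops finish quickly, and adds a third variant (the hypothetical `with_array_loop`) that iterates over a NumPy array to separate `.iloc` overhead from plain Python loop overhead. The function names are illustrative and do not appear in the original solution.

```python
import timeit

import numpy as np
import pandas as pd

# Assumption: far fewer rows than the original 1,000,000 to keep the run short
num_rows = 20_000
df = pd.DataFrame({'value': np.random.randn(num_rows)})


def with_cumsum():
    # Vectorized: the whole column is processed in compiled code
    return df['value'].cumsum()


def with_iloc_loop():
    # Mirrors the original loop: one .iloc lookup per row
    out = np.zeros(num_rows)
    total = 0.0
    for i in range(num_rows):
        total += df['value'].iloc[i]
        out[i] = total
    return out


def with_array_loop():
    # Same loop, but reading from a NumPy array extracted once up front
    values = df['value'].to_numpy()
    out = np.zeros(num_rows)
    total = 0.0
    for i in range(num_rows):
        total += values[i]
        out[i] = total
    return out


timings = {}
for name, func in [('cumsum', with_cumsum),
                   ('iloc loop', with_iloc_loop),
                   ('array loop', with_array_loop)]:
    # Best of three single runs; the minimum is least affected by background noise
    timings[name] = min(timeit.repeat(func, number=1, repeat=3))
    print(f"{name}: {timings[name]:.6f} seconds")
```

No expected timings are given because they depend on the machine and library versions. The array loop would typically be expected to land between the other two, which suggests that much of the for loop's cost comes from per-row pandas indexing rather than from the addition itself.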