Calculating cumulative sum in Pandas DataFrame with NumPy array
Calculate the cumulative sum of a NumPy array and store the results in a new Pandas DataFrame column.
Sample Solution:
Python Code:
import pandas as pd
import numpy as np
# Create a sample DataFrame
data = {'Values': [100, 200, 300, 400, 500]}
df = pd.DataFrame(data)
# Create a NumPy array from the 'Values' column
numpy_array = np.array(df['Values'])
# Calculate the cumulative sum and store in a new column 'Cumulative_Sum'
df['Cumulative_Sum'] = np.cumsum(numpy_array)
# Display the updated DataFrame
print(df)
Output:
   Values  Cumulative_Sum
0     100             100
1     200             300
2     300             600
3     400            1000
4     500            1500
Explanation:
In the exercise above:
- First we create a sample DataFrame (df) with a column 'Values'.
- Next we convert the 'Values' column to a NumPy array using np.array().
- The np.cumsum(numpy_array) function calculates the cumulative sum of the NumPy array.
- The result is assigned to a new column 'Cumulative_Sum' in the DataFrame.
- The updated DataFrame is then printed to the console.
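The steps above can also be cross-checked against pandas' own cumulative sum. The sketch below is illustrative only: it assumes the same sample data, and the names via_numpy, via_pandas and same are introduced here and are not part of the original solution. It uses Series.to_numpy() for the conversion and Series.cumsum() for the pandas-only variant.

```python
import numpy as np
import pandas as pd

# Same sample data as the solution above
df = pd.DataFrame({'Values': [100, 200, 300, 400, 500]})

# NumPy route: convert the column to an array, then take the running total
via_numpy = np.cumsum(df['Values'].to_numpy())

# pandas-only route: Series.cumsum() returns a Series aligned to df's index
via_pandas = df['Values'].cumsum()

# Store the NumPy result in a new column, as in the solution
df['Cumulative_Sum'] = via_numpy

# Check that both routes agree element by element
same = bool((via_pandas.to_numpy() == via_numpy).all())
print(same)
```

Both routes should agree on this data. The NumPy version fits when the values already live in an array, while Series.cumsum() avoids the round trip when the data starts in a DataFrame.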