Performance comparison of DataFrame sorting in Pandas
17. Sorting Performance: sort_values() vs. Custom apply()
Write a Pandas program to measure the time taken to sort a large DataFrame using the sort_values method vs. using a custom sorting function with apply.
Sample Solution:
Python Code:
Output:
Time taken using sort_values method: 0.075608 seconds
Time taken using apply method: 0.075997 seconds
Explanation:
- Import Libraries:
  - Import pandas, numpy, and time.
- Create DataFrame:
  - Generate a sample DataFrame with 1,000,000 rows with columns 'A' and 'B'.
- Time Measurement for sort_values Method:
  - Measure the time taken to sort the DataFrame using the sort_values method.
- Define Custom Sorting Function:
  - Define a custom function for sorting.
- Time Measurement for apply Method:
  - Measure the time taken to sort the DataFrame using the custom sorting function with the apply method.
- Finally print the time taken for each method.
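The steps above can be sketched roughly as follows. This is an illustrative reconstruction, not the page's original listing: the helper names make_frame, sort_with_apply, and time_call are hypothetical, and the value ranges, the random seed, and the way apply is used (building a sort key element by element, then reordering rows) are assumptions. Timings will vary by machine and pandas version, so no expected output is shown.

```python
import time

import numpy as np
import pandas as pd


def make_frame(n_rows=1_000_000, seed=0):
    """Build a sample DataFrame with columns 'A' and 'B' (value ranges are assumed)."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "A": rng.integers(0, 100, size=n_rows),
        "B": rng.standard_normal(n_rows),
    })


def sort_with_apply(df, column):
    """Hypothetical custom sort: build the sort key with apply, then reorder rows."""
    # apply calls the Python-level function once per element of the column.
    keys = df[column].apply(lambda value: value)
    # A stable argsort gives the row positions in ascending key order.
    order = np.argsort(keys.to_numpy(), kind="stable")
    return df.iloc[order]


def time_call(func):
    """Run a zero-argument callable and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func()
    return result, time.perf_counter() - start


df = make_frame()

# Built-in sort; kind="stable" so ties keep the same order as the custom sort.
sorted_builtin, t_builtin = time_call(lambda: df.sort_values(by="A", kind="stable"))

# Custom sort that routes the key through apply.
sorted_custom, t_custom = time_call(lambda: sort_with_apply(df, "A"))

print(f"Time taken using sort_values method: {t_builtin:.6f} seconds")
print(f"Time taken using apply method: {t_custom:.6f} seconds")
```

The sketch uses time.perf_counter rather than time.time because it is intended for measuring short durations. A single run is noisy, so repeating each measurement several times and taking the minimum would give a steadier comparison.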
For more Practice: Solve these Related Problems:
- Write a Pandas program to sort a large DataFrame using sort_values() and measure the execution time.
- Write a Pandas program to sort a DataFrame by a custom key using apply() and compare the performance with sort_values().
- Write a Pandas program to benchmark the sorting of a DataFrame using the built-in sort_values() versus a custom sorting function.
- Write a Pandas program to analyze the time taken for DataFrame sorting using vectorized sort_values() compared to iterative sorting.