Pandas Practice Set-1: Calculate various summary statistics of the carat Series of the diamonds DataFrame
Write a Pandas program to calculate various summary statistics of the 'carat' Series of the diamonds DataFrame.
Sample Solution:
Python Code:
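import pandas as pd

# Load the diamonds dataset from the seaborn-data repository.
diamonds = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/diamonds.csv')
print("Original Dataframe:")
print(diamonds.head())
print("\nVarious summary statistics of diamonds DataFrame:")
# On a numeric Series, describe() returns count, mean, std, min, quartiles and max.
print(diamonds.carat.describe())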
Sample Output:
Original Dataframe:
   carat      cut color clarity  depth  table  price     x     y     z
0   0.23    Ideal     E     SI2   61.5   55.0    326  3.95  3.98  2.43
1   0.21  Premium     E     SI1   59.8   61.0    326  3.89  3.84  2.31
2   0.23     Good     E     VS1   56.9   65.0    327  4.05  4.07  2.31
3   0.29  Premium     I     VS2   62.4   58.0    334  4.20  4.23  2.63
4   0.31     Good     J     SI2   63.3   58.0    335  4.34  4.35  2.75

Various summary statistics of diamonds DataFrame:
count    53940.000000
mean         0.797940
std          0.474011
min          0.200000
25%          0.400000
50%          0.700000
75%          1.040000
max          5.010000
Name: carat, dtype: float64
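The describe() method reports different statistics depending on the column type. The following is a minimal sketch, not part of the original solution, that contrasts a numeric column (carat) with a text column (cut). To run without a network connection it assumes a small hand-made DataFrame named sample, built from the five rows shown in the sample output; the name sample and the reduced column set are illustrative only.

```python
import pandas as pd

# Hypothetical five-row sample copied from the head() output above,
# so the sketch runs without downloading the full CSV.
sample = pd.DataFrame({
    "carat": [0.23, 0.21, 0.23, 0.29, 0.31],
    "cut": ["Ideal", "Premium", "Good", "Premium", "Good"],
})

# Numeric Series: count, mean, std, min, quartiles and max.
carat_stats = sample["carat"].describe()

# Text Series: count, number of unique values, most frequent value (top)
# and how often it occurs (freq).
cut_stats = sample["cut"].describe()

# Frequency of each cut category, most common first.
cut_counts = sample["cut"].value_counts()

print(carat_stats)
print(cut_stats)
print(cut_counts)
```

On the full dataset the same calls would be diamonds.carat.describe(), diamonds.cut.describe() and diamonds.cut.value_counts(). For a text column, value_counts() is often the more informative summary, since describe() only names the single most frequent category.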