w3resource

Pandas Practice Set-1: Read the diamonds DataFrame and detect duplicate color


Write a Pandas program to read the diamonds DataFrame and detect duplicate values in the color column.

Note: The duplicated() method returns a boolean Series marking duplicate rows, optionally considering only certain columns. By default (keep='first'), the first occurrence of each value is not marked as a duplicate.
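
To make the note concrete, here is a minimal sketch on a small hand-made DataFrame. The toy data and variable names are assumptions for illustration and are not taken from the diamonds dataset. It shows how the keep parameter changes which rows are flagged.

```python
import pandas as pd

# Toy data, made up for illustration (not taken from the diamonds dataset)
toy = pd.DataFrame({"color": ["E", "E", "I", "J", "I", "E"]})

# keep='first' (the default): every occurrence after the first is flagged
first_flags = toy["color"].duplicated()

# keep=False: every value that occurs more than once is flagged,
# including its first occurrence
all_flags = toy["color"].duplicated(keep=False)

print(first_flags.tolist())
print(all_flags.tolist())

# Use the boolean Series as a mask to inspect the flagged rows
print(toy[all_flags])
```

Passing keep=False is useful when you want to see every row that shares a value, rather than only the repeats.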

Sample Solution:

Python Code:

import pandas as pd

# Load the diamonds dataset from the seaborn-data repository
diamonds = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/diamonds.csv')
print("Original Dataframe:")
print(diamonds.shape)
print("\nCount the duplicate items:")
# Flag every repeat of a color value after its first occurrence, then count the flags
print(diamonds.color.duplicated().sum())

Sample Output:

Original Dataframe:
(53940, 10)

Count the duplicate items:
53933
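
The large count in the sample output follows from how duplicated() works with its default keep='first': only the first occurrence of each distinct value is left unflagged, so the number of flagged rows equals the row count minus the number of distinct values. A small sketch on made-up data illustrates this (the helper name count_duplicates and the toy values are hypothetical):

```python
import pandas as pd


def count_duplicates(series: pd.Series) -> int:
    """Count values flagged by duplicated() with the default keep='first'."""
    return int(series.duplicated().sum())


# Toy data, made up for illustration (not taken from the diamonds dataset)
toy = pd.Series(["D", "E", "E", "F", "D", "E"], name="color")

dup_count = count_duplicates(toy)

# With no missing values, flagged rows = total rows - distinct values
assert dup_count == len(toy) - toy.nunique()

print(dup_count)
# value_counts() shows how often each distinct value occurs
print(toy.value_counts())
```

For a column with only a handful of distinct categories, nearly every row is therefore a "duplicate" in this sense, and value_counts() is often the more informative summary.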
