w3resource

Pandas - Removing duplicate rows in a DataFrame using drop_duplicates()


Pandas: Data Cleaning and Preprocessing Exercise-4 with Solution


Write a Pandas program to remove duplicate rows from a DataFrame.

This exercise demonstrates how to remove duplicate rows from a DataFrame using drop_duplicates().

Sample Solution:

Code:

import pandas as pd

# Create a sample DataFrame with duplicate rows
df = pd.DataFrame({
    'Name': ['David', 'Annabel', 'Charlie', 'David'],
    'Age': [25, 30, 22, 25],
    'Salary': [50000, 60000, 70000, 50000]
})

# Remove duplicate rows from the DataFrame
df_no_duplicates = df.drop_duplicates()

# Output the result
print(df_no_duplicates)

Output:

      Name  Age  Salary
0    David   25   50000
1  Annabel   30   60000
2  Charlie   22   70000

Explanation:

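  • Created a DataFrame in which the last row (index 3) repeats the first row (index 0).
  • Called drop_duplicates(), which by default compares all columns and keeps the first occurrence of each duplicated row.
  • Stored the returned DataFrame in df_no_duplicates. The original df is unchanged, and the remaining rows keep their original index labels (0, 1, 2).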
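
The solution relies on the defaults of drop_duplicates(). As a minimal sketch of the other common options, the example below reuses the same sample data and shows duplicated() for inspecting duplicates first, plus the subset, keep, and ignore_index parameters. The variable names (dup_mask, last_by_name, renumbered, no_copies) are illustrative only and are not part of the original exercise.

```python
import pandas as pd

# Same sample data as in the solution above
df = pd.DataFrame({
    'Name': ['David', 'Annabel', 'Charlie', 'David'],
    'Age': [25, 30, 22, 25],
    'Salary': [50000, 60000, 70000, 50000]
})

# Boolean mask: True for each row that repeats an earlier row
dup_mask = df.duplicated()
print(dup_mask.sum())  # number of rows that would be dropped

# Compare only the 'Name' column and keep the last occurrence of each name
last_by_name = df.drop_duplicates(subset=['Name'], keep='last')

# Drop duplicates and renumber the index from 0
renumbered = df.drop_duplicates(ignore_index=True)

# keep=False drops every row that has a duplicate, including the first copy
no_copies = df.drop_duplicates(keep=False)
print(no_copies)
```

Checking duplicated() before dropping is a cheap way to see how many rows will be removed. Using subset is worth considering when only some columns define what counts as a duplicate.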
