Merging DataFrames and removing duplicate rows in Pandas
Pandas: Custom Function Exercise-16 with Solution
Write a Pandas program to merge DataFrames and drop duplicates.
In this exercise, we merge two DataFrames and then remove any duplicate rows that may arise from the merge.
Sample Solution:
Code:
import pandas as pd
# Create two sample DataFrames with potential duplicates
df1 = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['Annabel', 'Selena', 'Caeso']
})
df2 = pd.DataFrame({
    'ID': [2, 3, 1],
    'Name': ['Selena', 'Caeso', 'Annabel'],
    'Age': [30, 22, 25]
})
# Merge the DataFrames on the 'ID' and 'Name' columns (inner join by default)
merged_df = pd.merge(df1, df2, on=['ID', 'Name'])
# Drop any duplicate rows from the merged DataFrame
merged_df_no_duplicates = merged_df.drop_duplicates()
# Output the result
print(merged_df_no_duplicates)
Output:
   ID     Name  Age
0   1  Annabel   25
1   2   Selena   30
2   3    Caeso   22
Explanation:
- Created two DataFrames df1 and df2 with overlapping data.
- Merged the DataFrames on the 'ID' and 'Name' columns.
- Removed any duplicate rows in the merged DataFrame using drop_duplicates(). With this sample data the merge produces no duplicate rows, so the result is unchanged.
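To see drop_duplicates() actually remove a row, here is a minimal sketch that assumes df2 contains a repeated record. The repeated row for ID 2 and the name deduped_df are hypothetical additions for illustration and are not part of the original exercise.

```python
import pandas as pd

# Same df1 as in the exercise
df1 = pd.DataFrame({
    'ID': [1, 2, 3],
    'Name': ['Annabel', 'Selena', 'Caeso']
})

# Hypothetical variant of df2: the row for ID 2 appears twice
df2 = pd.DataFrame({
    'ID': [2, 2, 3, 1],
    'Name': ['Selena', 'Selena', 'Caeso', 'Annabel'],
    'Age': [30, 30, 22, 25]
})

# The inner merge matches the single df1 row for ID 2 against both df2 rows,
# so the merged result contains two identical rows for Selena
merged_df = pd.merge(df1, df2, on=['ID', 'Name'])

# drop_duplicates() keeps the first of each set of identical rows;
# reset_index(drop=True) renumbers the remaining rows from 0
deduped_df = merged_df.drop_duplicates().reset_index(drop=True)

print(merged_df)
print(deduped_df)
```

By default drop_duplicates() compares all columns. If only some columns should decide whether two rows are duplicates, the subset parameter can be used, for example merged_df.drop_duplicates(subset=['ID']).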