Pandas: Remove the duplicates of a specific column in a given dataframe
Write a Pandas program to remove the duplicates from the 'WHO region' column of the World alcohol consumption dataset, keeping only the first row for each region.
Test Data:
   Year       WHO region                Country Beverage Types  Display Value
0  1986  Western Pacific               Viet Nam           Wine           0.00
1  1986         Americas                Uruguay          Other           0.50
2  1985           Africa           Cte d'Ivoire           Wine           1.62
3  1986         Americas               Colombia           Beer           4.27
4  1987         Americas  Saint Kitts and Nevis           Beer           1.98
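If world_alcohol.csv is not to hand, the five test rows can be typed in as a DataFrame and inspected directly. The sketch below is an illustration only: the variable names `sample` and `is_repeat` are assumptions, and the rows are entered by hand rather than read from the file. It uses `DataFrame.duplicated` to flag which rows repeat a 'WHO region' value that has already appeared.

```python
import pandas as pd

# The five test rows, typed in by hand (not read from world_alcohol.csv).
# Country names are spelled exactly as they appear in the test data.
sample = pd.DataFrame({
    "Year": [1986, 1986, 1985, 1986, 1987],
    "WHO region": ["Western Pacific", "Americas", "Africa", "Americas", "Americas"],
    "Country": ["Viet Nam", "Uruguay", "Cte d'Ivoire", "Colombia", "Saint Kitts and Nevis"],
    "Beverage Types": ["Wine", "Other", "Wine", "Beer", "Beer"],
    "Display Value": [0.00, 0.50, 1.62, 4.27, 1.98],
})

# True for every row whose 'WHO region' already appeared in an earlier row.
is_repeat = sample.duplicated(subset="WHO region")
print(is_repeat.tolist())  # [False, False, False, True, True]
```

Rows 3 and 4 are flagged because 'Americas' first occurs in row 1; these are the rows a de-duplication on 'WHO region' would discard.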
Sample Solution:
Python Code:
import pandas as pd
# World alcohol consumption data
w_a_con = pd.read_csv('world_alcohol.csv')
print("World alcohol consumption sample data:")
print(w_a_con.head())
print("\nAfter removing the duplicates of WHO region column:")
# Keep only the first row for each distinct 'WHO region' value
print(w_a_con.drop_duplicates(subset='WHO region'))
Sample Output:
World alcohol consumption sample data:
   Year       WHO region  ... Beverage Types  Display Value
0  1986  Western Pacific  ...           Wine           0.00
1  1986         Americas  ...          Other           0.50
2  1985           Africa  ...           Wine           1.62
3  1986         Americas  ...           Beer           4.27
4  1987         Americas  ...           Beer           1.98

[5 rows x 5 columns]

After removing the duplicates of WHO region column:
    Year             WHO region  ... Beverage Types  Display Value
0   1986        Western Pacific  ...           Wine           0.00
1   1986               Americas  ...          Other           0.50
2   1985                 Africa  ...           Wine           1.62
13  1984  Eastern Mediterranean  ...          Other           0.00
18  1984                 Europe  ...        Spirits           1.62
20  1986        South-East Asia  ...           Wine           0.00

[6 rows x 5 columns]
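The output keeps the first row seen for each region and preserves the original index labels (0, 1, 2, 13, 18, 20). As a minimal sketch of how that behaviour can be adjusted, the example below uses a small hand-made frame (an assumption for illustration, not the real dataset) to compare the `keep` and `ignore_index` options of `drop_duplicates`.

```python
import pandas as pd

# Hypothetical miniature of the dataset: only the two columns needed here.
df = pd.DataFrame({
    "Year": [1986, 1986, 1985, 1986, 1987],
    "WHO region": ["Western Pacific", "Americas", "Africa", "Americas", "Americas"],
})

# Default behaviour: keep the first row for each region.
first = df.drop_duplicates(subset="WHO region")
# Keep the last row for each region instead.
last = df.drop_duplicates(subset="WHO region", keep="last")
# Drop every region that occurs more than once.
unique_only = df.drop_duplicates(subset="WHO region", keep=False)
# Keep the first rows but renumber the index from 0.
renumbered = df.drop_duplicates(subset="WHO region", ignore_index=True)

print(first.index.tolist())        # [0, 1, 2]
print(last.index.tolist())         # [0, 2, 4]
print(unique_only.index.tolist())  # [0, 2]
print(renumbered.index.tolist())   # [0, 1, 2]
```

None of these calls modifies `df`; each returns a new DataFrame, so the result has to be assigned if it is needed later.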