w3resource

Pandas: Remove the duplicates of a specific column in a given dataframe


Write a Pandas program to remove duplicate values from the 'WHO region' column of the World alcohol consumption dataset, keeping one row per region.

Test Data:

   Year       WHO region                Country Beverage Types  Display Value
0  1986  Western Pacific               Viet Nam           Wine           0.00
1  1986         Americas                Uruguay          Other           0.50
2  1985           Africa          Côte d'Ivoire           Wine           1.62
3  1986         Americas               Colombia           Beer           4.27
4  1987         Americas  Saint Kitts and Nevis           Beer           1.98   

Sample Solution:

Python Code:

import pandas as pd

# Load the World alcohol consumption data
w_a_con = pd.read_csv('world_alcohol.csv')
print("World alcohol consumption sample data:")
print(w_a_con.head())

# Keep only the first row for each distinct 'WHO region' value
print("\nAfter removing the duplicates of WHO region column:")
print(w_a_con.drop_duplicates(subset='WHO region'))

Sample Output:

World alcohol consumption sample data:
   Year       WHO region      ...      Beverage Types Display Value
0  1986  Western Pacific      ...                Wine          0.00
1  1986         Americas      ...               Other          0.50
2  1985           Africa      ...                Wine          1.62
3  1986         Americas      ...                Beer          4.27
4  1987         Americas      ...                Beer          1.98

[5 rows x 5 columns]

After removing the duplicates of WHO region column:
    Year             WHO region      ...      Beverage Types Display Value
0   1986        Western Pacific      ...                Wine          0.00
1   1986               Americas      ...               Other          0.50
2   1985                 Africa      ...                Wine          1.62
13  1984  Eastern Mediterranean      ...               Other          0.00
18  1984                 Europe      ...             Spirits          1.62
20  1986        South-East Asia      ...                Wine          0.00

[6 rows x 5 columns]
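
Note that the output above keeps the first row found for each region, which is the default behaviour of drop_duplicates (keep='first'). The sketch below is a minimal illustration of the other keep options. It rebuilds only the five test rows inline instead of reading world_alcohol.csv, so the variable names (sample, first, last, none_kept) are assumptions made for this example and not part of the original solution.

```python
import pandas as pd

# The five test rows, built inline so no CSV file is needed (illustrative only).
sample = pd.DataFrame({
    "Year": [1986, 1986, 1985, 1986, 1987],
    "WHO region": ["Western Pacific", "Americas", "Africa", "Americas", "Americas"],
    "Country": ["Viet Nam", "Uruguay", "Côte d'Ivoire", "Colombia", "Saint Kitts and Nevis"],
    "Beverage Types": ["Wine", "Other", "Wine", "Beer", "Beer"],
    "Display Value": [0.00, 0.50, 1.62, 4.27, 1.98],
})

# keep="first" (the default): keep the first row seen for each region.
first = sample.drop_duplicates(subset="WHO region", keep="first")

# keep="last": keep the last row seen for each region instead.
last = sample.drop_duplicates(subset="WHO region", keep="last")

# keep=False: drop every row whose region appears more than once.
none_kept = sample.drop_duplicates(subset="WHO region", keep=False)

print(first.index.tolist())      # [0, 1, 2]
print(last.index.tolist())       # [0, 2, 4]
print(none_kept.index.tolist())  # [0, 2]
```

The original index labels are preserved in each result, which is why the full-dataset output above shows labels such as 13, 18 and 20. Chaining .reset_index(drop=True) would renumber the rows from 0 if that is preferred.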

