Pandas: Remove the HTML tags within the specified column of a given DataFrame

Last update on September 09 2025 12:40:21 (UTC/GMT +8 hours)

41. Remove Tags from Column

Write a Pandas program to remove the HTML tags from a specified column of a given DataFrame.

Sample Solution:

Python Code:

import re

import pandas as pd

df = pd.DataFrame({
    'company_code': ['Abcd','EFGF', 'zefsalf', 'sdfslew', 'zekfsdf'],
    'date_of_sale': ['12/05/2002','16/02/1999','05/09/1998','12/02/2022','15/09/1997'],
    'address': ['9910 Surrey <b>Avenue</b>','92 N. Bishop Avenue','9910 <br>Golden Star Avenue', '102 Dunbar <i></i>St.', '17 West Livingston Court']
})
print("Original DataFrame:")
print(df)


def remove_tags(string):
    # Non-greedy match: delete each '<...>' span and keep the text between tags.
    return re.sub(r'<.*?>', '', string)


# Apply remove_tags to every value in 'address' and store the result in a new column.
df['with_out_tags'] = df['address'].apply(remove_tags)
print("\nSentences without tags':")
print(df)

Sample Output:

Original DataFrame:
  company_code             ...                                   address
0         Abcd             ...                 9910 Surrey <b>Avenue</b>
1         EFGF             ...                       92 N. Bishop Avenue
2      zefsalf             ...               9910 <br>Golden Star Avenue
3      sdfslew             ...                     102 Dunbar <i></i>St.
4      zekfsdf             ...                  17 West Livingston Court

[5 rows x 3 columns]

Sentences without tags':
  company_code            ...                        with_out_tags
0         Abcd            ...                   9910 Surrey Avenue
1         EFGF            ...                  92 N. Bishop Avenue
2      zefsalf            ...              9910 Golden Star Avenue
3      sdfslew            ...                       102 Dunbar St.
4      zekfsdf            ...             17 West Livingston Court

[5 rows x 4 columns]
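
The same cleanup can also be sketched without a Python-level helper, using the vectorized string accessor that pandas provides. The example below is a minimal sketch, not part of the original solution: it rebuilds only the 'address' column from the sample data and reuses the same '<.*?>' pattern.

```python
import pandas as pd

# Sample frame mirroring the 'address' column of the exercise data.
df = pd.DataFrame({
    'address': ['9910 Surrey <b>Avenue</b>',
                '92 N. Bishop Avenue',
                '9910 <br>Golden Star Avenue',
                '102 Dunbar <i></i>St.',
                '17 West Livingston Court']
})

# regex=True tells pandas to treat the pattern as a regular expression;
# '<.*?>' lazily matches each '<...>' span, so text between tags is kept.
df['with_out_tags'] = df['address'].str.replace(r'<.*?>', '', regex=True)
print(df['with_out_tags'].tolist())
```

One practical difference: missing values (NaN) pass through Series.str.replace unchanged, whereas calling re.sub on them through apply raises a TypeError.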

For more practice, solve these related problems:

Write a Pandas program to remove tags from a string column using regex and then output the cleaned text.
Write a Pandas program to strip all HTML elements from a specified column and then verify the absence of any tags.
Write a Pandas program to clean a DataFrame column by removing HTML markup and then create a new column with plain text.
Write a Pandas program to apply a function that removes tags from text entries in a column and then output the modified DataFrame.
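
As a starting point for the verification step mentioned in these problems, one possible check uses Series.str.contains with the same '<.*?>' pattern. This is only a sketch: the helper name has_tags and the two small Series are hypothetical and chosen for illustration.

```python
import pandas as pd

# Hypothetical inputs: one Series that still has markup, one already cleaned.
raw = pd.Series(['9910 Surrey <b>Avenue</b>', '92 N. Bishop Avenue'])
cleaned = pd.Series(['9910 Surrey Avenue', '102 Dunbar St.'])


def has_tags(series):
    # Hypothetical helper: True for each entry that still contains a '<...>' span.
    return series.str.contains(r'<.*?>', regex=True)


print(has_tags(raw).tolist())    # per-row flags for the raw column
print(has_tags(cleaned).any())   # whether any cleaned entry still has a tag
```

Note that this check is as permissive as the removal pattern itself: any text between '<' and '>' counts as a tag, including non-HTML uses of angle brackets.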
