w3resource

Pandas: Extract words starting with capital letters from a given column of a given DataFrame


40. Extract Words Starting with Capital Letters

Write a Pandas program to extract words starting with capital letters from a given column of a given DataFrame.

Sample Solution:

Python Code:

import pandas as pd
import re
df = pd.DataFrame({
    'company_code': ['Abcd','EFGF', 'zefsalf', 'sdfslew', 'zekfsdf'],
    'date_of_sale': ['12/05/2002','16/02/1999','05/09/1998','12/02/2022','15/09/1997'],
    'address': ['9910 Surrey Avenue','92 N. Bishop Avenue','9910 Golden Star Avenue', '102 Dunbar St.', '17 West Livingston Court']
})

print("Original DataFrame:")
print(df)

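def find_capital_word(str1):
    # Match words that start with an uppercase ASCII letter followed by at
    # least one more word character, so a lone initial such as "N" is skipped.
    return re.findall(r'\b[A-Z]\w+', str1)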

df['caps_word_in'] = df['address'].apply(find_capital_word)
print("\nWords starting with a capital letter in each address:")
print(df)

Sample Output:

Original DataFrame:
  company_code date_of_sale                   address
0         Abcd   12/05/2002        9910 Surrey Avenue
1         EFGF   16/02/1999       92 N. Bishop Avenue
2      zefsalf   05/09/1998   9910 Golden Star Avenue
3      sdfslew   12/02/2022            102 Dunbar St.
4      zekfsdf   15/09/1997  17 West Livingston Court

Words starting with a capital letter in each address:
  company_code            ...                           caps_word_in
0         Abcd            ...                       [Surrey, Avenue]
1         EFGF            ...                       [Bishop, Avenue]
2      zefsalf            ...                 [Golden, Star, Avenue]
3      sdfslew            ...                           [Dunbar, St]
4      zekfsdf            ...              [West, Livingston, Court]

[5 rows x 4 columns]
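
Note that row 1 of the output lists only Bishop and Avenue: the pattern \b[A-Z]\w+ requires at least two word characters, so the initial "N" is not returned. The following is a minimal sketch, not part of the original solution, that contrasts this pattern with a \w* variant and uses the vectorized Series.str.findall method instead of apply. The names addresses, two_plus and one_plus are illustrative assumptions.

```python
import pandas as pd

# Three of the sample addresses, enough to show the difference.
addresses = pd.Series([
    '9910 Surrey Avenue',
    '92 N. Bishop Avenue',
    '102 Dunbar St.',
])

# \w+ needs at least one more word character after the capital letter,
# so the lone initial "N" is skipped (same behaviour as the solution above).
two_plus = addresses.str.findall(r'\b[A-Z]\w+')

# \w* also accepts single-letter words such as "N".
one_plus = addresses.str.findall(r'\b[A-Z]\w*')

print(two_plus.tolist())
print(one_plus.tolist())
```

Which variant is appropriate depends on whether single-letter initials should count as words in your data.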

For more practice, solve these related problems:

  • Write a Pandas program to extract words that start with a capital letter from a DataFrame column using regex and then output them as a list.
  • Write a Pandas program to filter a text column for words beginning with uppercase letters and then join them into a new column.
  • Write a Pandas program to create a new series containing only the words starting with a capital letter from each row of a column.
  • Write a Pandas program to extract and count the occurrence of capitalized words in a column and then display the frequency distribution.
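
As a starting point for the joining and counting problems above, here is one possible sketch, assuming the same address data and regex as the sample solution. The column name caps_joined and the variable word_counts are illustrative choices, not names from the exercise.

```python
import pandas as pd

df = pd.DataFrame({
    'address': ['9910 Surrey Avenue', '92 N. Bishop Avenue',
                '9910 Golden Star Avenue', '102 Dunbar St.',
                '17 West Livingston Court']
})

# One list of capitalized words per row.
caps = df['address'].str.findall(r'\b[A-Z]\w+')

# Join each row's list into a single space-separated string.
df['caps_joined'] = caps.str.join(' ')

# Flatten the lists to one word per row, then count occurrences.
word_counts = caps.explode().value_counts()

print(df)
print(word_counts)
```

Here explode turns each list element into its own row, which lets value_counts build the frequency distribution across the whole column.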
