NLTK corpus: Extract the last letter of all the labeled names and create a new array with the last letter of each name

Last update on December 21 2024 07:35:44 (UTC/GMT +8 hours)

Write a Python NLTK program that extracts the last letter of every labeled name and builds a new list pairing each last letter with the name's label.

Sample Solution:

Python Code:

from nltk.corpus import names

# Requires the names corpus; if it is missing, run: import nltk; nltk.download('names')
male_names = names.words('male.txt')
female_names = names.words('female.txt')

# Pair each name with its gender label
labeled_male_names = [(str(name), 'male') for name in male_names]
labeled_female_names = [(str(name), 'female') for name in female_names]

# Combine labeled male and labeled female names
all_labeled_names = labeled_male_names + labeled_female_names

# Build (last letter, label) pairs
feature_set = [(name[-1], gender) for (name, gender) in all_labeled_names]

print("\nFirst 15 labeled names:")
print(all_labeled_names[:15])
print("\nLast letter of all the labeled names with the associated label:")
print(feature_set[:15])

Sample Output:

First 15 labeled names:
[('Aamir', 'male'), ('Aaron', 'male'), ('Abbey', 'male'), ('Abbie', 'male'), ('Abbot', 'male'), ('Abbott', 'male'), ('Abby', 'male'), ('Abdel', 'male'), ('Abdul', 'male'), ('Abdulkarim', 'male'), ('Abdullah', 'male'), ('Abe', 'male'), ('Abel', 'male'), ('Abelard', 'male'), ('Abner', 'male')]

Last letter of all the labeled names with the associated label:
[('r', 'male'), ('n', 'male'), ('y', 'male'), ('e', 'male'), ('t', 'male'), ('t', 'male'), ('y', 'male'), ('l', 'male'), ('l', 'male'), ('m', 'male'), ('h', 'male'), ('e', 'male'), ('l', 'male'), ('d', 'male'), ('r', 'male')]
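
The same extraction can be tried without downloading the corpus. The sketch below is only an illustration: the helper name last_letter_features and the small sample list are assumptions introduced here (the sample pairs are copied from the output above), and collections.Counter is used to tally how often each (last letter, label) pair occurs.

```python
from collections import Counter


def last_letter_features(labeled_names):
    """Return (last_letter, label) pairs, skipping empty names.

    Hypothetical helper, not part of NLTK.
    """
    return [(name[-1], label) for name, label in labeled_names if name]


# Small sample copied from the first entries of the sample output
sample = [
    ('Aamir', 'male'), ('Aaron', 'male'), ('Abbey', 'male'), ('Abbie', 'male'),
    ('Abbot', 'male'), ('Abbott', 'male'), ('Abby', 'male'), ('Abdel', 'male'),
]

features = last_letter_features(sample)
print(features)

# Tally how often each (last letter, label) pair occurs in the sample
counts = Counter(features)
print(counts[('t', 'male')], counts[('y', 'male')])
```

With the full corpus, all_labeled_names from the solution can be passed to the helper in place of the sample list.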

