NLTK corpus: Extract the last letter of all the labeled names and create a new array with the last letter of each name
Write a Python NLTK program to extract the last letter of all the labeled names and create a new array with the last letter of each name and the associated label.
Sample Solution:
Python Code :
from nltk.corpus import names
import random
male_names = names.words('male.txt')
female_names = names.words('female.txt')
labeled_male_names = [(str(name), 'male') for name in male_names]
labeled_female_names = [(str(name), 'female') for name in female_names]
# combine labeled male and labeled female names
all_labeled_names = labeled_male_names + labeled_female_names
feature_set = [(name[-1], gender) for (name, gender) in all_labeled_names]
print("\nFirst 15 labeled names:")
print((all_labeled_names[:15]))
print("\nLast letter of all the labeled names with the associated label:")
print((feature_set[:15]))
Sample Output:
First 15 labeled names: [('Aamir', 'male'), ('Aaron', 'male'), ('Abbey', 'male'), ('Abbie', 'male'), ('Abbot', 'male'), ('Abbott', 'male'), ('Abby', 'male'), ('Abdel', 'male'), ('Abdul', 'male'), ('Abdulkarim', 'male'), ('Abdullah', 'male'), ('Abe', 'male'), ('Abel', 'male'), ('Abelard', 'male'), ('Abner', 'male')] Last letter of all the labeled names with the associated label: [('r', 'male'), ('n', 'male'), ('y', 'male'), ('e', 'male'), ('t', 'male'), ('t', 'male'), ('y', 'male'), ('l', 'male'), ('l', 'male'), ('m', 'male'), ('h', 'male'), ('e', 'male'), ('l', 'male'), ('d', 'male'), ('r', 'male')]
Have another way to solve this solution? Contribute your code (and comments) through Disqus.
What is the difficulty level of this exercise?
Test your Programming skills with w3resource's quiz.
- Weekly Trends and Language Statistics
- Weekly Trends and Language Statistics