NLTK corpus: Find the number of male and female names in the names corpus
Write a Python NLTK program to find the number of male and female names in the names corpus. Print the first 10 male and female names.
Note: The names corpus contains 2943 male names (male.txt) and 5001 female names (female.txt). It was compiled by Mark Kantrowitz, with additions by Bill Ross.
Sample Solution:
Python Code:
# Requires the corpus data: run nltk.download('names') once beforehand.
from nltk.corpus import names

# Load each word list once and reuse it for both the count and the slice.
male_names = names.words('male.txt')
female_names = names.words('female.txt')
print("\nNumber of male names:")
print(len(male_names))
print("\nNumber of female names:")
print(len(female_names))
# Slice [:10] so the output matches the "First 10" labels.
print("\nFirst 10 male names:")
print(male_names[:10])
print("\nFirst 10 female names:")
print(female_names[:10])
Sample Output:
Number of male names:
2943

Number of female names:
5001

First 10 male names:
['Aamir', 'Aaron', 'Abbey', 'Abbie', 'Abbot', 'Abbott', 'Abby', 'Abdel', 'Abdul', 'Abdulkarim']

First 10 female names:
['Abagael', 'Abagail', 'Abbe', 'Abbey', 'Abbi', 'Abbie', 'Abby', 'Abigael', 'Abigail', 'Abigale']
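The count-and-preview steps in the solution are the same for both files, so they can be folded into one small helper. The sketch below is only an illustration: `summarize_names` and the `sample` dictionary are hypothetical names, not part of NLTK, and the stand-in lists are a few names taken from the sample output so that the block runs without the corpus installed.

```python
def summarize_names(word_lists, n=10):
    """Return {label: (total count, first n names)} for each list of names."""
    summary = {}
    for label, words in word_lists.items():
        words = list(words)  # accept any iterable, e.g. an NLTK corpus view
        summary[label] = (len(words), words[:n])
    return summary


# Stand-in data (assumption): a few names copied from the sample output.
# With NLTK installed and the names corpus downloaded, one could instead pass
# {fid: names.words(fid) for fid in names.fileids()}.
sample = {
    "male.txt": ["Aamir", "Aaron", "Abbey", "Abbie", "Abbot"],
    "female.txt": ["Abagael", "Abagail", "Abbe", "Abbey", "Abbi"],
}

for label, (total, head) in summarize_names(sample, n=3).items():
    print(label, total, head)
```

Passing a dictionary keyed by file name keeps the helper independent of NLTK, so the same function works for the real corpus or for any other lists of names.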