NLTK Tokenize: Find parenthesized expressions in a given string and divide it into a sequence of substrings
NLTK Tokenize: Exercise-9 with Solution
Write a Python NLTK program to find parenthesized expressions in a given string and divide the string into a sequence of substrings.
Sample Solution:
Python Code:
from nltk.tokenize import SExprTokenizer

# With default settings, only '(' and ')' are treated as delimiters.
tokenizer = SExprTokenizer()

samples = [
    '(a b (c d)) e f (g)',    # nested parentheses
    '(a b) (c d) e (f g)',    # sibling groups
    '[(a b (c d)) e f (g)]',  # square brackets are ordinary characters
    '{a b {c d}} e f {g}',    # braces are ordinary characters too
]

for text in samples:
    print("\nOriginal string:")
    print(text)
    print(tokenizer.tokenize(text))
Sample Output:

Original string:
(a b (c d)) e f (g)
['(a b (c d))', 'e', 'f', '(g)']

Original string:
(a b) (c d) e (f g)
['(a b)', '(c d)', 'e', '(f g)']

Original string:
[(a b (c d)) e f (g)]
['[', '(a b (c d))', 'e', 'f', '(g)', ']']

Original string:
{a b {c d}} e f {g}
['{a b {c d}} e f {g}']
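The last two results are easier to follow with a picture of the underlying idea: keep a nesting depth, emit a whole group when the depth returns to zero, and split the text between groups on whitespace. The following is a minimal pure-Python sketch of that idea, assuming balanced input. The function name split_sexprs is hypothetical and this is not NLTK's implementation; it only aims to reproduce the outputs shown above. NLTK's SExprTokenizer accepts a parens argument (for example SExprTokenizer(parens='{}')) to choose a different delimiter pair, and the sketch mirrors that with its own parens parameter.

```python
def split_sexprs(text, parens="()"):
    """Split text into top-level bracketed groups and loose words.

    Illustrative sketch only (hypothetical helper, not NLTK code).
    Assumes the delimiters in `text` are balanced.
    """
    open_ch, close_ch = parens
    tokens = []
    depth = 0   # current nesting level
    start = 0   # index where the not-yet-emitted text begins
    for i, ch in enumerate(text):
        if ch == open_ch:
            if depth == 0:
                # Text before a top-level group is split on whitespace.
                tokens.extend(text[start:i].split())
                start = i
            depth += 1
        elif ch == close_ch and depth > 0:
            depth -= 1
            if depth == 0:
                # A top-level group just closed: emit it as one token.
                tokens.append(text[start:i + 1])
                start = i + 1
    # Anything after the last group is kept as a single token,
    # which is why the brace example above comes back unchanged.
    if start < len(text):
        tokens.append(text[start:])
    return tokens


print(split_sexprs('(a b (c d)) e f (g)'))
# ['(a b (c d))', 'e', 'f', '(g)']
print(split_sexprs('{a b {c d}} e f {g}'))
# ['{a b {c d}} e f {g}']
print(split_sexprs('{a b {c d}} e f {g}', parens='{}'))
# ['{a b {c d}}', 'e', 'f', '{g}']
```

Passing the delimiter pair explicitly is what lets the brace example be grouped the same way as the parenthesized one.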