w3resource

Python BeautifulSoup: Parse tree into a nicely formatted Unicode string, with a separate line for each HTML/XML tag and string


Write a Python program to convert a Beautiful Soup parse tree into a nicely formatted Unicode string, with a separate line for each HTML/XML tag and string.

Sample Solution:

Python Code:

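from bs4 import BeautifulSoup

# Markup with nested tags and no whitespace between them
str1 = "<p>Some<b>bad<i>HTML Code</i></b></p>"
print("Original string:")
print(str1)
# The "xml" parser requires the lxml package to be installed
soup = BeautifulSoup(str1, "xml")
print("\nFormatted Unicode string:")
# prettify() returns the tree as a string, one tag or string per line
print(soup.prettify())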

Sample Output:

Original string:
<p>Some<b>bad<i>HTML Code</i></b></p>

Formatted Unicode string:
<?xml version="1.0" encoding="utf-8"?>
<p>
 Some
 <b>
  bad
  <i>
   HTML Code
  </i>
 </b>
</p>
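
The solution above uses the "xml" parser, which depends on lxml and adds the XML declaration line to the output. As a minimal sketch, assuming only that bs4 is installed, the same markup can be formatted with Python's built-in "html.parser", in which case no XML declaration should appear. The variable names html_doc, pretty, and lines are illustrative and not part of the original solution.

```python
from bs4 import BeautifulSoup

# Same sample markup as in the solution above
html_doc = "<p>Some<b>bad<i>HTML Code</i></b></p>"

# "html.parser" ships with Python, so lxml is not needed here
soup = BeautifulSoup(html_doc, "html.parser")

# prettify() returns a str with one tag or text string per line
pretty = soup.prettify()
print(pretty)

# Strip the indentation to inspect the content of each line
lines = [line.strip() for line in pretty.splitlines() if line.strip()]
print(lines)
```

Each tag and each text string ends up on its own line, indented by one space per nesting level, as in the sample output above.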
