Python BeautifulSoup: Retrieve the HTML code of the title, its text, and the HTML code of its parent
Write a Python program to retrieve the HTML code of the title, its text, and the HTML code of its parent.
Sample Solution:
Python Code:
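import requests
from bs4 import BeautifulSoup

# Download the page and parse it (the 'lxml' parser requires the lxml package).
url = 'https://www.python.org/'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'lxml')

# The <title> tag itself, including its markup
print("title")
print(soup.title)
# Only the text inside the <title> tag
print("title text")
print(soup.title.text)
# The tag that encloses <title>, normally <head>
print("Parent content of the title:")
print(soup.title.parent)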
Sample Output:
title
<title>Welcome to Python.org</title>
title text
Welcome to Python.org
Parent content of the title:
<head>
<meta charset="utf-8"/>
<meta content="IE=edge" http-equiv="X-UA-Compatible"/>
<link href="//ajax.googleapis.com/ajax/libs/jquery/1.8.2/jquery.min.js" rel="prefetch"/>
<meta content="Python.org" name="application-name"/>
<meta content="The official home of the Python Programming Language" name="msapplication-tooltip"/>
<meta content="Python.org" name="apple-mobile-web-app-title"/>
<meta content="yes" name="apple-mobile-web-app-capable"/>
<meta content="black" name="apple-mobile-web-app-status-bar-style"/>
<meta content="width=device-width, initial-scale=1.0 " name="viewport"/>
<meta content="True" name="HandheldFriendly"/>
<meta content="telephone=no" name="format-detection"/>
<meta content="on" http-equiv="cleartype"/>
<meta content="false" http-equiv="imagetoolbar"/>
<script src="/static/js/libs/modernizr.js"></script>
<link href="/static/stylesheets/style.3dbbbf7ee488.css" rel="stylesheet" title="default" type="text/css"/>
<link href="/static/stylesheets/mq.3ae8e02ece5b.css" media="not print, braille, embossed, speech, tty" rel="stylesheet" type="text/css"/>
<!--[if (lte IE 8)&(!IEMobile)]>
<link href="/static/stylesheets/no-mq.fcf414dc68a3.css" rel="stylesheet" type="text/css" media="screen" />
<![endif]-->
<link href="/static/favicon.ico" rel="icon" type="image/x-icon"/>
<link href="/static/apple-touch-icon-144x144-precomposed.png" rel="apple-touch-icon-precomposed" sizes="144x144"/>
<link href="/static/apple-touch-icon-114x114-precomposed.png" rel="apple-touch-icon-precomposed" sizes="114x114"/>
<link href="/static/apple-touch-icon-72x72-precomposed.png" rel="apple-touch-icon-precomposed" sizes="72x72"/>
<link href="/static/apple-touch-icon-precomposed.png" rel="apple-touch-icon-precomposed"/>
<link href="/static/apple-touch-icon-precomposed.png" rel="apple-touch-icon"/>
<meta content="/static/metro-icon-144x144-precomposed.png" name="msapplication-TileImage"/><!-- white shape -->
<meta content="#3673a5" name="msapplication-TileColor"/><!-- python blue -->
<meta content="#3673a5" name="msapplication-navbutton-color"/>
<title>Welcome to Python.org</title>
<meta content="The official home of the Python Programming Language" name="description"/>
<meta content="Python programming language object oriented web free open source software license documentation download community" name="keywords"/>
<meta content="website" property="og:type"/>
<meta content="Python.org" property="og:site_name"/>
<meta content="Welcome to Python.org" property="og:title"/>
<meta content="The official home of the Python Programming Language" property="og:description"/>
<meta content="https://www.python.org/static/opengraph-icon-200x200.png" property="og:image"/>
<meta content="https://www.python.org/static/opengraph-icon-200x200.png" property="og:image:secure_url"/>
<meta content="https://www.python.org/" property="og:url"/>
<link href="/static/humans.txt" rel="author"/>
<link href="https://www.python.org/dev/peps/peps.rss/" rel="alternate" title="Python Enhancement Proposals" type="application/rss+xml"/>
<link href="https://www.python.org/jobs/feed/rss/" rel="alternate" title="Python Job Opportunities" type="application/rss+xml"/>
<link href="https://feeds.feedburner.com/PythonSoftwareFoundationNews" rel="alternate" title="Python Software Foundation News" type="application/rss+xml"/>
<link href="https://feeds.feedburner.com/PythonInsider" rel="alternate" title="Python Insider" type="application/rss+xml"/>
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "WebSite",
"url": "https://www.python.org/",
"potentialAction": {
"@type": "SearchAction",
"target": "https://www.python.org/search/?q={search_term_string}",
"query-input": "required name=search_term_string"
}
}
</script>
<script type="text/javascript">
var _gaq = _gaq || [];
_gaq.push(['_setAccount', 'UA-39055973-1']);
_gaq.push(['_trackPageview']);
(function() {
var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
})();
</script>
</head>
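The live page changes over time, so the output above may not match what you see today. As a minimal offline sketch of the same idea, the snippet below parses a small hard-coded document instead of fetching a URL. The names SAMPLE_HTML and describe_title are hypothetical, introduced only for illustration, and the built-in 'html.parser' is swapped in for 'lxml' so that no extra parser package is needed. It also guards against pages that have no title tag, in which case soup.title is None.

```python
from bs4 import BeautifulSoup

# Hypothetical sample document, used here instead of a live request.
SAMPLE_HTML = (
    '<html><head><meta charset="utf-8"/>'
    '<title>Sample page</title></head>'
    '<body><p>Hello</p></body></html>'
)


def describe_title(html):
    """Return (title markup, title text, parent markup), or None if there is no <title>."""
    soup = BeautifulSoup(html, 'html.parser')
    title = soup.title
    if title is None:
        # soup.title is None when the document has no <title> tag
        return None
    return str(title), title.get_text(), str(title.parent)


result = describe_title(SAMPLE_HTML)
if result is not None:
    title_html, title_text, parent_html = result
    print("title")
    print(title_html)
    print("title text")
    print(title_text)
    print("Parent content of the title:")
    print(parent_html)
```

Returning None for a missing title is one possible design choice; raising an exception would work equally well if a missing title should be treated as an error.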