Python Web Scraping: Retrieve an arbitrary Wikipedia page for "Python" and create a list of the links on that page
Write a Python program that retrieves an arbitrary Wikipedia page for "Python" and creates a list of the links on that page.
Sample Solution:
Python Code:
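from urllib.request import urlopen
from bs4 import BeautifulSoup

# Download the page and parse it. Naming the parser explicitly avoids
# bs4's "No parser was explicitly specified" warning.
html = urlopen("https://en.wikipedia.org/wiki/Python")
bsObj = BeautifulSoup(html, "html.parser")

# Print the href of every <a> tag that has one.
for link in bsObj.find_all("a"):
    if 'href' in link.attrs:
        print(link.attrs['href'])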
Sample Output:
#mw-head #p-search https://en.wiktionary.org/wiki/Python https://en.wiktionary.org/wiki/python #Snakes #Ancient_Greece #Media_and_entertainment #Computing #Engineering #Roller_coasters #Vehicles #Weaponry #See_also /w/index.php?title=Python&action=edit&section=1 /wiki/Pythonidae /wiki/Python_(genus) /w/index.php?title=Python&action=edit&section=2 /wiki/Python_(mythology) /wiki/Python_of_Aenus /wiki/Python_(painter) /wiki/Python_of_Byzantium /wiki/Python_of_Catana /w/index.php?title=Python&action=edit&section=3 /wiki/Python_(film) /wiki/Pythons_2 /wiki/Monty_Python /wiki/Python_(Monty)_Pictures /w/index.php?title=Python&action=edit&section=4 /wiki/Python_(programming_language) /wiki/CPython /wiki/CMU_Common_Lisp /wiki/PERQ#PERQ_3 /w/index.php?title=Python&action=edit&section=5 /w/index.php?title=Python&action=edit&section=6 /wiki/Python_(Busch_Gardens_Tampa_Bay) /wiki/Python_(Coney_Island,_Cincinnati,_Ohio) /wiki/Python_(Efteling) /w/index.php?title=Python&action=edit&section=7 /wiki/Python_(automobile_maker) /wiki/Python_(Ford_prototype) /w/index.php?title=Python&action=edit&section=8 /wiki/Colt_Python /wiki/Python_(missile) /w/index.php?title=Python&action=edit&section=9 /wiki/Cython /wiki/Pyton /wiki/File:Disambig_gray.svg /wiki/Help:Disambiguation //en.wikipedia.org/w/index.php?title=Special:WhatLinksHere/Python&namespace=0 https://en.wikipedia.org/w/index.php?title=Python&oldid=845762125 /wiki/Help:Category /wiki/Category:Disambiguation_pages /wiki/Category:Disambiguation_pages_with_short_description /wiki/Category:All_article_disambiguation_pages /wiki/Category:All_disambiguation_pages /wiki/Category:Animal_common_name_disambiguation_pages /wiki/Special:MyTalk /wiki/Special:MyContributions /w/index.php?title=Special:CreateAccount&returnto=Python /w/index.php?title=Special:UserLogin&returnto=Python /wiki/Python /wiki/Talk:Python /wiki/Python /w/index.php?title=Python&action=edit /w/index.php?title=Python&action=history /wiki/Main_Page /wiki/Main_Page /wiki/Portal:Contents 
/wiki/Portal:Featured_content /wiki/Portal:Current_events /wiki/Special:Random https://donate.wikimedia.org/wiki/Special:FundraiserRedirector?utm_source=donate&utm_medium=sidebar&utm_campaign=C13_en.wikipedia.org&uselang=en //shop.wikimedia.org /wiki/Help:Contents /wiki/Wikipedia:About /wiki/Wikipedia:Community_portal /wiki/Special:RecentChanges //en.wikipedia.org/wiki/Wikipedia:Contact_us /wiki/Special:WhatLinksHere/Python /wiki/Special:RecentChangesLinked/Python /wiki/Wikipedia:File_Upload_Wizard /wiki/Special:SpecialPages /w/index.php?title=Python&oldid=845762125 /w/index.php?title=Python&action=info https://www.wikidata.org/wiki/Special:EntityPage/Q747452 /w/index.php?title=Special:CiteThisPage&page=Python&id=845762125 /w/index.php?title=Special:Book&bookcmd=book_creator&referer=Python /w/index.php?title=Special:ElectronPdf&page=Python&action=show-download-screen /w/index.php?title=Python&printable=yes https://commons.wikimedia.org/wiki/Category:Python https://af.wikipedia.org/wiki/Python https://als.wikipedia.org/wiki/Python https://bn.wikipedia.org/wiki/%E0%A6%AA%E0%A6%BE%E0%A6%87%E0%A6%A5%E0%A6%A8_(%E0%A6%A6%E0%A7%8D%E0%A6%AC%E0%A7%8D%E0%A6%AF%E0%A6%B0%E0%A7%8D%E0%A6%A5%E0%A6%A4%E0%A6%BE_%E0%A6%A8%E0%A6%BF%E0%A6%B0%E0%A6%B8%E0%A6%A8) https://be.wikipedia.org/wiki/Python https://bg.wikipedia.org/wiki/%D0%9F%D0%B8%D1%82%D0%BE%D0%BD_(%D0%BF%D0%BE%D1%8F%D1%81%D0%BD%D0%B5%D0%BD%D0%B8%D0%B5) https://cs.wikipedia.org/wiki/Python_(rozcestn%C3%ADk) https://da.wikipedia.org/wiki/Python https://de.wikipedia.org/wiki/Python https://eo.wikipedia.org/wiki/Pitono_(apartigilo) https://eu.wikipedia.org/wiki/Python_(argipena) https://fa.wikipedia.org/wiki/%D9%BE%D8%A7%DB%8C%D8%AA%D9%88%D9%86 https://fr.wikipedia.org/wiki/Python https://ko.wikipedia.org/wiki/%ED%8C%8C%EC%9D%B4%EC%84%A0 https://hr.wikipedia.org/wiki/Python_(razdvojba) https://io.wikipedia.org/wiki/Pitono https://id.wikipedia.org/wiki/Python https://ia.wikipedia.org/wiki/Python_(disambiguation) 
https://is.wikipedia.org/wiki/Python https://it.wikipedia.org/wiki/Python_(disambigua) https://he.wikipedia.org/wiki/%D7%A4%D7%99%D7%AA%D7%95%D7%9F https://ka.wikipedia.org/wiki/%E1%83%9E%E1%83%98%E1%83%97%E1%83%9D%E1%83%9C%E1%83%98_(%E1%83%9B%E1%83%A0%E1%83%90%E1%83%95%E1%83%90%E1%83%9A%E1%83%9B%E1%83%9C%E1%83%98%E1%83%A8%E1%83%95%E1%83%9C%E1%83%94%E1%83%9A%E1%83%9D%E1%83%95%E1%83%90%E1%83%9C%E1%83%98) https://kg.wikipedia.org/wiki/Mboma_(nyoka) https://la.wikipedia.org/wiki/Python_(discretiva) https://lb.wikipedia.org/wiki/Python https://hu.wikipedia.org/wiki/Python_(egy%C3%A9rtelm%C5%B1s%C3%ADt%C5%91_lap) https://mr.wikipedia.org/wiki/%E0%A4%AA%E0%A4%BE%E0%A4%AF%E0%A4%A5%E0%A5%89%E0%A4%A8_(%E0%A4%86%E0%A4%9C%E0%A5%8D%E0%A4%9E%E0%A4%BE%E0%A4%B5%E0%A4%B2%E0%A5%80_%E0%A4%AD%E0%A4%BE%E0%A4%B7%E0%A4%BE) https://nl.wikipedia.org/wiki/Python https://ja.wikipedia.org/wiki/%E3%83%91%E3%82%A4%E3%82%BD%E3%83%B3 https://no.wikipedia.org/wiki/Pyton https://pl.wikipedia.org/wiki/Pyton https://pt.wikipedia.org/wiki/Python_(desambigua%C3%A7%C3%A3o) https://ru.wikipedia.org/wiki/Python_(%D0%B7%D0%BD%D0%B0%D1%87%D0%B5%D0%BD%D0%B8%D1%8F) https://sd.wikipedia.org/wiki/%D8%A7%D8%B1%DA%99 https://sk.wikipedia.org/wiki/Python https://sh.wikipedia.org/wiki/Python https://fi.wikipedia.org/wiki/Python https://sv.wikipedia.org/wiki/Pyton https://th.wikipedia.org/wiki/%E0%B9%84%E0%B8%9E%E0%B8%97%E0%B8%AD%E0%B8%99 https://tr.wikipedia.org/wiki/Python https://uk.wikipedia.org/wiki/%D0%9F%D1%96%D1%84%D0%BE%D0%BD https://ur.wikipedia.org/wiki/%D9%BE%D8%A7%D8%A6%DB%8C%D8%AA%DA%BE%D9%88%D9%86 https://vi.wikipedia.org/wiki/Python https://zh.wikipedia.org/wiki/Python_(%E6%B6%88%E6%AD%A7%E4%B9%89) https://www.wikidata.org/wiki/Special:EntityPage/Q747452#sitelinks-wikipedia //en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License //creativecommons.org/licenses/by-sa/3.0/ //wikimediafoundation.org/wiki/Terms_of_Use 
//wikimediafoundation.org/wiki/Privacy_policy //www.wikimediafoundation.org/ https://wikimediafoundation.org/wiki/Privacy_policy /wiki/Wikipedia:About /wiki/Wikipedia:General_disclaimer //en.wikipedia.org/wiki/Wikipedia:Contact_us https://www.mediawiki.org/wiki/Special:MyLanguage/How_to_contribute https://wikimediafoundation.org/wiki/Cookie_statement //en.m.wikipedia.org/w/index.php?title=Python&mobileaction=toggle_view_mobile https://wikimediafoundation.org/ //www.mediawiki.org/
The run that produced this output also emitted the following warning, because no parser was passed to BeautifulSoup:
/usr/local/lib/python3.6/dist-packages/bs4/__init__.py:181: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently. The code that caused this warning is on line 4 of the file /tmp/sessions/0f56b56f1170593f/main.py. To get rid of this warning, change code that looks like this: BeautifulSoup([your markup]) to this: BeautifulSoup([your markup], "lxml")
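The sample output mixes several kinds of href: same-page fragments (#Snakes), site-relative paths (/wiki/Pythonidae), protocol-relative URLs (//shop.wikimedia.org), and full URLs. The sketch below shows one possible way to turn such values into a de-duplicated list of absolute URLs with the standard library's urljoin. The function name to_absolute_urls, its skip_fragments parameter, and the short hrefs list are illustrative assumptions, not part of the original solution; in practice the hrefs would come from the loop over the <a> tags.

```python
from urllib.parse import urljoin


def to_absolute_urls(hrefs, base_url, skip_fragments=True):
    """Resolve hrefs against base_url, keeping page order and dropping duplicates.

    Hypothetical helper for illustration only.
    """
    result = []
    seen = set()
    for href in hrefs:
        # Fragment-only links point back into the same page; skip them by default.
        if skip_fragments and href.startswith("#"):
            continue
        absolute = urljoin(base_url, href)
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


# A few href values taken from the sample output (one repeated on purpose).
hrefs = [
    "#Snakes",
    "/wiki/Pythonidae",
    "//shop.wikimedia.org",
    "https://en.wiktionary.org/wiki/Python",
    "/wiki/Pythonidae",
]
links = to_absolute_urls(hrefs, "https://en.wikipedia.org/wiki/Python")
for url in links:
    print(url)
```

Here links is an actual Python list, which matches the exercise wording ("creates a list of links") more closely than printing each href as it is found.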