Taking the CSDN homepage as an example, the code is as follows:
>>> import requests
>>> r = requests.get("https://www.csdn.net")
>>> demo = r.text
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(demo, "html.parser")
>>> for tag in soup.find_all(True):
        print(tag.name)
html
head
title
meta
meta
meta
meta
meta
script
link
meta
meta
script
script
script
script
script
link
link
body
div
div
div
div
div
div
div
div
div
div
div
ul
li
a
img
Traceback (most recent call last):
  File "<pyshell#26>", line 2, in <module>
    print(tag.name)
KeyboardInterrupt
>>>
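
The listing above was cut short with Ctrl+C (hence the KeyboardInterrupt) because a real page contains a very large number of tags. As a minimal sketch of a more compact alternative, the tag names returned by find_all(True) can be tallied instead of printed one per line. The SAMPLE_HTML string and the count_tag_names helper below are hypothetical names introduced only for illustration; the sample stands in for r.text so that the example runs without network access.

```python
from collections import Counter

from bs4 import BeautifulSoup

# Hypothetical sample document, standing in for r.text so no network is needed.
SAMPLE_HTML = """
<html>
  <head><title>Demo</title><meta charset="utf-8"></head>
  <body>
    <div><ul><li><a href="#"><img src="x.png"></a></li></ul></div>
    <div><p>Hello</p></div>
  </body>
</html>
"""


def count_tag_names(html):
    """Return a Counter mapping each tag name to how often it occurs."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all(True) matches every tag in the document, in document order.
    return Counter(tag.name for tag in soup.find_all(True))


counts = count_tag_names(SAMPLE_HTML)
# Print one line per distinct tag name, most frequent first.
for name, n in counts.most_common():
    print(name, n)
```

To apply the same idea to the live page, count_tag_names(r.text) could be called with the response from the session above; the result depends on whatever the site serves at that moment.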