python - Cannot parse Google result titles with Python urllib
Problem description
I can't parse the Google search results:
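# Imports assumed from the aliases used below (ur, bs)
import urllib.request as ur
from bs4 import BeautifulSoup as bs

def extracter(url, key, change):
    # Replace spaces in the search term with the given separator (e.g. "+")
    if " " in key:
        key = key.replace(" ", str(change))
    url = url + str(key)
    # Build the request with a User-Agent header, then fetch and parse the page
    request = ur.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    sauce = ur.urlopen(request).read()
    soup = bs(sauce, "html.parser")
    return soup

def google(keyword):
    soup = extracter("https://www.google.com/search?q=", str(keyword), "+")
    # Result titles are expected in <h3 class="LC20lb"> elements
    search_result = soup.find_all("h3", attrs={"class": "LC20lb"})
    print(search_result)

google("tony stark")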
Output:
[]
Solution
I just changed the request headers to use a full browser User-Agent string, and it worked:
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36'}
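As a rough sketch of how that header could be wired into the request, the snippet below builds the search request with the browser-style User-Agent shown above. The helper name build_search_request and the constant BROWSER_UA are made up for illustration, and using quote_plus to encode the keyword is a substitute for the manual space replacement in the question, not something the original code does. Nothing is sent over the network here; the request object is only constructed.

```python
import urllib.request as ur
from urllib.parse import quote_plus

# Browser-style User-Agent taken from the answer above
BROWSER_UA = ('Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 '
              '(KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36')

def build_search_request(keyword):
    # Hypothetical helper: quote_plus turns spaces into "+" and escapes
    # other characters that are not safe in a query string
    url = "https://www.google.com/search?q=" + quote_plus(keyword)
    return ur.Request(url, headers={"User-Agent": BROWSER_UA})

req = build_search_request("tony stark")
print(req.full_url)  # https://www.google.com/search?q=tony+stark
```

The returned object can then be passed to ur.urlopen(req).read() in place of the request built inside extracter.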
Result:
[<h3 class="LC20lb"><span dir="ltr">Tony Stark (Marvel Cinematic Universe) - Wikipedia</span></h3>, <h3 class="LC20lb"><span dir="ltr">Tony Stark / Iron Man - Wikipedia</span></h3>, <h3 class="LC20lb"><span dir="ltr">Iron Man | Marvel Cinematic Universe Wiki | FANDOM ...</span></h3>, <h3 class="LC20lb"><span dir="ltr">Tony Stark (Earth-199999) | Iron Man Wiki | FANDOM ...</span></h3>, <h3 class="LC20lb"><span dir="ltr">Is Tony Stark Alive As AI? Marvel Fans Say Tony Stark ...</span></h3>, <h3 class="LC20lb"><span dir="ltr">'Avengers: Endgame' Might Not Have Been the End of Tony ...</span></h3>, <h3 class="LC20lb"><span dir="ltr">Robert Downey Jr to RETURN to MCU as AI Tony Stark - ...</span></h3>, <h3 class="LC20lb"><span dir="ltr">Avengers Endgame theory: Tony Stark is backed up as AI ...</span></h3>]
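The list above contains whole tag objects. If only the title text is wanted, something along these lines may help. It is a minimal sketch that assumes the bs4 package is installed and runs against a small sample copied from the result above rather than a live page; extract_titles and SAMPLE_HTML are illustrative names, and the LC20lb class is simply the one used in the question, which Google may change at any time.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the result above, plus one unrelated <h3> for contrast
SAMPLE_HTML = (
    '<h3 class="LC20lb"><span dir="ltr">Tony Stark / Iron Man - Wikipedia</span></h3>'
    '<h3 class="other">Not a result title</h3>'
)

def extract_titles(html):
    # Hypothetical helper: keep only the text of <h3 class="LC20lb"> elements
    soup = BeautifulSoup(html, "html.parser")
    return [h3.get_text(strip=True) for h3 in soup.find_all("h3", class_="LC20lb")]

titles = extract_titles(SAMPLE_HTML)
print(titles)  # ['Tony Stark / Iron Man - Wikipedia']
```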