python - google search html doesn't contain div id='resultStats'
Problem description
I'm trying to get the number of results for a Google search. If I save the page from the browser, the count appears in the HTML like this:
<div id="resultStats">About 8,660,000,000 results<nobr> (0.49 seconds) </nobr></div>
But the HTML retrieved by Python looks like a mobile website when I open it in a browser, and it doesn't contain 'resultStats'.
I already tried (1) adding parameters to the URL, like https://www.google.com/search?client=firefox-b-d&q=test,
and (2) copying a complete URL from a browser, but neither helped.
import requests
from bs4 import BeautifulSoup
import re


def google_results(query):
    url = 'https://www.google.com/search?q=' + query
    html = requests.get(url).text
    soup = BeautifulSoup(html, 'html.parser')
    div = soup.find('div', id='resultStats')
    return int(''.join(re.findall(r'\d+', div.text.split()[1])))


print(google_results('test'))
Error:
Traceback: line 11, in google_results
return int(''.join(re.findall(r'\d+', div.text.split()[1])))
AttributeError: 'NoneType' object has no attribute 'text'
Solution
The solution is to send request headers with a browser-like User-Agent (thanks, John):
import requests
from bs4 import BeautifulSoup
import re


def google_results(query):
    url = 'https://www.google.com/search?q=' + query
    # Identify as a desktop browser; without this the page lacked 'resultStats'.
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.0'
    }
    html = requests.get(url, headers=headers).text
    soup = BeautifulSoup(html, 'html.parser')
    div = soup.find('div', id='resultStats')
    # Text looks like "About 8,660,000,000 results (0.49 seconds)";
    # take the second word and keep only its digits.
    return int(''.join(re.findall(r'\d+', div.text.split()[1])))


print(google_results('test'))
Output:
9280000000
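The AttributeError in the question comes from soup.find returning None when no matching div exists, and the page Google serves may still vary even with the header set. The following is a minimal sketch of a more defensive way to read the count, not part of the original answer. The helper name parse_result_count is hypothetical, and it assumes the count is the first number in the text and uses commas as thousands separators.

```python
import re


def parse_result_count(stats_text):
    """Return the result count found in text such as
    'About 8,660,000,000 results (0.49 seconds)', or None if there is none.

    Hypothetical helper: assumes the count is the first number in the text
    and that commas are the only thousands separator.
    """
    if not stats_text:
        # Covers both None (element not found) and an empty string.
        return None
    match = re.search(r'\d[\d,]*', stats_text)
    if match is None:
        return None
    return int(match.group().replace(',', ''))


print(parse_result_count('About 8,660,000,000 results (0.49 seconds) '))
print(parse_result_count(None))
```

In google_results, this could replace the last line with something like `return parse_result_count(div.text if div is not None else None)`, so a missing div yields None instead of raising.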