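python - Beautiful Soup BS4 tag navigation
Problem description
I am trying to navigate the output of this API to get the tags in the response. But after trying to navigate to the tags with the standard approach, I get an empty response.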
from bs4 import BeautifulSoup
import urllib.request
import gzip
import io
import time  # needed for time.sleep below

headers = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Encoding': 'gzip, deflate',
    'Accept-Language': 'en-US,en;q=0.5',
}
url = 'https://api.stackexchange.com/2.2/search/advanced?order=desc&sort=activity&q=' + 'AKIAJQVBDUUDGLXOEKYA' + '&site=stackoverflow'
req = urllib.request.Request(url, headers=headers)
response = urllib.request.urlopen(req)
time.sleep(3)

# Decompress the body according to the Content-Encoding header
if response.info().get('Content-Encoding') == 'gzip':
    pagedata = gzip.decompress(response.read())
elif response.info().get('Content-Encoding') == 'deflate':
    pagedata = response.read()
elif response.info().get('Content-Encoding'):
    print('Encoding type unknown')
else:
    pagedata = response.read()

soup = BeautifulSoup(pagedata, "lxml")
print(soup)
Output of soup:
<html><body><p>{"items":[{"tags":["c#","aws-lambda","aws-serverless"],"owner":{"reputation":188,"user_id":1395211,"user_type":"registered","accept_rate":62,"profile_image":"https://i.stack.imgur.com/WylN7.png?s=128&g=1","display_name":"Mostafa Fallah","link":"https://stackoverflow.com/users/1395211/mostafa-fallah"},"is_answered":true,"view_count":40,"accepted_answer_id":54550236,"answer_count":1,"score":2,"last_activity_date":1549445444,"creation_date":1540222981,"question_id":52933098,"link":"https://stackoverflow.com/questions/52933098/deploying-aws-serverless-lambda-application-with-amazonserverlessapplicationrepo","title":"Deploying AWS Serverless lambda Application with AmazonServerlessApplicationRepositoryClient does not work?"}],"has_more":false,"quota_max":300,"quota_remaining":275}</p></body></html>
This is what I use to navigate:
tags = soup.find_all('p')
t = tags[0]
print(type(t))
print(t.attrs)
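But even though I can see content inside the tag, this returns an empty dict {}. I am not sure whether I am doing this right. Thanks in advance for your help.
Solution
The response is JSON, not HTML. t.attrs is empty because the <p> tag that the parser wrapped around the body has no HTML attributes; the JSON is the tag's text. So parse the response as JSON and loop over the items.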
import requests
url = 'https://api.stackexchange.com/2.2/search/advanced?order=desc&sort=activity&q=' + 'AKIAJQVBDUUDGLXOEKYA' + '&site=stackoverflow'
s=requests.get(url).json()
data = [(item['tags'],item['owner'],item['title']) for item in s['items']]
print(data)
Output:
[(['python', 'beautifulsoup'], {'user_id': 7309225, 'profile_image': 'https://graph.facebook.com/10207802462833592/picture?type=large', 'user_type': 'registered', 'reputation': 532, 'link': 'https://stackoverflow.com/users/7309225/digvijay-sawant', 'accept_rate': 100, 'display_name': 'Digvijay Sawant'}, 'Beautiful Soup BS4 tag navigation'), (['c#', 'aws-lambda', 'aws-serverless'], {'user_id': 1395211, 'profile_image': 'https://i.stack.imgur.com/WylN7.png?s=128&g=1', 'user_type': 'registered', 'reputation': 188, 'link': 'https://stackoverflow.com/users/1395211/mostafa-fallah', 'accept_rate': 62, 'display_name': 'Mostafa Fallah'}, 'Deploying AWS Serverless lambda Application with AmazonServerlessApplicationRepositoryClient does not work?')]
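If you would rather stay with urllib as in the question than add requests, the same idea can be sketched with the standard library alone: decompress the body according to Content-Encoding, then parse it with json.loads instead of an HTML parser. The helper names decode_api_body and extract_tags are hypothetical, the deflate branch assumes the zlib-wrapped form of deflate, and the sample payload is a trimmed copy of the response shown in the question so the snippet runs offline.

```python
import gzip
import json
import zlib


def decode_api_body(raw, content_encoding=None):
    """Decompress an HTTP body if needed and parse it as JSON."""
    if content_encoding == "gzip":
        raw = gzip.decompress(raw)
    elif content_encoding == "deflate":
        # Assumption: the server sends zlib-wrapped deflate data.
        raw = zlib.decompress(raw)
    elif content_encoding:
        raise ValueError("Unknown Content-Encoding: " + content_encoding)
    return json.loads(raw.decode("utf-8"))


def extract_tags(payload):
    """Collect the 'tags' list of every item in the parsed payload."""
    return [item["tags"] for item in payload.get("items", [])]


# Offline sample shaped like the response in the question (fields trimmed).
sample = {
    "items": [
        {
            "tags": ["c#", "aws-lambda", "aws-serverless"],
            "title": "Deploying AWS Serverless lambda Application with "
                     "AmazonServerlessApplicationRepositoryClient does not work?",
        }
    ],
    "has_more": False,
}

# Simulate a gzip-compressed response body.
body = gzip.compress(json.dumps(sample).encode("utf-8"))
payload = decode_api_body(body, "gzip")
print(extract_tags(payload))  # [['c#', 'aws-lambda', 'aws-serverless']]
```

With the question's code, this would be called as decode_api_body(response.read(), response.info().get('Content-Encoding')), after which the tags are ordinary Python lists and no BeautifulSoup navigation is needed.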