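python - Web scraping does not find all classes in Python
Problem description
I am trying to extract user information and dates from a specific website using bs4 in Python, but my code does not find all of the classes on the page.
The code is as follows: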
import requests
from bs4 import BeautifulSoup

url = "https://www.expeditionforum.com/threads/distance-indication-feature.34452/"
page = requests.get(url)
soup = BeautifulSoup(page.text, 'html.parser')
title = soup.find('h1')
date = soup.findAll('a', attrs={"class": "datePermalink"})
name = soup.findAll('a', attrs={"class": "username"})
It detects the h1 tag but not the other tags. Can you suggest what I am doing wrong? Thanks in advance.
Solution
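from bs4 import BeautifulSoup
import requests

# Fetch the thread page and parse the raw response bytes.
r = requests.get(
    "https://www.expeditionforum.com/threads/distance-indication-feature.34452/")
soup = BeautifulSoup(r.content, 'html.parser')

# One user-info block per post; the username is read from the avatar image's alt text.
table = soup.findAll("div", class_="messageUserInfo")
# One permalink per post; its text is the post date.
dates = soup.findAll("a", class_="datePermalink")

# Pair the two lists by position and print one line per post.
for item1, item2 in zip(table, dates):
    print("UserName: {:<15}, Date: {}".format(
        item1.a.img.get('alt'), item2.text))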
Output:
UserName: jgudnason      , Date: Jan 19, 2018 at 11:54 AM
UserName: ExpeditionAndy , Date: Jan 19, 2018 at 7:29 PM
UserName: zarga          , Date: Jan 19, 2018 at 10:34 PM
UserName: ExpeditionAndy , Date: Jan 19, 2018 at 11:01 PM
UserName: dlcorbett      , Date: Jan 20, 2018 at 7:06 AM
UserName: 17LimitedExpy  , Date: Jan 20, 2018 at 7:12 AM
UserName: AmpForE        , Date: Jan 21, 2018 at 3:07 AM
UserName: dlcorbett      , Date: Jan 21, 2018 at 3:40 AM
UserName: zarga          , Date: Jan 21, 2018 at 12:13 PM
UserName: jgudnason      , Date: Jan 21, 2018 at 12:29 PM
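
The code above pairs usernames and dates by position with zip, so the two lists could drift out of step if any post lacked an avatar or a date link. A minimal sketch of one alternative is to search inside each post container instead. Note that the sample markup, the `li.message` container, and the `extract_posts` helper are all assumptions made for illustration; only the `messageUserInfo` and `datePermalink` class names come from the answer, and the real page may be structured differently.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled only on the class names used in the answer;
# the "message" container class is an assumption, not taken from the real page.
SAMPLE_HTML = """
<ol>
  <li class="message">
    <div class="messageUserInfo">
      <a class="avatar" href="#"><img src="a.png" alt="jgudnason"></a>
    </div>
    <a class="datePermalink" href="#">Jan 19, 2018 at 11:54 AM</a>
  </li>
  <li class="message">
    <div class="messageUserInfo">
      <a class="avatar" href="#"><img src="b.png" alt="ExpeditionAndy"></a>
    </div>
    <a class="datePermalink" href="#">Jan 19, 2018 at 7:29 PM</a>
  </li>
</ol>
"""


def extract_posts(html):
    """Return a list of (username, date) tuples, one per message container."""
    soup = BeautifulSoup(html, "html.parser")
    posts = []
    for message in soup.find_all("li", class_="message"):
        # Search within this message only, so name and date stay paired.
        info = message.find("div", class_="messageUserInfo")
        img = info.find("img") if info is not None else None
        date = message.find("a", class_="datePermalink")
        if img is None or date is None:
            # Skip incomplete posts instead of misaligning the results.
            continue
        posts.append((img.get("alt"), date.get_text(strip=True)))
    return posts


if __name__ == "__main__":
    for username, date in extract_posts(SAMPLE_HTML):
        print("UserName: {:<15}, Date: {}".format(username, date))
```

To try this against the live thread, the response body from requests.get could be passed to extract_posts in place of SAMPLE_HTML, after adjusting the container selector to whatever element actually wraps each post on that page.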