python - Web Scraping - printing values together
Problem description
I'm trying to scrape CS:GO skins and return the skin name, price, and collection, in that order.
This is one of many ways I have tried it:
from bs4 import BeautifulSoup
import requests
import urllib3
urllib3.disable_warnings()

def webscrape():
    url = "https://csgostash.com/weapon/AWP"
    res = requests.get(url = url)
    soup = BeautifulSoup(res.text, "html.parser")
    titles = soup.find_all('div', class_="well result-box nomargin")
    prices = soup.find_all('div', class_="price")
    collection = soup.find_all('div', class_="collection")
    for title in titles:
        title = title.find('a')
        if title:
            title = title.text
    for price in prices:
        price = price.find('p')
        if price:
            price = price.text
    for cases in collection:
        cases = cases.find('p')
        if price:
            cases = cases.text
    print(title.text, price.text, collection.text)

webscrape()
This returns:
print(title.text, price.text, collection.text)
AttributeError: 'NoneType' object has no attribute 'text'
I want it to return the three values in order, e.g. Containment Breach '\n' A$40.57 - A$271.90 '\n' Shattered Web Case
and so on. Some of the skins have two price sets, and I want both price sets to print out.
I have since gotten it working further, which shows more clearly what I'm struggling with:
from bs4 import BeautifulSoup
import requests
import urllib3
urllib3.disable_warnings()

def webscrape():
    url = "https://csgostash.com/weapon/AWP"
    res = requests.get(url = url)
    soup = BeautifulSoup(res.text, "html.parser")
    names = " "
    price = " "
    cases = " "
    titles = soup.find_all('div', class_="well result-box nomargin")
    prices = soup.find_all('div', class_="price")
    collection = soup.find_all('div', class_="collection")
    for name in titles:
        a_field = name.find('a')
        if a_field:
            names = a_field.text + '\n' + names
    for money in prices:
        p_field = money.find('p')
        if p_field:
            price = p_field.text + '\n' + price
    for case in collection:
        case_field = case.find('p')
        if case_field:
            cases = case_field.text + '\n' + cases
    print(names, price, cases)

webscrape()
This prints all the information I am looking for on the webpage, but I want the information grouped together: the prices and the collection for each skin should print under the name of that skin. Right now it prints all the names, then all the prices, then all the collections.
Solution
titles = soup.find_all('div', class_="well result-box nomargin")
for title in titles:
    title = title.find('a')
    if title:
        title = title.text
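You are overwriting your data on each iteration of the loop, so it is not clear what the loop is meant to accomplish. The only way this can work is if the final iteration finds what you want, in which case you leave the loop with title holding the text of the last match.
After the loop, you then try to read the .text attribute of that value. This is bound to fail: by then title is either a plain string or None, and neither has a .text attribute.
The error you see means the last item in titles does not contain an 'a' element: find('a') returns None, the if block is skipped, and title is still None when print evaluates title.text.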
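To make the failure concrete, here is a minimal sketch of the same loop run against a made-up two-box page. The HTML is hypothetical and only mimics the class name from the question; the second box deliberately has no 'a' element, which is what the error message suggests happens on the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page: the last result box has no <a> element.
html = """
<div class="well result-box nomargin"><h3><a href="#">Containment Breach</a></h3></div>
<div class="well result-box nomargin"><p>no link here</p></div>
"""
soup = BeautifulSoup(html, "html.parser")

title = None
for title in soup.find_all("div", class_="well result-box nomargin"):
    title = title.find("a")   # rebinds the loop variable to a Tag or None
    if title:
        title = title.text    # rebinds it again, now to a str

# Only the value from the final iteration survives the loop.
last_value = title
print(last_value)  # None, so last_value.text would raise AttributeError
```

The name found in the first box is lost as soon as the second iteration starts, which is why nothing useful is left to print.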
Instead, try:
titles = soup.find_all('div', class_="well result-box nomargin")
for title in titles:
    a_field = title.find('a')
    if a_field:
        break
This exits the search loop as soon as the desired element is found, leaving it in a_field.
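For the grouping the question asks about, one possible approach is to walk the result boxes once and look up the prices and collection inside each box, rather than collecting three page-wide lists. The sketch below rests on an assumption that has not been checked against the live site: that each "price" and "collection" div is nested inside its skin's "well result-box nomargin" div. The helper name group_skins and the sample HTML (including the second price range) are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page structure may differ.
SAMPLE_HTML = """
<div class="well result-box nomargin">
  <h3><a href="#">Containment Breach</a></h3>
  <div class="price"><p>A$40.57 - A$271.90</p></div>
  <div class="price"><p>A$1.00 - A$2.00</p></div>
  <div class="collection"><p>Shattered Web Case</p></div>
</div>
<div class="well result-box nomargin"><p>no skin in this box</p></div>
"""

def group_skins(html):
    """Return one (name, prices, collection) tuple per result box."""
    soup = BeautifulSoup(html, "html.parser")
    skins = []
    for box in soup.find_all("div", class_="well result-box nomargin"):
        a_field = box.find("a")
        if a_field is None:
            continue  # skip boxes without a skin link
        # Search only inside this box so values stay with their skin.
        prices = []
        for price_div in box.find_all("div", class_="price"):
            p_field = price_div.find("p")
            if p_field is not None:
                prices.append(p_field.get_text(strip=True))
        collection = ""
        collection_div = box.find("div", class_="collection")
        if collection_div is not None:
            case_field = collection_div.find("p")
            if case_field is not None:
                collection = case_field.get_text(strip=True)
        skins.append((a_field.get_text(strip=True), prices, collection))
    return skins

for name, prices, collection in group_skins(SAMPLE_HTML):
    print(name)
    for price in prices:
        print(price)
    print(collection)
```

To try it on the real page, pass requests.get(url).text to group_skins in place of SAMPLE_HTML. If the price and collection divs turn out not to be nested inside each result box, the per-box lookups will come back empty and the selectors will need adjusting.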