python - Having issues with my first python web scraper part 2 (The Sequel)
Problem description
I am trying to write a web scraper that pulls information from supremecommunity.com, a database of Supreme clothing. I made an earlier post about it when it was not working, got some great help, and now it is almost working.
The code works for the most part, but it starts having issues after Fall-Winter '17.
This is the error message I got in my Jupyter notebook:
UnicodeEncodeError Traceback (most recent call last) in
     24 upvote = card.select_one('.progress-bar-success > span').get_text(strip=True)
     25 downvote = card.select_one('.progress-bar-danger > span').get_text(strip=True)
---> 26 writer.writerow([item_name,item_image,upvote,downvote])
     27 print(item_name,item_image,upvote,downvote)

~\Anaconda3\lib\encodings\cp1252.py in encode(self, input, final)
     17 class IncrementalEncoder(codecs.IncrementalEncoder):
     18     def encode(self, input, final=False):
---> 19         return codecs.charmap_encode(input,self.errors,encoding_table)[0]
     20
     21 class IncrementalDecoder(codecs.IncrementalDecoder):

UnicodeEncodeError: 'charmap' codec can't encode character '\u0392' in position 0: character maps to <undefined>
Any advice would be greatly appreciated.
import csv
import requests
from bs4 import BeautifulSoup

base = 'https://www.supremecommunity.com{}'
links = ['https://www.supremecommunity.com/season/fall-winter2011/overview/','https://www.supremecommunity.com/season/spring-summer2012/overview/','https://www.supremecommunity.com/season/fall-winter2012/overview/',
         'https://www.supremecommunity.com/season/spring-summer2013/overview/','https://www.supremecommunity.com/season/fall-winter2013/overview/','https://www.supremecommunity.com/season/spring-summer2014/overview/',
         'https://www.supremecommunity.com/season/fall-winter2014/overview/','https://www.supremecommunity.com/season/spring-summer2015/overview/','https://www.supremecommunity.com/season/fall-winter2015/overview/',
         'https://www.supremecommunity.com/season/spring-summer2016/overview/','https://www.supremecommunity.com/season/fall-winter2016/overview/','https://www.supremecommunity.com/season/spring-summer2017/overview/',
         'https://www.supremecommunity.com/season/fall-winter2017/overview/', 'https://www.supremecommunity.com/season/spring-summer2018/overview/','https://www.supremecommunity.com/season/fall-winter2018/overview/',
         'https://www.supremecommunity.com/season/spring-summer2019/overview/','https://www.supremecommunity.com/season/fall-winter2019/overview/']

with open("supremecommunity.csv","w",newline="") as f:
    writer = csv.writer(f)
    writer.writerow(['item_name','item_image','upvote','downvote'])

    for link in links:
        r = requests.get(link,headers={"User-Agent":"Mozilla/5.0"})
        soup = BeautifulSoup(r.text,"lxml")
        for card in soup.select('[class$="d-card"]'):
            item_name = card.select_one('.card__top')['data-itemname']
            item_image = base.format(card.select_one('img.prefill-img').get('data-src'))
            upvote = card.select_one('.progress-bar-success > span').get_text(strip=True)
            downvote = card.select_one('.progress-bar-danger > span').get_text(strip=True)
            writer.writerow([item_name,item_image,upvote,downvote])
            print(item_name,item_image,upvote,downvote)
Solution
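The traceback ends inside cp1252.py, which suggests the CSV file was opened with the Windows default encoding. cp1252 has no mapping for '\u0392' (Greek capital beta), so the write fails as soon as an item name contains that character. One likely fix is to pass an explicit encoding when opening the file. The following is a minimal sketch under that assumption, not a tested change to the original scraper: `write_rows`, `sample_rows`, and the example.com URL are hypothetical names and data used only for illustration, and no network requests are made.

```python
import csv
import os
import tempfile


def write_rows(path, rows, encoding="utf-8"):
    """Write a header plus data rows to a CSV file with an explicit encoding.

    write_rows is a hypothetical helper for illustration only.
    """
    # Without encoding=..., open() uses the locale's preferred encoding,
    # which is cp1252 in the traceback above and cannot represent U+0392.
    with open(path, "w", newline="", encoding=encoding) as f:
        writer = csv.writer(f)
        writer.writerow(["item_name", "item_image", "upvote", "downvote"])
        writer.writerows(rows)


# Hypothetical sample row (not scraped data); the name starts with
# U+0392, the character reported in the error message.
sample_rows = [["\u0392ox Logo Tee", "https://example.com/item.jpg", "10", "2"]]

if __name__ == "__main__":
    out_path = os.path.join(tempfile.mkdtemp(), "supremecommunity.csv")
    write_rows(out_path, sample_rows)
    # Read the file back with the same encoding to confirm the round trip.
    with open(out_path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            # ascii() keeps the output printable on consoles that use cp1252.
            print(ascii(row))
```

In the original script, the equivalent change would be adding `encoding="utf-8"` to the existing `open("supremecommunity.csv","w",newline="")` call. If the file is meant to be opened in Excel, `encoding="utf-8-sig"` may be a better choice, since it writes a byte order mark that helps Excel detect UTF-8.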