Problem writing special characters to a CSV file

Problem description

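I am writing the scraped output of a web page to a CSV file. However, a few special characters (for example, the hyphen) do not come out correctly.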

Original text: Amazon Forecast - Now Generally Available

Result in the CSV: Amazon Forecast – Now Generally Available

I tried the code below:

from bs4 import BeautifulSoup
from datetime import date
import requests
import csv
source = requests.get('https://aws.amazon.com/blogs/aws/').text
soup = BeautifulSoup(source, 'lxml')
# csv_file = open('aitrendsresults.csv', 'w')
csv_file = open('aws_cloud_results.csv', 'w', encoding='utf8')
csv_writer = csv.writer(csv_file)
csv_writer.writerow(['title', 'img', 'src', 'summary'])
match = soup.find_all('div', class_='lb-row lb-snap')
for n in match:
    imgsrc = n.div.img.get('src')
    titlesrc = n.find('div', {'class': 'lb-col lb-mid-18 lb-tiny-24'})
    titletxt = titlesrc.h2.text
    anchortxt = titlesrc.a.get('href')
    sumtxt = titlesrc.section.p.text
    print(sumtxt)
    csv_writer.writerow([titletxt, imgsrc, anchortxt, sumtxt])
csv_file.close()

Can you help me get the same text as the original shown above?

Tags: python, beautifulsoup

Solution


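I've been working with BeautifulSoup as well, and I think this is only a minor issue. In line 8, where you open the CSV file, try spelling the encoding "UTF-8" instead of "utf8" and see if that helps. Note that Python treats "utf8" and "UTF-8" as aliases for the same codec, so if the output does not change, the file is probably already valid UTF-8 and the program you open it with (for example Excel) is decoding it with a different encoding. In that case, opening the file with encoding="utf-8-sig" writes a byte-order mark that helps such programs detect UTF-8.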
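To make the encoding point concrete, here is a minimal sketch that needs no network access. It is an illustration rather than a confirmed diagnosis of the original problem. The sample title, the temporary file name demo.csv, and the helper names write_rows and read_rows are all made up for this example. It shows that the two spellings resolve to the same codec, what an en dash looks like when UTF-8 bytes are decoded as Windows-1252, and that a file written with "utf-8-sig" starts with a byte-order mark and round-trips cleanly.

```python
import codecs
import csv
import os
import tempfile

# Hypothetical sample title containing an en dash (U+2013).
TITLE = "Amazon Forecast \u2013 Now Generally Available"


def write_rows(path, rows, encoding="utf-8-sig"):
    """Write rows to a CSV file; utf-8-sig prepends a byte-order mark."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding=encoding, newline="") as f:
        csv.writer(f).writerows(rows)


def read_rows(path, encoding="utf-8-sig"):
    """Read rows back; utf-8-sig strips the byte-order mark if present."""
    with open(path, "r", encoding=encoding, newline="") as f:
        return list(csv.reader(f))


# "utf8" and "UTF-8" are aliases that resolve to the same codec.
same_codec = codecs.lookup("utf8").name == codecs.lookup("UTF-8").name

# What a viewer shows if it decodes the UTF-8 bytes as Windows-1252.
mojibake = TITLE.encode("utf-8").decode("cp1252")

path = os.path.join(tempfile.mkdtemp(), "demo.csv")
write_rows(path, [["title"], [TITLE]])
round_trip = read_rows(path)

# Check whether the file begins with the UTF-8 byte-order mark.
with open(path, "rb") as f:
    starts_with_bom = f.read(3) == codecs.BOM_UTF8

print(same_codec, starts_with_bom, round_trip[1][0] == TITLE)
print(mojibake)
```

If the text is already garbled before it reaches the CSV writer (for example in the print output), the problem is more likely in how the response was decoded. One thing to try, as an assumption to verify against your page, is passing the raw bytes (requests' response.content) to BeautifulSoup instead of .text so the parser can detect the encoding itself.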

