Syntax error in Web Scraping (Python 3) code?

Problem description

from bs4 import BeautifulSoup as soup
from urllib.request import urlopen as uReq

my_url = 'https://www.flipkart.com/search?q=iphone+12&sid=tyy%2C4io&as=on&as-show=on&otracker=AS_QueryStore_OrganicAutoSuggest_1_6_na_na_na&otracker1=AS_QueryStore_OrganicAutoSuggest_1_6_na_na_na&as-pos=1&as-type=HISTORY&suggestionId=iphone+12%7CMobiles&requestId=71ed5a8e-4348-4fef-9af8-43b7be8c4d83'

uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
page_soup = soup(page_html, "html.parser")

containers = page_soup.findAll("div", {"class": "_13oc-S"})
#print(len(containers)) "will tell number of products on the respected page"
#print(len(containers))

#print(soup.prettify(containers[0])) "will bring the page in the organised manner"
#print(soup.prettify(containers[0]))

container=containers[0]
#print(container.div.img["alt"]) "will display the name of the respected product"
#print(container.div.img["alt"])

#price=container.findAll("div",{"class":"col col-5-12 nlI3QM"}) "will tell the price of the respect project"
price=container.findAll("div",{"class":"col col-5-12 nlI3QM"})
#print(price[0].text)

ratings=container.findAll("div",{"class":"gUuXy-"})
#print(ratings[0].text)

#Making a file
filename="products.csv"
f= open(filename, "w")

#Naming the headers
headers="Product_Name,Pricing,Ratings\n"
f.write(headers)

for container in containers:
    product_name = container.div.img["alt"]

    price_container = container.findAll("div", {"class": "col col-5-12 nlI3QM"})
    price = price_container[0].text.strip()

    rating_container = container.findAll("div", {"class":"gUuXy-"})
    rating = rating_container[0].text 

    #print("product_name:" + product_name)
    #print("price:" + price)
    #print("ratings:" + rating)

    #string parsing
    trim_price = ''.join(price.split(','))
    rm_rupee = trim_price.split("&#8377")
    add_rs_price = "Rs." + rm_rupee[0]
    split_price = add_rs_price.split('E')
    final_price = split_price[0]

    split_rating = rating.split(" ")
    final_rating = split_rating[0]

    print(product_name.replace(",", "|") + "," + final_price + "," + final_rating + "\n")
    f.write(product_name.replace(",", "|") + "," + final_price + "," + final_rating + "\n")

f.close()

f.write(product_name.replace(",", "|") + "," + final_price + "," + final_rating + "\n")

There is a syntax error on this particular line. I want to create a .CSV file, but the products do not appear in the file. The error is:

Exception has occurred: UnicodeEncodeError
'charmap' codec can't encode character '\u20b9' in position 35: character maps to <undefined>
File "D:\Visual Code Folder\Python\Scraping_Flipkart.py", line 61, in <module>
f.write(product_name.replace(",", "|") + "," + final_price + "," + final_rating + "\n")

Tags: python, html, css

Solution


Replace this:

f= open(filename, "w")

with this:

import io
f = io.open(filename, "w", encoding="utf-8")

Using io.open keeps the code backward compatible with Python 2.
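
For context, here is a minimal sketch of why the original open call fails and why naming the encoding helps. It assumes the default encoding on the asker's Windows machine was cp1252; the 'charmap' wording in the traceback suggests this, but the question does not state it.

```python
rupee = "\u20b9"  # the rupee sign named in the traceback

# Assumption: cp1252 stands in for the Windows default encoding.
# It has no byte for the rupee sign, so encoding fails.
try:
    rupee.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

# UTF-8 can represent every Unicode character, so the same text encodes fine.
utf8_bytes = rupee.encode("utf-8")

print(cp1252_ok)
print(utf8_bytes)
```

A file opened without an encoding argument uses the platform default, which is why the write fails only once a price containing the rupee sign reaches f.write.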

If you only need to support Python 3, you can use the built-in open function:

# filename and headers are the variables defined in the question's script
with open(filename, "w", encoding="utf-8") as f:
    f.write(headers)
