python - Scraping does not export to Excel
Problem description
This is my first attempt at scraping information from a website and exporting it to an Excel file. However, not all of the information is scraped, and the export file is not created.
This is what I get in Anaconda:
(base) C:\Windows\system32>firstwebscrape.py
brand: []
product_name: ASRock Radeon RX 5700 XT DirectX 12 RX 5700 XT TAICHI X 8G OC+ Video Card
product_price: €446,99
Here is the code:
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'https://www.newegg.com/global/lt-en/Video-Cards-Video-Devices/Category/ID-38?Tpk=graphic%20card'

# open the connection and grab the page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

# parse the HTML
page_soup = soup(page_html, "html.parser")

# grab all product containers
containers = page_soup.findAll("div", {"class": "item-container"})

filename = "123.csv"
f = open(filename, "w")
headers = "brand, product_name, product_price\n"
f.write(headers)

for container in containers:
    brand = container.findAll("a", {"class": "title"})
    title_container = container.findAll("a", {"class": "item-title"})
    product_name = title_container[0].text
    price_container = container.findAll("li", {"class": "price-current"})
    product_price = price_container[0].text.strip()
print("brand: ", brand)
print("product_name: " + product_name)
print("product_price: " + product_price)
f.write(str(brand) + "," + product_name.replace(",", "|") + "," + product_price + "\n")
f.close()
Solution
Your code runs fine. You only need to move the print and write calls inside your loop:
for container in containers:
    brand = container.findAll("a", {"class": "title"})
    title_container = container.findAll("a", {"class": "item-title"})
    product_name = title_container[0].text
    price_container = container.findAll("li", {"class": "price-current"})
    product_price = price_container[0].text.strip()
    # these code lines have to be in your for loop!
    print("brand: ", brand)
    print("product_name: " + product_name)
    print("product_price: " + product_price)
    f.write(str(brand) + "," + product_name.replace(",", "|") + "," + product_price + "\n")
You want to print and save every item as you iterate over containers. Otherwise only the last item gets saved to your CSV.
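One further point, offered as a sketch rather than as part of the original answer: the price shown in the output (€446,99) itself contains a comma, so joining fields by hand with "," would likely split it across two columns when the file is opened in Excel. Python's standard csv module quotes such fields automatically, which also makes the replace(",", "|") workaround unnecessary. In the sketch below, the function name write_products, the sample rows (including the brand values, since the original run printed an empty list for brand) and the output path are all assumptions made for illustration.

```python
import csv
import os
import tempfile


def write_products(rows, filename):
    """Write (brand, product_name, product_price) rows to a CSV file."""
    # newline="" lets the csv module control line endings.
    # utf-8-sig writes a BOM, which helps Excel detect UTF-8 (e.g. for the euro sign).
    with open(filename, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.writer(f)
        writer.writerow(["brand", "product_name", "product_price"])
        # Fields containing commas are quoted automatically.
        writer.writerows(rows)


# Hypothetical rows standing in for the values collected inside the loop.
sample_rows = [
    ("ASRock", "ASRock Radeon RX 5700 XT DirectX 12 RX 5700 XT TAICHI X 8G OC+ Video Card", "€446,99"),
    ("ExampleBrand", "Example Card, 8 GB", "€199,00"),
]

# Written to the temp directory here only so the sketch runs anywhere.
out_path = os.path.join(tempfile.gettempdir(), "products.csv")
write_products(sample_rows, out_path)
```

In the scraping script, one option would be to append a (brand, product_name, product_price) tuple to a list inside the for loop and call write_products once after the loop finishes. The with block closes the file even if an error occurs partway through.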