Scraping does not export to Excel

Problem description

This is my first attempt at scraping information from a website and exporting it to an Excel file. However, not all of the information is scraped, and the export file is not created.

This is what I get in Anaconda:

(base) C:\Windows\system32>firstwebscrape.py
brand:  []
product_name: ASRock Radeon RX 5700 XT DirectX 12 RX 5700 XT TAICHI X 8G OC+ Video Card
product_price: €446,99 

Here is the code:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'https://www.newegg.com/global/lt-en/Video-Cards-Video-Devices/Category/ID-38?Tpk=graphic%20card'

#opening up the connection grabbing the page
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

#HTML parser
page_soup = soup(page_html, "html.parser")

#grabs all containers
containers = page_soup.findAll("div",{"class":"item-container"})

filename= "123.csv"
f = open(filename, "w")

headers = "brand, product_name, product_price\n"

f.write(headers)

for container in containers:
    brand = container.findAll("a",{"class":"title"})

    title_container = container.findAll("a",{"class":"item-title"})
    product_name = title_container[0].text

    price_container = container.findAll("li",{"class":"price-current"})
    product_price = price_container[0].text.strip()

print("brand: ", brand)
print("product_name: " + product_name)
print("product_price: " + product_price)

f.write(str(brand) + "," + product_name.replace(",", "|") + "," + product_price + "\n")

f.close()

Tags: python, python-3.x, web-scraping, beautifulsoup

Solution


Your code runs fine; the only problem is that the print and write statements are outside the for loop. Move them inside it:

for container in containers:
    brand = container.findAll("a",{"class":"title"})

    title_container = container.findAll("a",{"class":"item-title"})
    product_name = title_container[0].text

    price_container = container.findAll("li",{"class":"price-current"})
    product_price = price_container[0].text.strip()

    # these lines have to be inside your for loop!
    print("brand: ", brand)
    print("product_name: " + product_name)
    print("product_price: " + product_price)

    f.write(str(brand) + "," + product_name.replace(",", "|") + "," + product_price + "\n")

You want to print and save every item as you iterate over containers. Otherwise, only the last item is printed and saved to your CSV.
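
As a rough illustration of the same idea, here is a minimal sketch that writes one row per item inside the loop using Python's standard csv module. It is not the original poster's code: the scraped_items list and the write_rows helper are hypothetical stand-ins for the values extracted from each container, and the sample data is made up. One possible advantage of csv.writer is that it quotes fields containing commas, so the replace(",", "|") workaround may not be needed.

```python
import csv
import io

# Hypothetical stand-in for the values scraped from each container
# (made-up sample data, not real scrape results).
scraped_items = [
    {"brand": "SampleBrand", "product_name": "Sample Card A, 8G", "product_price": "€446,99"},
    {"brand": "OtherBrand", "product_name": "Sample Card B", "product_price": "€100,00"},
]


def write_rows(items, out):
    """Write a header plus one CSV row per item to the file-like object `out`."""
    writer = csv.writer(out)
    writer.writerow(["brand", "product_name", "product_price"])
    for item in items:
        # The write happens inside the loop, so every item is saved,
        # not just the last one.
        writer.writerow([item["brand"], item["product_name"], item["product_price"]])


# Write to an in-memory buffer here so the sketch runs without touching disk.
buffer = io.StringIO()
write_rows(scraped_items, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, the same helper could be called as write_rows(scraped_items, f) inside a with open("123.csv", "w", newline="", encoding="utf-8") as f: block, which also closes the file automatically. Note that a relative filename such as "123.csv" is created in the current working directory of the process.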

