Writing to a file with a Scrapy pipeline

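Problem description

I am trying to write items to a file from Scrapy's pipelines.py. The items are parsed correctly and show up in the terminal when I run the spider. Here is my pipelines.py: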

import datetime,csv

class AmazonfullPipeline(object):
    keys = ["Product_Name","Price","Amazon_Stock","rating","ASIN","Rank1","Rank1_category","Rank2","Rank2_category",
    "UPC","Item_Model_Number"]

    def __init__(self):
        now = datetime.datetime.now()
        current_date = now.strftime("%d%b")
        file_name = "TestFile"
        infile = open("{}_{}.csv".format(current_date,file_name),"w").close()
        dict_writer = csv.DictWriter(infile, self.keys)
        dict_writer.writeheader()
    def process_item(self, item, spider):
        self.dict_writer.writerow(item)

Error message:

dict_writer = csv.DictWriter(infile, self.keys)
  File "/usr/lib/python3.6/csv.py", line 140, in __init__
    self.writer = writer(f, dialect, *args, **kwds)
TypeError: argument 1 must have a "write" method

Tags: python, scrapy

Solution


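There are two problems:

  1. You close the file before using it. open(...).close() returns None, so infile is None, and that is why csv.DictWriter raises the TypeError;
  2. You never set an instance attribute. In __init__, assign to self.dict_writer rather than to the local variable dict_writer, otherwise process_item cannot find self.dict_writer.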
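
The first point can be reproduced outside Scrapy. This is a minimal sketch, assuming a throwaway file in a temporary directory (the path and the field names "a" and "b" are made up for illustration):

```python
import csv
import os
import tempfile

# Hypothetical scratch file, only used for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.csv")

# open(...) returns a file object, but .close() returns None,
# so this name ends up bound to None rather than to a file.
closed_result = open(path, "w").close()

try:
    # Passing None where a writable file is expected fails,
    # which is the same failure as in the traceback above.
    csv.DictWriter(closed_result, ["a", "b"])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(closed_result)
print(error_message)
```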

Corrected code:

import datetime,csv

class AmazonfullPipeline(object):
    keys = ["Product_Name","Price","Amazon_Stock","rating","ASIN","Rank1","Rank1_category","Rank2","Rank2_category",
    "UPC","Item_Model_Number"]

    def __init__(self):
        now = datetime.datetime.now()
        current_date = now.strftime("%d%b")
        file_name = "TestFile"
        infile = open("{}_{}.csv".format(current_date,file_name),"w")  # <- remove close() here
        self.dict_writer = csv.DictWriter(infile, self.keys)  # <- add self. here
        self.dict_writer.writeheader()  # <- add self. here

    def process_item(self, item, spider):
        self.dict_writer.writerow(item)
        return item  # pass the item on to any later pipeline
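
The corrected code never closes the file explicitly. One possible variant, sketched here under some assumptions, opens the file in open_spider and closes it in close_spider, two optional hooks that Scrapy calls on item pipelines. The class name CsvExportPipeline, the directory parameter and the path attribute are hypothetical names introduced for this sketch. It also assumes each item can be converted with dict(item) and has no fields beyond those in keys (DictWriter raises ValueError for unknown fields by default).

```python
import csv
import datetime
import os


class CsvExportPipeline:
    # Same column names as in the question.
    keys = ["Product_Name", "Price", "Amazon_Stock", "rating", "ASIN",
            "Rank1", "Rank1_category", "Rank2", "Rank2_category",
            "UPC", "Item_Model_Number"]

    def __init__(self, directory="."):
        # "directory" is an illustrative parameter; Scrapy itself
        # instantiates the pipeline without arguments.
        current_date = datetime.datetime.now().strftime("%d%b")
        self.path = os.path.join(directory, "{}_TestFile.csv".format(current_date))
        self.infile = None
        self.dict_writer = None

    def open_spider(self, spider):
        # newline="" is what the csv module recommends for files it writes to.
        self.infile = open(self.path, "w", newline="")
        self.dict_writer = csv.DictWriter(self.infile, self.keys)
        self.dict_writer.writeheader()

    def process_item(self, item, spider):
        self.dict_writer.writerow(dict(item))
        return item  # keep the item available to later pipelines

    def close_spider(self, spider):
        # Closing flushes any buffered rows to disk.
        self.infile.close()
```

Keeping the file handle on self is what makes it possible to close it later; in the shorter version above the handle is only a local variable in __init__.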
