Writing the output of a for loop to a CSV in Python

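Problem description

I am opening a CSV named Remarks_Drug.csv that contains product names and their mapped filenames in consecutive columns. I perform some operations on the product column to remove everything after the + character. After stripping the string at the + character, I store the result in a variable named product_patterns.

I then open a new CSV and want to write the output of the for loop into two columns: the first containing product_patterns and the second containing the corresponding filenames.

The output I get right now is only the last row of the output CSV I am looking for. I think I am not looping correctly, so that every row of product_patterns and filename gets appended to the output CSV file.

Could someone help me solve this?

The code is attached below: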

import csv


with open('Remarks_Drug.csv', newline='', encoding ='utf-8') as myFile:
    reader = csv.reader(myFile)
    for row in reader:
        product = row[0].lower()
        #print('K---'+ product)
        filename = row[1]
        product_patterns = ', '.join([i.split("+")[0].strip() for i in product.split(",")])


        #print(product_patterns, filename)

    with open ('drug_output100.csv', 'a') as csvfile:
        fieldnames = ['product_patterns', 'filename']
        print(fieldnames)
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        print(writer)
        #writer.writeheader()
        writer.writerow({'product_patterns':product_patterns, 'filename':filename})

Sample input:

    Film-coated tablet + TERIFLUNOMIDE, 2011-07-18 - Received approval letter_EN.txt
    Film-coated tablet + VANDETANIB,             2013-12-14 RECD Eudralink_Caprelsa II-28 - RSI - 14.12.2017.txt
    Solution for injection + MenQuadTT, 395_EU001930-PIP01-16_2016-02-22.txt
    Solution for injection + INSULIN GLARGINE,  2017-11-4 Updated PR.txt
    Solution for injection + INSULIN GLARGINE + LIXISENATIDE,   2017 12 12 Email Approval Texts - SA1006-.txt

Tags: python, string, csv, for-loop, export-to-csv

Solution


import csv
import pandas as pd

with open('Remarks_Drug.csv', newline='', encoding='utf-8') as myFile:
    reader = csv.reader(myFile)
    mydrug = []
    for row in reader:
        product = row[0].lower()
        filename = row[1]
        # Keep only the text before the first "+" in each comma-separated part
        product_patterns = ', '.join([i.split("+")[0].strip() for i in product.split(",")])
        # Collect every row inside the loop instead of keeping only the last one
        mydrug.append([product_patterns, filename])

# Build the DataFrame once all rows are collected, then write it out in one go
df = pd.DataFrame(mydrug, columns=['product_patterns', 'filename'])
print(df)
df.to_csv('drug_output100.csv', sep=',', index=False)

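This uses the pandas library. If you are working with large CSV files, pandas is convenient and generally performs well, although it holds the whole table in memory. This is only an alternative solution to the problem above.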
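
For comparison, here is a minimal sketch of the same fix using only the standard csv module. In the question's code, the writerow call sits after the for loop has finished, so only the values from the last iteration are written. Moving the write inside the loop emits one output row per input row. The helper names strip_after_plus and convert are hypothetical, and stripping whitespace from the filename is an assumption that the original code does not make.

```python
import csv


def strip_after_plus(product):
    """Keep only the text before the first '+' in each comma-separated part."""
    return ', '.join(part.split('+')[0].strip()
                     for part in product.lower().split(','))


def convert(in_path, out_path):
    """Read (product, filename) rows and write one output row per input row."""
    with open(in_path, newline='', encoding='utf-8') as src, \
         open(out_path, 'w', newline='', encoding='utf-8') as dst:
        reader = csv.reader(src)
        writer = csv.DictWriter(dst, fieldnames=['product_patterns', 'filename'])
        writer.writeheader()
        for row in reader:
            if len(row) < 2:
                continue  # skip blank or malformed rows
            # The write happens inside the loop, so no row is lost.
            writer.writerow({
                'product_patterns': strip_after_plus(row[0]),
                # Assumption: leading/trailing spaces in the filename are unwanted.
                'filename': row[1].strip(),
            })
```

Calling convert('Remarks_Drug.csv', 'drug_output100.csv') would process the file from the question. The output file is opened once in 'w' mode here rather than 'a', so rerunning the script replaces the previous output instead of appending duplicate rows.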

