How to dynamically append columns to a .csv file?

Problem description

Suppose we have the following CSV file:

file1.csv

#groups id  owner
abc id1 owner1
abc id2 owner1
bcx id1 owner2
cpa id3 owner1

The following script reads file1.csv, filters on the first column (#groups), and adds extra characters to it:

#!/usr/bin/env python2

import re
import csv
print "enter Path to original file"
GROUPS = raw_input() 
print "enter Path to modified file"
WORKING = raw_input() 

def filter_lines(f):
    """this generator function uses a regular expression
    to include only lines that have an `abc` at the start
    and NO `gep` anywhere in the rest of the record
    """
    # (?!.*gep) rejects `gep` anywhere after `abc`, not only directly after it
    filter_regex = r'^abc(?!.*gep)'
    for line in f:
        line = line.strip()
        m = re.match(filter_regex, line)
        if m:
            yield line           

# matches abc at the start of records that don't contain gep, so gep can be inserted
pat = re.compile(r'^(abc)(?!.*gep)')

#insert gep 
variable1 = 0  

with open(GROUPS, 'r') as f: 
    with open(WORKING, 'w') as data:
        #next(f)  # Skip over header in input file.
        #filter
        filter_generator = filter_lines(f)
        csv_reader = csv.reader(filter_generator)
        count = 0
        writer = csv.writer(data) #, quoting=csv.QUOTE_ALL
        for row in csv_reader:
            count += 1
            variable1 = pat.sub(r'\1_gep', row[0]) #modify all filtered records: abc -> abc_gep
            fields = [variable1]
            writer.writerow(fields)

print 'Filtered (abc at Start and NO gep) Rows Count = ' + str(count)

For example, abc becomes abc_gep, and we write it to another CSV file, file2.csv.

So file2.csv now contains only:

abc_gep
abc_gep

So far, so good.

Now I want to add the remaining columns of the rows in file1.csv that matched abc.

How can I do that?

I tried the following:

fields = [variable1,row[1],row[2]]

But this hardcodes the columns instead of being dynamic. I am looking for something more like this:

fields = [variable1, row[i]]
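
The `row[i]` placeholder above can be written with list slicing: `row[1:]` is every column after the first, however many there are. A minimal sketch follows; `build_fields` is a hypothetical helper name used only for illustration, not part of the original script.

```python
def build_fields(first, row):
    """Return the modified first column followed by all remaining columns."""
    # row[1:] is the (possibly empty) list of every column after the first,
    # so the number of columns is never hardcoded.
    return [first] + row[1:]
```

In the script above this would be used as `fields = build_fields(variable1, row)`, or simply `fields = [variable1] + row[1:]`. It only helps if `row` really holds one element per column.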

Essentially, this is the result I am looking for in file2.csv:

abc_gep id1 owner1
abc_gep id2 owner1

Tags: python-2.6

Solution
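
One possible approach, offered as a sketch and not a definitive answer: split each line into columns, test and rewrite only the first column, then append the remaining columns with a slice. It assumes the columns are separated by whitespace, as file1.csv is shown above. If the real file is comma-separated, `csv.reader` with its default delimiter would replace `line.split()`. The names `PAT`, `transform_rows` and `convert` are hypothetical, and the replacement `\1_gep` is chosen to match the `abc_gep` output described in the question.

```python
import csv
import re

# Matches a first column that starts with "abc" and contains no "gep".
PAT = re.compile(r'^(abc)(?!.*gep)')

def transform_rows(lines):
    """Yield [modified first column, remaining columns...] for matching rows."""
    for line in lines:
        row = line.split()  # assumption: whitespace-separated columns
        if not row or not PAT.match(row[0]):
            continue  # skips blank lines, the #groups header and non-abc rows
        # Rewrite only the first column; row[1:] carries over every other
        # column without hardcoding how many there are.
        yield [PAT.sub(r'\1_gep', row[0])] + row[1:]

def convert(src_path, dst_path):
    """Write the transformed rows to dst_path and return how many were written."""
    count = 0
    with open(src_path, 'r') as src:
        with open(dst_path, 'w') as dst:
            # Space-delimited output, to mirror the layout of the input file.
            writer = csv.writer(dst, delimiter=' ', lineterminator='\n')
            for fields in transform_rows(src):
                writer.writerow(fields)
                count += 1
    return count
```

Calling `convert(GROUPS, WORKING)` would stand in for the nested `with` block of the original script. The nested `with` statements and the absence of `print` keep the sketch usable on Python 2.6 as well as Python 3.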

