How to join 2 CSV files in Python without using pandas

Problem description

One CSV file has the following columns:

count,duration,items,id

1,na,na,123
2,na,na,456
3,na,na,789

The other CSV file contains:

xyz_id,xyz_images

123,1
123,2
123,3
123,4
123,56
123,7
123,8
456,9
456,12
456,23

The constraint is that I cannot use pandas, so how can I join these 2 CSV files?

The desired output is:

xyz_id,xyz_images,count,duration,items,id

123,1,1,na,na,123
123,2,1,na,na,123
123,3,1,na,na,123
123,4,1,na,na,123
123,56,1,na,na,123
123,7,1,na,na,123
123,8,1,na,na,123
456,9,2,na,na,456
456,12,2,na,na,456
456,23,2,na,na,456

The goal is to join the two CSVs on their id columns and combine them into a single file.

import csv
import sys

path1 = '/home/user/Downloads/FW__Json_FIles/withoutpanda.csv'
path2 = '/home/user/Downloads/FW__Json_FIles/forms.csv'

# newline='' lets the csv module handle line endings itself.
with open(path1, newline='') as f, open(path2, newline='') as csvfile1:
    # The delimiter must match the actual files.
    reader1 = csv.reader(f, delimiter='|')
    reader2 = csv.reader(csvfile1, delimiter='|')
    try:
        for row1 in reader1:
            print(row1[0])
        for row2 in reader2:
            print(row2[3])
    except csv.Error as e:
        sys.exit('CSV error: {}'.format(e))

Beyond this point, I can't work out how to join the two files using row1[0] and row2[3] as the keys.

Tags: python, csv, join

Solution


You can build a list of lists for each CSV and perform the join manually with for loops:

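records1 = []
with open('csvfile1', 'r') as f:
    for line in f:
        line = line.strip()  # drop the trailing newline so the last field compares cleanly
        if line:             # skip blank lines
            records1.append(line.split(','))

records2 = []
with open('csvfile2', 'r') as f:
    for line in f:
        line = line.strip()
        if line:
            records2.append(line.split(','))

# Nested-loop join: the header rows never match because 'id' != 'xyz_id'.
for (count, duration, items, id_) in records1:
    for (xyz_id, xyz_images) in records2:
        if id_ == xyz_id:
            print(xyz_id, xyz_images, count, duration, items, id_, sep=',')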

This prints:

123,1,1,na,na,123
123,2,1,na,na,123
123,3,1,na,na,123
123,4,1,na,na,123
123,56,1,na,na,123
123,7,1,na,na,123
123,8,1,na,na,123
456,9,2,na,na,456
456,12,2,na,na,456
456,23,2,na,na,456

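If the number of rows is large and performance becomes a concern, consider indexing the data into a dictionary of lists and replacing the inner for loop with a dictionary lookup.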
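The dictionary-of-lists idea could look roughly like the sketch below. The helper names `index_by_key` and `join_rows` are made up for illustration, and the sample rows are a shortened copy of the data in the question with the header rows left out.

```python
from collections import defaultdict

# Sample rows from the question, already split into fields (headers omitted).
records1 = [
    ["1", "na", "na", "123"],
    ["2", "na", "na", "456"],
    ["3", "na", "na", "789"],
]
records2 = [["123", "1"], ["123", "2"], ["456", "9"]]


def index_by_key(rows, key_pos):
    """Group rows into a dict of lists keyed by the field at key_pos (hypothetical helper)."""
    index = defaultdict(list)
    for row in rows:
        index[row[key_pos]].append(row)
    return index


def join_rows(left, right, left_key_pos, right_key_pos):
    """Inner-join two row lists; the right row's columns come first in each result."""
    right_index = index_by_key(right, right_key_pos)
    joined = []
    for left_row in left:
        # One dictionary lookup replaces the inner for loop.
        for right_row in right_index.get(left_row[left_key_pos], []):
            joined.append(right_row + left_row)
    return joined


joined = join_rows(records1, records2, left_key_pos=3, right_key_pos=0)
for row in joined:
    print(",".join(row))
```

Building the index takes one pass over the second file, so the join cost grows with the sum of the two row counts plus the number of matches, rather than with their product.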


If you need to write all of these columns to a CSV file, do the following:

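import csv
import sys

# records1 and records2 are the lists built above; the output path is the first command-line argument.
with open(sys.argv[1], "w", newline="") as of:
    writer = csv.writer(of, delimiter='|')
    for (count, duration, items, id_) in records1:
        for (xyz_id, xyz_images) in records2:
            if id_ == xyz_id:
                writer.writerow([xyz_id, xyz_images, count, duration, items, id_])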
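Since the desired output in the question starts with a header row, one possible variation is to read both files with `csv.DictReader` and write with `csv.DictWriter`. The sketch below is not the answer's method. It assumes comma-separated input, key columns named `id` and `xyz_id`, non-empty files, and no column name shared by both files. The function name `join_csv` is hypothetical, and `io.StringIO` stands in for real files so the example is self-contained.

```python
import csv
import io


def join_csv(left_file, right_file, out_file, left_key="id", right_key="xyz_id"):
    """Inner-join two CSV streams on their key columns and write the result with a header."""
    left_reader = csv.DictReader(left_file)
    right_reader = csv.DictReader(right_file)

    # Index the left rows by key so each right row needs a single lookup.
    left_by_key = {}
    for row in left_reader:
        left_by_key.setdefault(row[left_key], []).append(row)

    # Right-hand columns first, matching the column order requested in the question.
    fieldnames = right_reader.fieldnames + left_reader.fieldnames
    writer = csv.DictWriter(out_file, fieldnames=fieldnames)
    writer.writeheader()
    for right_row in right_reader:
        for left_row in left_by_key.get(right_row[right_key], []):
            writer.writerow({**right_row, **left_row})


# In-memory stand-ins for the two files (shortened sample data from the question).
left = io.StringIO("count,duration,items,id\n1,na,na,123\n2,na,na,456\n3,na,na,789\n")
right = io.StringIO("xyz_id,xyz_images\n123,1\n123,2\n456,9\n")
out = io.StringIO()
join_csv(left, right, out)
print(out.getvalue())
```

With real files, the three arguments would be file objects opened with `newline=""`, as the `csv` module documentation recommends.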
