python - Replacing and deleting columns in a CSV using Python
Problem description
Here is the code I am writing:
import csv
import openpyxl

def read_file(fn):
    rows = []
    with open(fn) as f:
        reader = csv.reader(f, quotechar='"', delimiter=",")
        for row in reader:
            if row:
                rows.append(row)
    return rows

replace = {x[0]: x[1:] for x in read_file("replace.csv")}
delete = set((row[0] for row in read_file("delete.csv")))

result = []
input_file = "input.csv"
with open(input_file) as f:
    reader = csv.reader(f, quotechar='"')
    for row in reader:
        if row:
            if row[7] in delete:
                continue
            elif row[7] in replace:
                result.append(replace[row[7]])
            else:
                result.append(row)

with open("done.csv", "w+", newline="") as f:
    w = csv.writer(f, quotechar='"', delimiter=",")
    w.writerows(result)
Here are my files:
input.csv:
c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12,c13
"-","-","-","-","-","-","-","aaaaa","-","-","bbbbb","-",","
"-","-","-","-","-","-","-","ccccc","-","-","ddddd","-",","
"-","-","-","-","-","-","-","eeeee","-","-","fffff","-",","
This is a 13-column CSV. I am only interested in the 8th and 11th fields.
Here is my replace.csv:
"aaaaa","11111","22222"
delete.csv:
ccccc
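What I am doing is comparing the first column of replace.csv (row by row) with the 8th column of input.csv. If they match, the 8th column of input.csv is replaced with the second column of replace.csv, and the 11th column of input.csv is replaced with the third column of replace.csv. For delete.csv, the two files are compared row by row, and if a match is found the entire row is deleted. If a row does not appear in either replace.csv or delete.csv, it is printed as is. So the output I want is: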
c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12,c13
"-","-","-","-","-","-","-",11111,"-","-",22222,"-",","
"-","-","-","-","-","-","-","eeeee","-","-","fffff","-",","
But when I run this code, it gives me this output:
c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12,c13
11111,22222
Where am I going wrong? I am trying to adapt the program from a question I posted earlier, because the input file has changed. https://stackoverflow.com/a/54388144/9279313
Solution
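For reference, the two lookups built from replace.csv and delete.csv can be inspected in isolation. The following is only a minimal sketch: it uses io.StringIO in place of the real files, and the names KEY_COL and SECOND_COL are illustrative and not part of the original code.

```python
import csv
import io

# Sample contents copied from the question; io.StringIO stands in for the real files.
replace_csv = '"aaaaa","11111","22222"\n'
delete_csv = "ccccc\n"

# Same construction as in the question: first column is the key,
# the remaining columns are the replacement values.
replace = {row[0]: row[1:] for row in csv.reader(io.StringIO(replace_csv)) if row}
delete = {row[0] for row in csv.reader(io.StringIO(delete_csv)) if row}

KEY_COL = 7      # 8th field of input.csv, zero-based (hypothetical name)
SECOND_COL = 10  # 11th field of input.csv, zero-based (hypothetical name)

print(replace)  # {'aaaaa': ['11111', '22222']}
print(delete)   # {'ccccc'}
```

Each value in replace is a list holding only the two replacement strings, not a full 13-column row.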
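@anuj I think SafeDev's solution is optimal, but if you don't want to go with pandas, just make a few small changes to your code. The original loop appends replace[row[7]], which holds only the two replacement values, so the rest of the row is lost. Update the two fields in place and append the whole row instead: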
for row in reader:
    if row:
        if row[7] in delete:
            continue
        elif row[7] in replace:
            key = row[7]
            row[7] = replace[key][0]
            row[10] = replace[key][1]
            result.append(row)
        else:
            result.append(row)
Hope this solves your issue.
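Put together, the whole program could look roughly like the sketch below. The function names read_rows, apply_rules, and process, and the key_col and second_col parameters, are illustrative and not from the original post. The sketch assumes every usable row in replace.csv has at least three columns, and it passes through unchanged any input row too short to have an 11th field.

```python
import csv


def read_rows(path):
    """Return all non-empty rows of a CSV file as lists of strings."""
    with open(path, newline="") as f:
        return [row for row in csv.reader(f) if row]


def apply_rules(rows, replace, delete, key_col=7, second_col=10):
    """Drop rows whose key is in `delete`; patch two fields for keys in `replace`."""
    result = []
    for row in rows:
        if len(row) <= max(key_col, second_col):
            result.append(row)  # too short to inspect, keep as is
            continue
        key = row[key_col]
        if key in delete:
            continue  # skip the entire row
        if key in replace:
            row = list(row)  # copy so the caller's row is not mutated
            row[key_col], row[second_col] = replace[key]
        result.append(row)
    return result


def process(input_path, replace_path, delete_path, output_path):
    """Read the three CSV files, apply the rules, and write the result."""
    # Keep exactly two replacement values per key (columns 2 and 3 of replace.csv).
    replace = {r[0]: r[1:3] for r in read_rows(replace_path) if len(r) >= 3}
    delete = {r[0] for r in read_rows(delete_path)}
    result = apply_rules(read_rows(input_path), replace, delete)
    with open(output_path, "w", newline="") as f:
        csv.writer(f).writerows(result)
    return result


# Example call with the file names from the question:
# process("input.csv", "replace.csv", "delete.csv", "done.csv")
```

With the default csv.writer settings, only fields that need quoting (such as the "," in the last column) are quoted on output, so the quoting will not match input.csv exactly. Passing quoting=csv.QUOTE_ALL to csv.writer quotes every field instead.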