python - Evaluating duplicate elements in CSV files with Python
Problem description
I have 2 CSV files:
CSV 1:
CHANNEL
3
3
4
1
2
1
4
5
CSV 2:
CHANNEL
1
2
2
3
4
4
4
5
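I want to evaluate the state of each channel by looking for duplicated channels. If a channel occurs more than once in a file (count > 1), its state is 0; otherwise its state is 1.
Output CSV: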
index channel 1 channel 2 channel 3 channel 4 channel 5
1 0 1 0 0 0
2 1 0 1 0 1
So far I have counted the duplicated channels, but only for 1 file. I don't know how to read 2 CSV files and create the output file.
import csv
import collections

with open("csvfile.csv") as f:
    csv_data = csv.reader(f, delimiter=",")
    next(csv_data)  # skip the header row
    count = collections.Counter()
    for row in csv_data:
        channel = row[0]
        count[channel] += 1

for channel, nb in count.items():
    if nb > 1:
        pass  # channel is duplicated; its state should become 0 here
Solution
You can read each file into a list and then check how often each channel occurs in each list.
Try this code:
ss1 = '''
CHANNEL
3
3
4
1
2
1
4
5
'''.strip()
ss2 = '''
CHANNEL
1
2
2
3
4
4
4
5
'''.strip()
with open("csvfile1.csv",'w') as f: f.write(ss1) # write test file 1
with open("csvfile2.csv",'w') as f: f.write(ss2) # write test file 2
#############################
with open("csvfile1.csv") as f:
    lines1 = f.readlines()[1:]  # skip header
    lines1 = [int(x) for x in lines1]  # convert to ints
with open("csvfile2.csv") as f:
    lines2 = f.readlines()[1:]  # skip header
    lines2 = [int(x) for x in lines2]  # convert to ints
lines = [lines1, lines2]  # make list for iteration
state = [[0]*5, [0]*5]  # default zero for each state
for ci in [0, 1]:  # each file
    for ch in range(5):  # each channel
        state[ci][ch] = 0 if lines[ci].count(ch+1) > 1 else 1  # check channel count, set state
# write to terminal
print('Index','Channel 1','Channel 2','Channel 3','Channel 4','Channel 5', sep = ' ')
print(' ',1,' ',' '.join(str(c) for c in state[0]))
print(' ',2,' ',' '.join(str(c) for c in state[1]))
# write to csv
with open('state.csv', 'w') as f:
    f.write('Index,Channel 1,Channel 2,Channel 3,Channel 4,Channel 5\n')
    f.write('1,' + ','.join(str(c) for c in state[0]) + '\n')
    f.write('2,' + ','.join(str(c) for c in state[1]) + '\n')
Output (terminal)
Index Channel 1 Channel 2 Channel 3 Channel 4 Channel 5
1 0 1 0 0 1
2 1 0 1 0 1
Output (state.csv)
Index,Channel 1,Channel 2,Channel 3,Channel 4,Channel 5
1,0,1,0,0,1
2,1,0,1,0,1
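The code above hard-codes two files and five channels. As a rough sketch of how the same rule could be generalized, the variant below builds on the `csv` module and `collections.Counter` that the question already uses. The helper names `channel_states` and `write_state_csv` are made up for this illustration, and it assumes each input file has a single header row followed by one integer channel per row.

```python
import csv
from collections import Counter


def channel_states(path, channels):
    """Return {channel: state} for one CSV file with a single CHANNEL column.

    State is 0 when the channel occurs more than once, otherwise 1.
    A channel that never occurs has count 0 and therefore also gets state 1.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row
        # Ignore empty rows so a trailing blank line does not break int().
        counts = Counter(int(row[0]) for row in reader if row)
    return {ch: 0 if counts[ch] > 1 else 1 for ch in channels}


def write_state_csv(in_paths, out_path, channels):
    """Write one row of states per input file, with the index starting at 1."""
    channels = list(channels)
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Index"] + [f"Channel {ch}" for ch in channels])
        for index, path in enumerate(in_paths, start=1):
            states = channel_states(path, channels)
            writer.writerow([index] + [states[ch] for ch in channels])
```

With the two test files written earlier, calling `write_state_csv(["csvfile1.csv", "csvfile2.csv"], "state.csv", range(1, 6))` should produce the same state.csv as shown above. Counting once with `Counter` avoids rescanning the list for every channel, which `list.count` does.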