Python - List values in a dict as multiple columns in a csv file

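Problem description

I have nested dictionaries that hold an inner list as the value, and I am trying to write them out to a csv file under multiple column names. The dictionary looks like this: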

{'signal': {'chest': {'ACC': array([[ 0.95539999, -0.222     , -0.55799997],
       [ 0.92579997, -0.2216    , -0.55379999],
       [ 0.90820003, -0.21960002, -0.53920001],
       ...,
       [ 0.87179995, -0.12379998, -0.30419999],
       [ 0.87300003, -0.12339997, -0.30260003],
       [ 0.87020004, -0.12199998, -0.30220002]]), 'ECG': array([[ 0.02142334],
       [ 0.02032471],
       [ 0.01652527],
       ...,

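I have written code to flatten the dictionary so that each header is: signal_chest_ACC, signal_chest_ECG, and so on. It is ugly, though.

I am trying to process each list's values so that they appear under their own column. However, all the values are output in a single column instead of under the appropriate key. How can I access each index of the arrays and write them out as separate rows of the csv file, so that each key (column header) has the appropriate list of values beneath it?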

#!/usr/bin/env python2

import sys
import numpy
import cPickle
import pandas as pd
import csv
import itertools

#numpy.set_printoptions(threshold=sys.maxsize)
with (open('S2.pkl', 'rb')) as openfile:
    data = cPickle.load(openfile)

    for key, value in data['signal'].items():
        data['signal_{}'.format(key)] = value
    del data['signal']

    for key, value in data['signal_wrist'].items():
        data['signal_wrist_{}'.format(key)] = value
    del data['signal_wrist']

    for key, value in data['signal_chest'].items():
        data['signal_chest_{}'.format(key)] = value
    del data['signal_chest']

    keys = sorted(data.keys())

    with open('out-testx.csv', 'wb') as csv_file:
        w = csv.writer(csv_file, delimiter = "\t")
        w.writerow(keys)
        for key in keys:
            for item in data[key]:
                w.writerow([item])

Sample output:

signal_chest_ACC    signal_chest_ECG    ...
[ 0.95539999, -0.222     , -0.55799997]
[ 0.92579997, -0.2216    , -0.55379999]
[ 0.90820003, -0.21960002, -0.53920001]
...
[ 0.02142334]
[ 0.02142334]
[ 0.01652527]
...

Desired output:

signal_chest_ACC    signal_chest_ECG    ...
[ 0.95539999, -0.222     , -0.55799997]    [ 0.02142334]
[ 0.92579997, -0.2216    , -0.55379999]    [ 0.02142334]
[ 0.90820003, -0.21960002, -0.53920001]    [ 0.01652527]
...

Tags: python, list, csv, dictionary

Solution


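As I understand it, you want to convert a nested dictionary of different ECG and ACC measurements into a flat csv table. The ACC and ECG measurements are ordered lists of corresponding values. I started from the code you provided, made a few changes, and added comments explaining each step.

Note that you can get the same result in many different ways. I chose something close to what you started with, and I did not try to write efficient or Pythonic code, so that the answer stays clear. There are better and cleaner ways to get the same result, but here I am trying to maximize clarity.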

import cPickle
import csv

def fun():
    # Load the pickled file (Python 2) and bind its contents to data
    with open('S2.pkl', 'rb') as openfile:
        data = cPickle.load(openfile)

    # Flatten the nested dictionary (depth 2) into a new dict, e.g.
    # data['signal']['chest']['ACC'] -> flat['signal_chest_ACC'].
    # This assumes every top-level value is a dict of dicts.
    flat = {}
    for key0, value0 in data.items():
        for key1, value1 in value0.items():
            for key2, value2 in value1.items():
                flat['{}_{}_{}'.format(key0, key1, key2)] = value2

    # Extract the flattened keys (used as the csv header)
    keys = sorted(flat.keys())

    # Turn the dictionary into a table: row i holds the i-th item of each key.
    # This assumes every key holds at least as many items as the first key.
    rows = []
    firstKey = keys[0]
    for index, value in enumerate(flat[firstKey]):
        row = []
        for key in keys:
            row.append(flat[key][index])
        rows.append(row)

    # Export the rows as a tab-separated csv file
    with open('out-testx.csv', 'wb') as csv_file:
        w = csv.writer(csv_file, delimiter="\t", lineterminator="\n")
        w.writerow(keys)
        for row in rows:
            w.writerow(row)

if __name__ == '__main__':
    fun()

Running the code (on the available data) produces:

signal_chest_ACC    signal_chest_ECG
[0.95539999, -0.222, -0.55799997]   [0.02142334]
[0.92579997, -0.2216, -0.55379999]  [0.02032471]
[0.90820003, -0.21960002, -0.53920001]  [0.02032471]
[0.87179995, -0.12379998, -0.30419999]  [0.02032471]
[0.87300003, -0.12339997, -0.30260003]  [0.02032471]
[0.87020004, -0.12199998, -0.30220002]  [0.01652527]
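
For comparison, the flattening and row-building steps can also be sketched more compactly. The following is a minimal Python 3 sketch, not the code that produced the output above: `flatten` and `write_columns` are hypothetical helper names, and the `data` dict is a small stand-in made of plain lists rather than the NumPy arrays stored in `S2.pkl`.

```python
import csv
import io


def flatten(nested, parent_key="", sep="_"):
    """Recursively flatten nested dicts into {'a_b_c': value} form."""
    flat = {}
    for key, value in nested.items():
        full_key = parent_key + sep + str(key) if parent_key else str(key)
        if isinstance(value, dict):
            # Descend into sub-dictionaries, carrying the key prefix along
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


def write_columns(flat, out_file):
    """Write each key of `flat` as one tab-separated column."""
    keys = sorted(flat)
    writer = csv.writer(out_file, delimiter="\t", lineterminator="\n")
    writer.writerow(keys)
    # zip(*columns) yields one tuple per row and stops at the shortest column
    for row in zip(*(flat[key] for key in keys)):
        writer.writerow(row)


# Hypothetical stand-in for the unpickled data (plain lists, not NumPy arrays)
data = {
    "signal": {
        "chest": {
            "ACC": [[0.9554, -0.222, -0.558], [0.9258, -0.2216, -0.5538]],
            "ECG": [[0.0214], [0.0203]],
        }
    }
}

flat = flatten(data)
buffer = io.StringIO()
write_columns(flat, buffer)
print(buffer.getvalue())
```

Because `zip` stops at the shortest column, this variant only gives complete output when every flattened key holds the same number of rows.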

Good luck with your data analysis...
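
One further variation, in case each component should land in its own scalar column (for example `signal_chest_ACC_0`, `signal_chest_ACC_1`, `signal_chest_ACC_2`) and the signals might differ in length: `itertools.zip_longest` can pad the shorter columns with empty cells. This is a Python 3 sketch under those assumptions. `expand_columns` is a hypothetical helper, the column-naming scheme is an arbitrary choice, and the sample values are stand-ins rather than data from the question.

```python
import csv
import io
from itertools import zip_longest


def expand_columns(flat):
    """Split each list of fixed-width rows into one scalar column per component."""
    columns = {}
    for key, rows in flat.items():
        width = len(rows[0])  # assumes every row under a key has the same width
        for i in range(width):
            # Single-component signals keep their name; others get an index suffix
            name = key if width == 1 else "{}_{}".format(key, i)
            columns[name] = [row[i] for row in rows]
    return columns


# Hypothetical flattened data; the ECG column is deliberately one row longer
flat = {
    "signal_chest_ACC": [[0.9554, -0.222, -0.558], [0.9258, -0.2216, -0.5538]],
    "signal_chest_ECG": [[0.0214], [0.0203], [0.0165]],
}

columns = expand_columns(flat)
keys = sorted(columns)

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerow(keys)
# zip_longest keeps going until the longest column ends, padding with ""
for row in zip_longest(*(columns[key] for key in keys), fillvalue=""):
    writer.writerow(row)
print(buffer.getvalue())
```

Whether padding with empty cells is appropriate depends on the analysis. If the signals were sampled at different rates, resampling them onto a common time base may be a better fit than writing them side by side.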

