Converting dict values that hold lists of different lengths into one list, then adding that list to a DataFrame

Problem description

My question is a follow-up to a question I asked last week.

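I have data given as lists of dicts. Each dict has a 'values' key holding a list, and those lists have different lengths. The data is stored in a pandas DataFrame named df_sim, in the column rrintervals: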

startedat                   rrintervals
0   2020-02-27 15:06:35     [{'values': [727.0]}, {'values': [693.0, 688.0...
1   2020-02-27 15:06:22     [{'values': [1067.0]}, {'values': [921.0]}, {'...
2   2020-02-27 15:36:52     [{'values': [776.0]}, {'values': [826.0, 938.0..

IN:
print(df_sim.loc[0, "rrintervals"])

OUT:
[{'values': [727.0]}, {'values': [693.0, 688.0]}, {'values': [689.0]}, {'values': [699.0]}]

For each row, I want to put all the dict values from the rrintervals column into a single list, and store that list in a new column of df_sim named rr_list:

startedat                   rrintervals                                           rr_list
0   2020-02-27 15:06:35     [{'values': [727.0]}, {'values': [693.0, 688.0...     [727.0, 693.0, 688.0...]
1   2020-02-27 15:06:22     [{'values': [1067.0]}, {'values': [921.0]}, {'...     [1067.0, 921.0...]
2   2020-02-27 15:36:52     [{'values': [776.0]}, {'values': [826.0, 938.0..      [776.0, 826.0, 938.0...]

IN:
print(df_sim.loc[0, "rr_list"])

OUT:
[727.0, 693.0, 688.0, 689.0, 699.0]
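In plain Python terms, what is wanted for a single cell is a flattening of its 'values' lists. A minimal sketch for the first row, using the data printed above and no pandas at all (the names row0 and rr_list0 are only illustrative):

```python
from itertools import chain

# Cell content of df_sim.loc[0, "rrintervals"], copied from the output above.
row0 = [{'values': [727.0]}, {'values': [693.0, 688.0]},
        {'values': [689.0]}, {'values': [699.0]}]

# Chain the 'values' lists of all dicts into one flat list.
rr_list0 = list(chain.from_iterable(d['values'] for d in row0))

print(rr_list0)  # [727.0, 693.0, 688.0, 689.0, 699.0]
```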

I tried to apply the top answer to my previous question, which suggested using a list comprehension:

for i in df_sim.index:
    df_sim.loc[i, "rr_list"] = [val for sub_dict in df_sim.loc[i, "rrintervals"] for val in sub_dict['values']]

But I keep getting a ValueError:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-152-c50bd1585f57> in <module>
      1 for i in df_sim.index:
----> 2     df_sim.loc[i, "rr_list"] = [val for sub_dict in df_sim.loc[i, "rrintervals"] for val in sub_dict['values']]

~/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in __setitem__(self, key, value)
    668             key = com.apply_if_callable(key, self.obj)
    669         indexer = self._get_setitem_indexer(key)
--> 670         self._setitem_with_indexer(indexer, value)
    671 
    672     def _validate_key(self, key, axis: int):

~/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in _setitem_with_indexer(self, indexer, value)
   1015                     if len(labels) != len(value):
   1016                         raise ValueError(
-> 1017                             "Must have equal len keys and value "
   1018                             "when setting with an iterable"
   1019                         )

ValueError: Must have equal len keys and value when setting with an iterable

Tags: python, dictionary, list-comprehension

Solution


The list comprehension in your solution looks fine.
If you want a one-liner:

import pandas as pd
df_sim['rr_list'] = df_sim['rrintervals'].apply(lambda x: pd.DataFrame.from_records(x).sum())
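Judging by its message, the ValueError in the question seems to come from the assignment rather than from the list comprehension: when .loc is given a list on the right-hand side, pandas tries to distribute its elements over the selected positions instead of storing the list as a single cell value. One way around that, sketched below, is to build the whole column in one step with Series.apply and a small helper. The helper name flatten_rr is made up for this example. Row 0 of the sample frame is copied from the question; row 1 keeps only the two dicts that are visible before the question's output is cut off.

```python
import pandas as pd

# Sample frame. Row 0 is taken from the question; row 1 contains only the
# two dicts visible in the question's (truncated) output.
df_sim = pd.DataFrame({
    "startedat": pd.to_datetime(["2020-02-27 15:06:35", "2020-02-27 15:06:22"]),
    "rrintervals": [
        [{"values": [727.0]}, {"values": [693.0, 688.0]},
         {"values": [689.0]}, {"values": [699.0]}],
        [{"values": [1067.0]}, {"values": [921.0]}],
    ],
})


def flatten_rr(records):
    """Concatenate the 'values' lists of every dict in one cell (hypothetical helper)."""
    return [val for sub_dict in records for val in sub_dict["values"]]


# Assign the whole column at once: apply returns one flat list per row,
# so no per-cell .loc assignment of a list is needed.
df_sim["rr_list"] = df_sim["rrintervals"].apply(flatten_rr)

print(df_sim.loc[0, "rr_list"])  # [727.0, 693.0, 688.0, 689.0, 699.0]
```

This keeps the same list comprehension as in the question and only changes how the result is written back, so the rows may hold lists of any length.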
