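python - Breaking a list of dictionaries into a dictionary of common elements
Question
This problem is hard to explain in a title.
I have a huge list of dictionaries, dict_list, about 18k items long. One of the keys in each dictionary is "PROCESS". The two processes are "Etch" and "Depo"; each one repeats for a while, then switches to the other, then switches back. These stretches are called "runs".
I need to group like processes into a list until the process changes, then insert that list into a "run" dictionary. Here is a better visual explanation: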
dict_list = [{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},
{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},
{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},
{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},
{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},
{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},
{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},
{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},
{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},
{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},
{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},
{"PROCESS": "Etch"},{"PROCESS": "Etch"}]
Basically, if I loop over dict_list and print each "PROCESS" line by line, it looks something like:
>>"Etch"
>>"Etch"
>>"Etch"
>>"Etch"
>>"Depo"
>>"Depo"
>>"Depo"
>>"Depo"
>>"Etch"
>>"Etch"
>>"Etch"
>>"Etch"
>>"Depo"
>>"Depo"
>>"Depo"
>>"Depo"
For that example, I would have 4 "run" dictionaries, each with a list of 4 dictionaries.
I need to group them into lists, and then into a dictionary, like this:
new_dict_list = {
"run 1": [{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"}],
"run 2": [{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"}],
"run 3": [{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"}]
}
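It would go something like this:
Loop through each dictionary
Put the first dictionary in a list, then put that list into a new dictionary (let's call it a run)
On the next iteration, if dictionary["PROCESS"] is the same, store it in the same list and the same dictionary
If dictionary["PROCESS"] has changed, store the current dictionary in a new list, and then in a new dictionary
I'm just not sure how to put this into Python logic. I'm still new at this.
Here is what I have so far: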
prev_process = ""
counter = 0
new_dict_list = {}
for dictionary in dict_list:
    if dictionary["PROCESS"] != prev_process:
        counter += 1
        prev_process = dictionary["PROCESS"]
    new_dict_list["run " + counter] = dictionary
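I feel like there should be a while loop in there, something like "while dictionary["PROCESS"] stays the same, do stuff", but I'm not sure how to put that into Python, or how to break out of it (since the condition, if I check it the way I do now, would always be true).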
Solution
You can use itertools.groupby:
import itertools
dict_list = [{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Depo"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"},{"PROCESS": "Etch"}]
new_d = {'run {}'.format(i):list(b) for i, [_, b] in enumerate(itertools.groupby(dict_list, key=lambda x:x["PROCESS"]), 1)}
Output:
{'run 1': [{'PROCESS': 'Etch'}, {'PROCESS': 'Etch'}, {'PROCESS': 'Etch'}, {'PROCESS': 'Etch'}, {'PROCESS': 'Etch'}, {'PROCESS': 'Etch'}, {'PROCESS': 'Etch'}, {'PROCESS': 'Etch'}, {'PROCESS': 'Etch'}, {'PROCESS': 'Etch'}, {'PROCESS': 'Etch'}, {'PROCESS': 'Etch'}, {'PROCESS': 'Etch'}, {'PROCESS': 'Etch'}, {'PROCESS': 'Etch'}],
'run 2': [{'PROCESS': 'Depo'}, {'PROCESS': 'Depo'}, {'PROCESS': 'Depo'}, {'PROCESS': 'Depo'}, {'PROCESS': 'Depo'}, {'PROCESS': 'Depo'}, {'PROCESS': 'Depo'}, {'PROCESS': 'Depo'}, {'PROCESS': 'Depo'}, {'PROCESS': 'Depo'}, {'PROCESS': 'Depo'}, {'PROCESS': 'Depo'}, {'PROCESS': 'Depo'}, {'PROCESS': 'Depo'}, {'PROCESS': 'Depo'}],
'run 3': [{'PROCESS': 'Etch'}, {'PROCESS': 'Etch'}, {'PROCESS': 'Etch'}, {'PROCESS': 'Etch'}, {'PROCESS': 'Etch'}]
}
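To see what groupby hands back before the dict comprehension wraps it, here is a minimal sketch on a shortened list. The names `sample` and `runs` are made up for illustration and are not part of the original answer.

```python
import itertools

# Shortened, hypothetical sample with the same shape as dict_list.
sample = [{"PROCESS": "Etch"}, {"PROCESS": "Etch"},
          {"PROCESS": "Depo"},
          {"PROCESS": "Etch"}]

# groupby yields (key, group_iterator) pairs, one per consecutive run.
# The second "Etch" stretch is its own group; it is not merged with the first.
runs = [(key, len(list(group)))
        for key, group in itertools.groupby(sample, key=lambda d: d["PROCESS"])]

print(runs)  # [('Etch', 2), ('Depo', 1), ('Etch', 1)]
```

Because a new group starts every time the key changes, no sorting is needed here; sorting first would collapse all the "Etch" runs into one.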
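itertools.groupby groups data by a single key. In this case the data is grouped on the value of the 'PROCESS' key, which yields pairs of the key value and an iterator over the consecutive elements that share that value. To create the custom 'run {number}' keys, enumerate (started at 1) is used to keep track of the current iteration index in a clean way.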
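If you would rather stay close to the loop from the question, a plain for loop along the following lines should give the same grouping. This is only a sketch: `group_runs` and `example` are hypothetical names, and it assumes every dictionary has the "PROCESS" key. The two changes from the question's attempt are converting the counter to a string when building the key and appending to a list instead of overwriting the entry.

```python
def group_runs(dict_list, key="PROCESS"):
    """Group consecutive dictionaries that share the same value under `key`."""
    runs = {}
    prev_value = object()  # sentinel that never equals a real value
    counter = 0
    for dictionary in dict_list:
        # A change in the value marks the start of a new run.
        if dictionary[key] != prev_value:
            counter += 1
            prev_value = dictionary[key]
            runs["run {}".format(counter)] = []
        # Every dictionary is appended to the current run's list.
        runs["run {}".format(counter)].append(dictionary)
    return runs


# Small hypothetical input: 3 Etch, 2 Depo, 1 Etch.
example = ([{"PROCESS": "Etch"} for _ in range(3)]
           + [{"PROCESS": "Depo"} for _ in range(2)]
           + [{"PROCESS": "Etch"} for _ in range(1)])

result = group_runs(example)
print({name: len(items) for name, items in result.items()})
# {'run 1': 3, 'run 2': 2, 'run 3': 1}
```

No while loop is needed: the single pass only has to remember the previous value and which run is currently open.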