python-2.7 - Sum a list of complex dictionaries
Problem description
I have a list of complex data that I want to merge into one object per date, summing certain fields (e.g. totalCost, gross, etc.).
But I can't find a nice, quick way to do that.
[
    {
        "name": "Period 40",
        "metadata": {
            "payPeriod": "Weekly",
            "startDate": "2020-01-03",
            "totalCost": 4779.27,
            "gross": 4798.81
        }
    },
    {
        "name": "Period 40",
        "metadata": {
            "payPeriod": "Weekly",
            "startDate": "2020-01-03",
            "totalCost": 2857.88,
            "gross": 2918.66
        }
    }
]
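What I tried was sorting the data, grouping it with itertools.groupby, and then manually creating a new data object and filling it with the sum of each grouped list.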
import itertools

# pay_runs is the list shown above
sorted_pay_runs = sorted(pay_runs,
                         key=lambda obj: obj['metadata']['startDate'],
                         reverse=True)
merged_pay_runs = []
for start_date, pay_run_data in itertools.groupby(
        sorted_pay_runs, lambda obj: obj['metadata']['startDate']):
    pay_run_data = list(pay_run_data)
    # reuse the first entry of the group as the merged object
    merged_obj = pay_run_data[0]
    merged_obj['metadata']['totalCost'] = sum(item['metadata']['totalCost'] for item in pay_run_data)
    merged_pay_runs.append(merged_obj)
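Solution
You have to group by a combined key made of the name and the inner date inside 'metadata', so that all 'Period 40' entries with the same date are summed into one, and likewise for 'Period 41'.
This would work (numbers changed to get easier groups and sums that are easy to check by hand):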
from itertools import groupby
from pprint import pprint

data = [{"name": "Period 40", "metadata": {"payPeriod": "Weekly",
         "startDate": "2020-01-03", "totalCost": 5, "gross": 8}},
        {"name": "Period 41", "metadata": {"payPeriod": "Weekly",
         "startDate": "2020-01-03", "totalCost": 2.5, "gross": 3}},
        {"name": "Period 41", "metadata": {"payPeriod": "Weekly",
         "startDate": "2020-01-03", "totalCost": 99, "gross": 110}},
        {"name": "Period 40", "metadata": {"payPeriod": "Weekly",
         "startDate": "2020-01-03", "totalCost": 10, "gross": 18}}]

# groupby needs sorted data - use (name, startDate) to sort by
sorted_data = sorted(data, key=lambda x: (x["name"], x["metadata"]["startDate"]))

# group by (name, startDate)
grped = groupby(sorted_data, lambda x: (x["name"], x["metadata"]["startDate"]))

results = []
# key is the combined key (name, startDate), we need to reapply name in the result
for key, value in grped:
    group = list(value)             # list slicing works on Python 2.7 and 3
    first = group[0]                # get the first inner grouped result into first
    first.update({"name": key[0]})  # add the key back into the grouped thing
    for v in group[1:]:             # add the remaining inner metadatas
        first["metadata"]["totalCost"] += v["metadata"]["totalCost"]
        first["metadata"]["gross"] += v["metadata"]["gross"]
    results.append(first)           # note: first is one of the input dicts, updated in place

pprint(results)
Output:
[{'metadata': {'gross': 26,
               'payPeriod': 'Weekly',
               'startDate': '2020-01-03',
               'totalCost': 15},
  'name': 'Period 40'},
 {'metadata': {'gross': 113,
               'payPeriod': 'Weekly',
               'startDate': '2020-01-03',
               'totalCost': 101.5},
  'name': 'Period 41'}]
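If sorting the list first or modifying the input dicts in place is a concern, the same grouping can probably be done with a plain dict keyed by (name, startDate). This is only a sketch of that alternative, not part of the approach above: the helper name merge_pay_runs and its sum_fields parameter are made up for illustration, and it assumes every entry has a "name" and a "metadata" dict containing "startDate" and the fields to sum.

```python
import copy


def merge_pay_runs(pay_runs, sum_fields=("totalCost", "gross")):
    """Merge entries that share (name, startDate), summing the given metadata fields.

    Hypothetical helper: the name and the sum_fields parameter are illustrative only.
    """
    merged = {}  # (name, startDate) -> merged copy of the first entry seen
    order = []   # keys in first-seen order (plain dicts are unordered on Python 2.7)
    for run in pay_runs:
        key = (run["name"], run["metadata"]["startDate"])
        if key not in merged:
            # deep copy so the caller's dicts are left untouched
            merged[key] = copy.deepcopy(run)
            order.append(key)
        else:
            target = merged[key]["metadata"]
            for field in sum_fields:
                target[field] += run["metadata"][field]
    return [merged[key] for key in order]


data = [
    {"name": "Period 40", "metadata": {"payPeriod": "Weekly",
     "startDate": "2020-01-03", "totalCost": 5, "gross": 8}},
    {"name": "Period 41", "metadata": {"payPeriod": "Weekly",
     "startDate": "2020-01-03", "totalCost": 2.5, "gross": 3}},
    {"name": "Period 41", "metadata": {"payPeriod": "Weekly",
     "startDate": "2020-01-03", "totalCost": 99, "gross": 110}},
    {"name": "Period 40", "metadata": {"payPeriod": "Weekly",
     "startDate": "2020-01-03", "totalCost": 10, "gross": 18}},
]

result = merge_pay_runs(data)
```

This makes a single pass over the list without sorting, and the result keeps the groups in the order they first appear in the input rather than in sorted order.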
HTH