Sum a list of complex dictionaries

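Problem description

I have a list of complex data objects. I want to merge the objects that share the same date into a single object and sum certain fields (for example totalCost and gross), but I can't find a good, fast way to do it.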

[
  {
    "name": "Period 40",
    "metadata": {
      "payPeriod": "Weekly",
      "startDate": "2020-01-03",
      "totalCost": 4779.27,
      "gross": 4798.81
    }
  },
  {
    "name": "Period 40",
    "metadata": {
      "payPeriod": "Weekly",
      "startDate": "2020-01-03",
      "totalCost": 2857.88,
      "gross": 2918.66
    }
  }
]

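What I tried was sorting the data, grouping it with itertools.groupby, then manually creating a new data object and filling it with the sum over each grouped list.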

import itertools

# pay_runs is the list of dicts shown above
sorted_pay_runs = sorted(pay_runs,
                         key=lambda obj: obj['metadata']['startDate'],
                         reverse=True)

merged_pay_runs = []
for start_date, pay_run_data in itertools.groupby(
        sorted_pay_runs, lambda obj: obj['metadata']['startDate']):
    pay_run_data = list(pay_run_data)
    merged_obj = pay_run_data[0]
    merged_obj['metadata']['totalCost'] = sum(item['metadata']['totalCost'] for item in pay_run_data)

    merged_pay_runs.append(merged_obj)

Tags: python-2.7, dictionary

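Solution


You have to group by a combined key made of the name (e.g. 'Period 40') and the date inside 'metadata', so that all 'Period 40' entries with the same date are summed into one; the same goes for 'Period 41'.

This works (the numbers are changed to make the groups easier to follow and the sums easy to check by hand):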

from itertools import groupby

data = [{ "name": "Period 40", "metadata": {  "payPeriod": "Weekly",
      "startDate": "2020-01-03", "totalCost": 5, "gross": 8 } }, 

        {"name": "Period 41", "metadata": { "payPeriod": "Weekly",
      "startDate": "2020-01-03", "totalCost": 2.5, "gross": 3 } },

        { "name": "Period 41", "metadata": {  "payPeriod": "Weekly",
      "startDate": "2020-01-03", "totalCost": 99, "gross": 110 } }, 

        {"name": "Period 40", "metadata": { "payPeriod": "Weekly",
      "startDate": "2020-01-03", "totalCost": 10, "gross": 18 } }]

# groupby needs sorted data - use (name, startdate) to sort by
sorted_data = sorted(data, key=lambda x: (x["name"],x["metadata"]["startDate"]))

# groupby (name, startdate)  
grped = groupby( sorted_data , lambda x: (x["name"],x["metadata"]["startDate"]))

results = []

# key is the combined key (name, startDate)
for key, value in grped:
    # extended unpacking needs Python 3; on 2.7 use:
    #   value = list(value); first, vv = value[0], value[1:]
    first, *vv = value                # first item of the group; the rest go into vv
    first.update({"name": key[0]})    # first already carries this name, so this is a no-op safeguard
    for v in vv:                      # add the remaining metadata values onto first (mutates data in place)
        first["metadata"]["totalCost"] += v["metadata"]["totalCost"]
        first["metadata"]["gross"] += v["metadata"]["gross"]
    results.append(first)


from pprint import pprint
pprint(results)

Output:

[{'metadata': {'gross': 26,
               'payPeriod': 'Weekly',
               'startDate': '2020-01-03',
               'totalCost': 15},
  'name': 'Period 40'},

 {'metadata': {'gross': 113,
               'payPeriod': 'Weekly',
               'startDate': '2020-01-03',
               'totalCost': 101.5}, 
  'name': 'Period 41'}]
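
If sorting first or mutating the input is a concern, the same grouping can probably be done in one pass with a plain dict keyed on (name, startDate). The following is only a sketch under a few assumptions: merge_pay_runs and sum_fields are hypothetical names introduced here, every entry is assumed to contain all the fields being summed, and non-summed fields are taken from the first entry of each group. It avoids Python 3-only syntax, so it should also run on 2.7.

```python
import copy


def merge_pay_runs(pay_runs, sum_fields=("totalCost", "gross")):
    """Merge entries sharing (name, startDate), summing the given metadata fields.

    Hypothetical helper: not part of the original answer.
    """
    merged = {}  # (name, startDate) -> merged copy of the first entry seen
    order = []   # keys in first-seen order (plain dicts are unordered on 2.7)
    for run in pay_runs:
        key = (run["name"], run["metadata"]["startDate"])
        if key not in merged:
            # deep copy so the caller's data is left untouched
            merged[key] = copy.deepcopy(run)
            order.append(key)
        else:
            target = merged[key]["metadata"]
            for field in sum_fields:
                target[field] += run["metadata"][field]
    return [merged[key] for key in order]


data = [
    {"name": "Period 40", "metadata": {"payPeriod": "Weekly",
     "startDate": "2020-01-03", "totalCost": 5, "gross": 8}},
    {"name": "Period 41", "metadata": {"payPeriod": "Weekly",
     "startDate": "2020-01-03", "totalCost": 2.5, "gross": 3}},
    {"name": "Period 41", "metadata": {"payPeriod": "Weekly",
     "startDate": "2020-01-03", "totalCost": 99, "gross": 110}},
    {"name": "Period 40", "metadata": {"payPeriod": "Weekly",
     "startDate": "2020-01-03", "totalCost": 10, "gross": 18}},
]

result = merge_pay_runs(data)
```

To group by date only, as the question literally asks, the key line could be changed to use just run["metadata"]["startDate"]; the merged entry would then keep the name of the first entry seen for that date.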

HTH

