Splitting dictionary items into smaller dictionaries based on a condition

Problem description

I have two lists: one contains partial transactions and the other contains their parent transactions:

partials = [1,2,3,4,5,6,7,8,9,10]
parents = ['a','b','c','d','a','d','f','c','c','a']

I zip these lists together into a dictionary:

transactions = dict(zip(partials, parents))

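As you can see, some partial transactions share the same parent transaction.

I need to group the items in the dictionary into smaller groups (smaller dictionaries?) such that no group contains more than one transaction belonging to the same parent. For example, all transactions with parent 'a' need to end up in different groups.

I also need as few groups as possible, because in the real world each group is a file that is uploaded manually.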

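The expected output would look like this:

Group 1 would contain transactions 1a, 2b, 3c, 4d, 7f

Group 2 would contain transactions 5a, 6d, 8c

Group 3 would contain transactions 9c, 10a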

I have been scratching my head over this for a while and would appreciate any suggestions. So far I have no working code to post.

Tags: python

Solution

Here is one approach:

def bin_unique(partials, parents):
    bins = []
    for (ptx,par) in zip(partials, parents):
        pair_assigned = False
        # Try to find an existing bin that doesn't contain the parent.
        for bin_contents in bins:
            if par not in bin_contents:
                bin_contents[par] = (ptx, par)
                pair_assigned = True
                break
        # If we haven't been able to assign the pair, create a new bin
        #   (with the pair as its first entry)
        if not pair_assigned:
            bins.append({par: (ptx, par)})

    return bins
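
The function above creates a new bin only when every existing bin already holds the current parent, so the number of bins it produces equals the largest number of transactions sharing one parent. No valid grouping can use fewer groups than that, which is why this first-fit strategy satisfies the "as few groups as possible" requirement. A minimal sketch for computing that lower bound, where `min_groups_needed` is a hypothetical helper name that does not appear in the original answer:

```python
from collections import Counter

def min_groups_needed(parents):
    """Lower bound on the number of groups (hypothetical helper):
    the largest number of transactions that share a single parent."""
    counts = Counter(parents)
    # default=0 covers the case of an empty input list
    return max(counts.values(), default=0)

parents = ['a', 'b', 'c', 'd', 'a', 'd', 'f', 'c', 'c', 'a']
print(min_groups_needed(parents))  # 3 ('a' and 'c' each appear three times)
```

Comparing this value with `len(bin_unique(partials, parents))` is a cheap sanity check on real data.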

Usage

partials = [1,2,3,4,5,6,7,8,9,10]
parents = ['a','b','c','d','a','d','f','c','c','a']
binned = bin_unique(partials, parents)

Output

# Print the list of all dicts
print(binned)
# [
#   {'a': (1, 'a'), 'b': (2, 'b'), 'c': (3, 'c'), 'd': (4, 'd'), 'f': (7, 'f')}, 
#   {'a': (5, 'a'), 'd': (6, 'd'), 'c': (8, 'c')}, 
#   {'c': (9, 'c'), 'a': (10, 'a')}
# ]

# You can access the bins via index
print(binned[0])            # {'a': (1, 'a'), 'b': (2, 'b'), 'c': (3, 'c'), 'd': (4, 'd'), 'f': (7, 'f')}
print(len(binned))          # 3

# Each bin is a dictionary, keyed by parent, but the values are the (partial, parent) pair
print(binned[0].keys())     # dict_keys(['a', 'b', 'c', 'd', 'f'])
print(binned[0].values())   # dict_values([(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (7, 'f')])

# To show that all the transactions exist
all_pairs = [pair for b in binned for pair in b.values()]
print(sorted(all_pairs) == sorted(zip(partials, parents)))  # True
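
For long lists, scanning every bin for each transaction can be avoided. Since the k-th transaction of a given parent always lands in the k-th bin, it is enough to count how often each parent has been seen so far. The sketch below is an alternative under that observation, not part of the original answer; `bin_by_occurrence` is a hypothetical name. It also prints the groups in the "1a, 2b" style used in the question.

```python
from collections import defaultdict

def bin_by_occurrence(partials, parents):
    """Place the k-th transaction of each parent into the k-th bin
    (hypothetical alternative to the first-fit scan)."""
    seen = defaultdict(int)  # how many times each parent has appeared so far
    bins = []
    for ptx, par in zip(partials, parents):
        idx = seen[par]
        # A parent seen idx times needs bin number idx; create it if missing.
        if idx == len(bins):
            bins.append({})
        bins[idx][par] = (ptx, par)
        seen[par] += 1
    return bins

partials = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
parents = ['a', 'b', 'c', 'd', 'a', 'd', 'f', 'c', 'c', 'a']
groups = bin_by_occurrence(partials, parents)

for i, group in enumerate(groups, start=1):
    labels = [f"{ptx}{par}" for ptx, par in group.values()]
    print(f"Group {i}: {', '.join(labels)}")
# Group 1: 1a, 2b, 3c, 4d, 7f
# Group 2: 5a, 6d, 8c
# Group 3: 9c, 10a
```

This produces the same bins as the first-fit version while doing one dictionary lookup per transaction.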
