Rebuilding sets and intersections in Python

Problem description

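I have the following problem. Initially I have 3 sets:

A [1,2,3], B [2,3,4,5] and C [2,5,6,7]

Next I consider the pairwise intersections of the sets as well as the intersection of all of them: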

AB [2,3],
AC [2],
BC [2,5] and
ABC [2] (Full intersection)

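What I want now is a new rearrangement of my sets under the following conditions: 1. The cardinality of each set is preserved. 2. The cardinality of every possible intersection is preserved.

For example, I should get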

A [3,4,7],
B [1,3,7,5] and
C [2,6,5,7]

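Note that the new intersection of A and B (now [3,7]) has 2 elements, just like the previous one, and the same holds for the intersections AC and BC and for the full intersection ABC. Of course, the cardinalities of A, B and C remain 3, 4 and 4, respectively. Finally, I need to generate as many such rearrangements as possible; I understand that this depends on the cardinalities of the sets and on the total number of sets.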

Tags: python, set, intersection, rebuild, preserve

Solution


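You can simply generate all possible permutations of the original set of elements and use each one as a "mapping" from the original elements to a new configuration. For example, 2345671 maps each number to the next one and wraps around; this creates the sets: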

A = {2, 3, 4}    # from {1, 2, 3}
B = {3, 4, 5, 6} # from {2, 3, 4, 5}
C = {3, 6, 7, 1} # from {2, 5, 6, 7} 
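
As a minimal sketch, that single wrap-around mapping can be applied by hand as follows. The helper name `remap` is an assumption introduced here for illustration and is not part of the original answer.

```python
A = {1, 2, 3}
B = {2, 3, 4, 5}
C = {2, 5, 6, 7}

# Original elements in sorted order: [1, 2, 3, 4, 5, 6, 7]
elements = sorted(A | B | C)
# The permutation "2345671": each element maps to the next one, and 7 wraps to 1
shifted = elements[1:] + elements[:1]
mapping = dict(zip(elements, shifted))

def remap(old_set, mapping):
    """Relabel every element of old_set through mapping (hypothetical helper)."""
    return {mapping[x] for x in old_set}

A2, B2, C2 = (remap(s, mapping) for s in (A, B, C))
print(A2, B2, C2)
```

Because the mapping is a bijection on the elements, every set and every intersection keeps its size; only the labels change.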

This is quite simple to do with itertools:

from itertools import permutations

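def all_configurations(*sets):
    # Every element that appears in at least one of the input sets
    elements = list(set.union(*sets))
    for perm in permutations(elements):
        # Relabel: the i-th original element becomes the i-th entry of perm
        # (named "mapping" so that the built-in map is not shadowed)
        mapping = dict(zip(elements, perm))
        new_sets = [{mapping[k] for k in old_set} for old_set in sets]
        yield new_sets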


A = {1, 2, 3}
B = {2, 3, 4, 5}
C = {2, 5, 6, 7} 

confs = all_configurations(A, B, C)
for conf in confs:
    print(conf)
    # Or: Ax, Bx, Cx = conf

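Note how I used the yield statement: it generates each new arrangement one at a time instead of creating them all at once, so you can use it with a large number of elements without running out of memory. Also, the function as written works for any number of input sets.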
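
To illustrate the lazy behaviour, here is a small sketch that pulls only the first few configurations out of the generator with `itertools.islice`. The generator is repeated so that the example runs on its own; the name `first_five` is just an illustrative choice.

```python
from itertools import islice, permutations

def all_configurations(*sets):
    elements = list(set.union(*sets))
    for perm in permutations(elements):
        mapping = dict(zip(elements, perm))
        yield [{mapping[k] for k in old_set} for old_set in sets]

A = {1, 2, 3}
B = {2, 3, 4, 5}
C = {2, 5, 6, 7}

# 7 elements give 7! = 5040 permutations, but islice stops after the first 5,
# so the remaining ones are never computed
first_five = list(islice(all_configurations(A, B, C), 5))
for conf in first_five:
    print(conf)
```

The first permutation produced by `permutations` is the identity, so the first configuration is simply the original sets.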

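Of course, this will certainly generate some duplicates (for instance, in your example, mapping 6 to 7 and 7 to 6 changes nothing), but it should also generate every valid option. Some sample output: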

[{2, 4, 6}, {3, 4, 5, 6}, {1, 3, 6, 7}]
[{4, 5, 7}, {1, 3, 4, 7}, {2, 3, 6, 7}]
[{3, 5, 6}, {1, 3, 5, 7}, {2, 3, 4, 7}]
[{1, 6, 7}, {1, 2, 4, 7}, {2, 3, 5, 7}]
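
To check that a configuration such as those above really satisfies both conditions from the question, one option is to compare the cardinalities of every possible intersection. The following is a sketch only; the function name `intersection_signature` is an assumption made for this example.

```python
from itertools import combinations

def intersection_signature(*sets):
    """Map each group of set indices to the size of that group's intersection."""
    signature = {}
    for r in range(1, len(sets) + 1):
        for idx in combinations(range(len(sets)), r):
            # Start from a copy of the first set, then intersect with the rest
            common = set(sets[idx[0]]).intersection(*(sets[i] for i in idx[1:]))
            signature[idx] = len(common)
    return signature

original = ({1, 2, 3}, {2, 3, 4, 5}, {2, 5, 6, 7})
candidate = ({2, 4, 6}, {3, 4, 5, 6}, {1, 3, 6, 7})  # first sample line above

print(intersection_signature(*original) == intersection_signature(*candidate))  # True
```

Groups of size 1 cover the cardinality of each individual set, so a single comparison checks both conditions.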

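Edit: To get a fixed number of non-duplicate arrangements, you can change the original code to return a tuple of frozensets instead of a list of sets. That makes the whole thing hashable, so you only get unique configurations. You can then add them to an output set until it reaches the cardinality you want: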

from itertools import permutations

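def all_configurations(*sets):
    # Every element that appears in at least one of the input sets
    elements = list(set.union(*sets))
    for perm in permutations(elements):
        # Named "mapping" so that the built-in map is not shadowed
        mapping = dict(zip(elements, perm))
        # A tuple of frozensets is hashable, so it can be stored in a set
        new_sets = tuple(frozenset(mapping[k] for k in old_set) for old_set in sets)
        yield new_sets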

def n_configurations(n, *sets):
    output = set()
    confs = all_configurations(*sets)
    for conf in confs:
        output.add(conf)
        if len(output) >= n:
            break
    return output


A = {1, 2, 3}
B = {2, 3, 4, 5}
C = {2, 5, 6, 7} 

confs = n_configurations(10, A, B, C)
for a, b, c in confs:
    print(a, b, c)

This produces the following 10 configurations (shown in Python 2's output format):

(frozenset([1, 2, 3]), frozenset([2, 3, 5, 6]), frozenset([2, 4, 6, 7]))
(frozenset([1, 2, 3]), frozenset([2, 3, 4, 6]), frozenset([2, 5, 6, 7]))
(frozenset([1, 2, 3]), frozenset([2, 3, 6, 7]), frozenset([2, 4, 5, 7]))
(frozenset([1, 2, 3]), frozenset([2, 3, 4, 6]), frozenset([2, 4, 5, 7]))
(frozenset([1, 2, 3]), frozenset([2, 3, 4, 5]), frozenset([2, 4, 6, 7]))
(frozenset([1, 2, 3]), frozenset([2, 3, 5, 6]), frozenset([2, 4, 5, 7]))
(frozenset([1, 2, 3]), frozenset([2, 3, 4, 7]), frozenset([2, 5, 6, 7]))
(frozenset([1, 2, 3]), frozenset([2, 3, 4, 5]), frozenset([2, 5, 6, 7]))
(frozenset([1, 2, 3]), frozenset([2, 3, 4, 7]), frozenset([2, 4, 5, 6]))
(frozenset([1, 2, 3]), frozenset([2, 3, 5, 7]), frozenset([2, 4, 6, 7]))
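
As a rough guide to how large n can usefully be, the number of distinct configurations can be counted without enumerating them. This sketch assumes the new sets reuse exactly the same elements as the originals, as the permutation approach does. Two permutations give the same result whenever they only shuffle elements that belong to exactly the same sets, so the count is the factorial of the number of elements divided by the factorial of each such group's size. The function name `count_distinct_configurations` is an assumption made for this example.

```python
from collections import Counter
from math import factorial

def count_distinct_configurations(*sets):
    universe = set().union(*sets)
    # An element's "region" records which of the input sets contain it
    region_sizes = Counter(tuple(x in s for s in sets) for x in universe)
    total = factorial(len(universe))
    for size in region_sizes.values():
        # Swapping elements inside one region does not change the result
        total //= factorial(size)
    return total

A = {1, 2, 3}
B = {2, 3, 4, 5}
C = {2, 5, 6, 7}
print(count_distinct_configurations(A, B, C))  # 2520
```

In this example only 6 and 7 share a region (both are in C alone), so the 7! = 5040 permutations collapse to 5040 / 2! = 2520 distinct configurations.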
