python - 从具有最大并集的集合中找到最少集合的最快方法?
问题描述
给定一组独特的集合,我想找到具有最大联合的最小数量的集合,即宇宙。举个例子,假设我们有一组 20 个随机整数集,它们的大小从 1 到 10 不等:
import random
random.seed(99)
length = 20
ss = {frozenset(random.sample(range(100), random.randint(1,10))) for _ in range(length)}
assert len(ss) == 20 # This might be smaller than 20 if frozensets are not all unique
最大的联合(宇宙)由下式给出
universe = frozenset().union(*ss)
print(universe)
# frozenset({0, 6, 7, 10, 11, 12, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 25,
# 26, 27, 29, 31, 32, 34, 37, 39, 40, 42, 43, 45, 46, 47, 48, 49,
# 51, 52, 53, 54, 56, 59, 60, 62, 63, 64, 66, 67, 68, 69, 75, 76,
# 77, 78, 79, 80, 81, 84, 86, 87, 88, 89, 91, 92, 93, 95, 97, 98, 99})
现在我正在使用蛮力方法从 1 到 20 个子集的联合中搜索itertools.combinations
. 如下所示,代码在 2.95 秒后找到了最少 17 个子集。
from itertools import combinations
from time import time
t0 = time()
n = 1
res = []
found = False
while not found:
# Get all combinations of n subsets
all_n_ss = list(combinations(ss, n))
for n_ss in all_n_ss:
u = frozenset().union(*n_ss)
if u == universe:
res = n_ss
found = True
break
# Add one more subset
n += 1
print(len(res))
print(res)
print(time()-t0)
# 17
# (frozenset({0, 66, 7, 42, 48, 17, 81, 51, 25, 27}),
# frozenset({49, 27, 87, 47}),
# frozenset({76, 48, 17, 22, 25, 29, 31}),
# frozenset({14}),
# frozenset({0, 66, 68, 10, 46, 54, 25, 26, 59}),
# frozenset({75, 92, 53, 78}),
# frozenset({67, 68, 11, 79, 87, 89, 62}),
# frozenset({67, 99, 40, 10, 43, 11, 51, 86, 91, 60}),
# frozenset({6, 59, 91, 76, 45, 16, 20, 56, 27, 95}),
# frozenset({32, 98, 40, 46, 15, 86, 23, 29, 63}),
# frozenset({99, 37, 12, 77, 15, 18, 19, 52, 22, 95}),
# frozenset({39, 10, 11, 80, 18, 53, 54, 87}),
# frozenset({32, 93}),
# frozenset({34}),
# frozenset({64, 84, 22}),
# frozenset({32, 97, 69, 45, 16, 51, 88, 60}),
# frozenset({21}))
# 2.9506494998931885
然而,实际上我有一组 200 组,这对于暴力枚举是不可行的。我想要一种快速算法来找到一个 最佳解决方案。
解决方案
整数程序求解器非常擅长这一点。OR-Tools ( pip install ortools
) 中的示例代码:
import collections
from ortools.linear_solver import pywraplp
def set_cover(ss):
solver = pywraplp.Solver.CreateSolver("SCIP")
solver.Objective().SetMinimization()
constraints = collections.defaultdict(
lambda: solver.Constraint(1, solver.infinity())
)
variables = []
for s in ss:
x = solver.BoolVar(str(s))
solver.Objective().SetCoefficient(x, 1)
for e in s:
constraints[e].SetCoefficient(x, 1)
variables.append((s, x))
status = solver.Solve()
assert status == pywraplp.Solver.OPTIMAL
return {s for (s, x) in variables if x.solution_value()}
import random
def main():
random.seed(99)
length = 200
ss = {
frozenset(random.sample(range(100), random.randint(1, 10)))
for _ in range(length)
}
print(set_cover(ss))
if __name__ == "__main__":
main()
推荐阅读
- php - 如何将逗号标签从 mysql 显示到刀片视图
- python - 尽管在 mac 终端上使用 pip 命令安装它,但无法在 Python 上使用“请求”模块
- javascript - Websocket:如果脚本很忙,客户端上的消息会发生什么?
- typo3 - Typo3 & gridelements & flexform: Flexform-Elements并排
- docker - 在没有内部命令的情况下在 Jenkins 管道中运行多个更新的 Docker 容器
- sql - 甲骨文 | 如果未找到行,则输出计数 0
- flutter - 如何在播放队列中的下一个 mediaItem 时更改当前 mediaItem
- javascript - Etag 标头不起作用,每次返回状态 200
- r - 如何在R中创建一个返回具有固定参数的函数的函数?
- c# - 如何使用 dataverse api 获取 powerapps 应用程序?