python - Sampling N points from a set of 3D points to maximize the minimum distance

Problem description

Suppose I have 500 random 3D points represented by a (500, 3) array:

import numpy as np

np.random.seed(99)
points = np.random.uniform(0, 10, (500, 3))

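Now I want to sample n = 20 points from these 500 points so that the minimum of all pairwise distances is as large as possible. I am using a greedy approach that, at each step, samples the point that maximizes the minimum distance to the points already sampled. Here is my Python implementation: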

from scipy.spatial import distance_matrix

def sample_n_points(points, n):
    sampled_points = [points[0]]
    remained_points = points[1:]
    n_sampled = 1

    while n_sampled < n:
        min_dists = distance_matrix(remained_points, sampled_points).min(axis=1)
        imax = np.argmax(min_dists)
        sampled_points.append(remained_points[imax])
        # np.delete returns a new array, so the result must be assigned back
        remained_points = np.delete(remained_points, imax, axis=0)
        n_sampled += 1

    return np.asarray(sampled_points)

print(sample_n_points(points, n=20))

Output:

[[6.72278559 4.88078399 8.25495174]
 [1.01317279 9.74063145 0.15102072]
 [5.21672436 0.39259574 0.1069965 ]
 [9.89383494 9.77095442 1.15681204]
 [0.77144184 9.99325146 9.8976312 ]
 [0.04558333 2.34842151 5.25634324]
 [9.58126175 0.57371576 5.01765991]
 [9.93010888 9.959526   9.18606297]
 [5.27648557 9.93960401 4.82093673]
 [2.97622499 0.46695721 9.90627399]
 [0.28351187 3.64220133 0.06793617]
 [6.27527665 5.58177254 0.3544929 ]
 [0.4861886  7.45547887 5.342708  ]
 [0.83203965 5.00400167 9.40102603]
 [5.21120971 2.89966623 4.24236342]
 [9.18165946 0.26450445 9.58031481]
 [5.47605481 9.4493094  9.94331621]
 [9.31058632 6.36970353 5.33362741]
 [9.47554604 2.31761252 1.53774694]
 [3.99460408 6.17908899 6.00786122]]
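
The greedy step described above can also be written so that each round only computes distances to the newly added point, instead of rebuilding the full distance matrix against all sampled points. This is a minimal sketch of that idea; the function name `farthest_point_sampling` and the `start` parameter are hypothetical and not part of the original code, and it returns indices rather than coordinates.

```python
import numpy as np

def farthest_point_sampling(points, n, start=0):
    """Greedy max-min sampling; returns indices into `points`."""
    points = np.asarray(points, dtype=float)
    chosen = [start]
    # min_dists[i] = distance from point i to its nearest chosen point
    min_dists = np.linalg.norm(points - points[start], axis=1)
    while len(chosen) < n:
        # Pick the point that is farthest from everything chosen so far
        nxt = int(np.argmax(min_dists))
        chosen.append(nxt)
        # Only distances to the new point can lower the running minimum
        new_dists = np.linalg.norm(points - points[nxt], axis=1)
        min_dists = np.minimum(min_dists, new_dists)
    return np.array(chosen)

np.random.seed(99)
points = np.random.uniform(0, 10, (500, 3))
idx = farthest_point_sampling(points, 20)
print(points[idx])
```

Already chosen points have a running minimum of zero, so they are not picked again as long as enough distinct points remain.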

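However, this code does not guarantee an optimal solution. The most obvious "flaw" in my code is that it always starts by sampling the first point. Of course, I could run my code with every point as the starting point and keep the run that maximizes the minimum distance, but even that would not give the optimal solution. The sampled points are far apart at the beginning, but as more points are sampled they are forced closer together. After some thought, I realized that this problem essentially becomes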

finding the most evenly distributed subset within a set of 3D points.

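I would like to know whether there is an algorithm that finds the optimal solution, or one that gives a good approximation relatively quickly.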
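
The multi-start idea mentioned above (run the greedy from every starting point and keep the best run) can be sketched as follows. As the question notes, this is still a heuristic and does not guarantee the optimum. All function names here (`pairwise_distances`, `greedy_from`, `subset_score`, `best_of_all_starts`) are hypothetical helpers introduced for illustration.

```python
import numpy as np

def pairwise_distances(points):
    """Full Euclidean distance matrix via broadcasting."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def greedy_from(dist, n, start):
    """Greedy max-min selection on a precomputed distance matrix."""
    chosen = [start]
    min_dists = dist[start].copy()  # distance of each point to the chosen set
    while len(chosen) < n:
        nxt = int(np.argmax(min_dists))
        chosen.append(nxt)
        min_dists = np.minimum(min_dists, dist[nxt])
    return chosen

def subset_score(dist, idx):
    """Smallest pairwise distance inside the subset `idx`."""
    sub = dist[np.ix_(idx, idx)]
    return sub[np.triu_indices(len(idx), k=1)].min()

def best_of_all_starts(points, n):
    """Run the greedy once per starting point and keep the best-scoring run."""
    dist = pairwise_distances(np.asarray(points, dtype=float))
    best_idx, best_score = None, -np.inf
    for start in range(len(dist)):
        idx = greedy_from(dist, n, start)
        score = subset_score(dist, idx)
        if score > best_score:
            best_idx, best_score = idx, score
    return np.array(best_idx), best_score

np.random.seed(99)
points = np.random.uniform(0, 10, (500, 3))
best_idx, best_score = best_of_all_starts(points, 20)
print(best_idx, best_score)
```

Precomputing the distance matrix once keeps the repeated runs cheap for 500 points; for much larger inputs the full matrix may not fit in memory.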

Edit

The decision-problem version of this optimization problem would be:

Given a distance threshold t, is it possible to find a subset of n points such that every pair of points in the subset is at least t apart?

From a graph perspective, this can be interpreted as

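finding an independent set of size n in a Euclidean graph in which two points v1, v2 are joined by an edge if their pairwise distance satisfies d(v1, v2) < t.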

If we can solve this decision problem, then the optimization problem can also be solved by a binary search over the threshold t.
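
The binary search over t described above can be sketched as follows. Note the assumption: the decision step here is a simple greedy scan, not an exact independent-set solver, so a "no" answer from it does not prove that no valid subset exists, and the result is only a certified lower bound on the optimum. The names `greedy_feasible` and `search_threshold` are hypothetical.

```python
import numpy as np

def greedy_feasible(dist, n, t):
    """Heuristic decision step: try to pick n points pairwise at least t apart.

    Returns a list of indices on success and None on failure.
    A failure does NOT prove that no such subset exists.
    """
    chosen = []
    allowed = np.ones(len(dist), dtype=bool)
    for i in range(len(dist)):
        if allowed[i]:
            chosen.append(i)
            if len(chosen) == n:
                return chosen
            # Rule out every point closer than t to the point just chosen
            allowed &= dist[i] >= t
    return None

def search_threshold(points, n):
    """Binary search over the sorted pairwise distances as candidate thresholds."""
    points = np.asarray(points, dtype=float)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    candidates = np.unique(dist[np.triu_indices(len(points), k=1)])
    lo, hi = 0, len(candidates) - 1
    best_idx, best_t = None, None
    while lo <= hi:
        mid = (lo + hi) // 2
        idx = greedy_feasible(dist, n, candidates[mid])
        if idx is not None:
            best_idx, best_t = idx, candidates[mid]
            lo = mid + 1  # feasible: try a larger threshold
        else:
            hi = mid - 1  # heuristic failed: try a smaller threshold
    return np.array(best_idx), best_t

np.random.seed(99)
points = np.random.uniform(0, 10, (500, 3))
idx, t = search_threshold(points, 20)
print(idx, t)
```

Every pair in the returned subset is at least `t` apart, so `t` is a valid lower bound on the best achievable minimum distance; an exact decision procedure would be needed to turn this into the true optimum.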

Tags: python, arrays, algorithm, performance, numpy

Solution

I hope I have understood your requirements.

Starting from your setup:

from scipy.spatial import distance_matrix
import numpy as np

np.random.seed(99)
points = np.random.uniform(0, 10, (500, 3))

You should compute the distances between all pairs of points and sort them by distance:

# get distances between all points
d = distance_matrix(points, points)
# zero the identical upper triangle
dt = np.tril(d)
# list the distances and their indexes
dtv = [(dt[i, j], i, j) for (i, j) in np.argwhere(dt > 0)]
# sort the list
dtvs = sorted(dtv, key=lambda x: x[0], reverse=True)
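
For larger inputs, the Python-level list and sort above could be replaced by array operations. This is a sketch under the assumption that NumPy alone is acceptable; the variable names are illustrative. Unlike the `dt > 0` filter above, it keeps every unordered pair, including any pairs at distance zero.

```python
import numpy as np

np.random.seed(99)
points = np.random.uniform(0, 10, (500, 3))

# Pairwise Euclidean distances via broadcasting
d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
# Index pairs (i, j) with i < j, so each unordered pair appears once
i_idx, j_idx = np.triu_indices(len(points), k=1)
pair_dists = d[i_idx, j_idx]
# Sort pairs from largest to smallest distance
order = np.argsort(pair_dists)[::-1]
sorted_pairs = np.column_stack((i_idx[order], j_idx[order]))
sorted_dists = pair_dists[order]

print(sorted_pairs[:5], sorted_dists[:5])
```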

Then you can grow a set until it holds the indices of 20 points that contribute to the largest distances.

Edited to limit the result to k unique point indices.

kpoint_index = set()
k = 20

# walk the pairs from the largest distance down, adding both endpoint indices
for p in (int(idx) for pair in dtvs for idx in pair[1:]):
    kpoint_index.add(p)
    if len(kpoint_index) == k:
        break

print("index to points:", kpoint_index)

This gives:

index to points: {393, 11, 282, 415, 160, 302, 189, 319, 194, 453, 73, 74, 459, 335, 469, 221, 103, 232, 236, 383}

This runs quickly, but I have not timed it.
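
Because the question's objective is the minimum pairwise distance within the chosen subset, any candidate result, including the index set printed above, can be checked by computing that quantity directly. Selecting endpoints of the largest distances does not by itself bound the smallest distance among the selected points, so this check may be worth running. The helper name `min_pairwise_distance` is hypothetical.

```python
import numpy as np

def min_pairwise_distance(points, indices):
    """Smallest Euclidean distance between any two of the selected points."""
    sub = np.asarray(points, dtype=float)[list(indices)]
    diff = sub[:, None, :] - sub[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Upper triangle without the diagonal: each pair once, no self-distances
    return dist[np.triu_indices(len(sub), k=1)].min()

np.random.seed(99)
points = np.random.uniform(0, 10, (500, 3))

# Index set copied from the output above
answer_idx = {393, 11, 282, 415, 160, 302, 189, 319, 194, 453,
              73, 74, 459, 335, 469, 221, 103, 232, 236, 383}
print("min pairwise distance:", min_pairwise_distance(points, answer_idx))
```

The same function can be applied to the indices produced by a greedy run to compare the two selections on equal terms.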
