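python - numpy - Efficient value counts in 2D and 3D arrays
Question
I'm writing a scheduling program for group games. I have a schedule that works for 32-4-8 (32 players, 4 players per group, 8 rounds) with no repeated partners or opponents. However, due to venue constraints, only 28 players / 7 groups can play in each round. So I have to modify the schedule so that each player gets 7 games and 1 bye, with as few repeated partners or opponents as possible.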
```python
import numpy as np
sched = np.array([
    [[ 3, 28, 17, 14],
     [23, 30, 22,  1],
     [ 2,  5, 27, 25],
     [20,  8, 10, 16],
     [ 0, 24, 26, 11],
     [ 4, 21, 31,  7],
     [19,  6, 29, 15],
     [13, 18, 12,  9]],
    [[20, 15, 24, 31],
     [ 3, 21, 16, 13],
     [ 6, 30,  4,  5],
     [28,  8,  0,  7],
     [25, 29, 17, 23],
     [14,  9,  2, 22],
     [27, 12,  1, 11],
     [26, 10, 19, 18]],
    [[10,  4, 23, 12],
     [ 9, 28, 25, 31],
     [ 5, 13, 22,  8],
     [15,  7, 30,  2],
     [16, 19, 11, 14],
     [18, 17, 24,  6],
     [21,  0, 27, 20],
     [ 3, 26, 29,  1]],
    [[18, 20, 28,  1],
     [ 8,  9,  3,  4],
     [12, 17, 31,  5],
     [13, 30, 27, 14],
     [19, 25, 24,  7],
     [ 2,  6, 21, 26],
     [10, 11, 29, 22],
     [15, 23,  0, 16]],
    [[22, 21, 25, 15],
     [26, 12, 20, 14],
     [28,  5, 24, 10],
     [11,  6, 31, 13],
     [23, 27,  7,  3],
     [ 0, 19,  9,  1],
     [18, 30,  8, 29],
     [16, 17,  2,  4]],
    [[29, 28, 12, 21],
     [ 9, 16, 27,  6],
     [19, 17, 20, 30],
     [ 2,  8, 24, 23],
     [ 5, 11, 18,  7],
     [26, 13, 25,  4],
     [ 1, 10, 15, 14],
     [ 0, 22, 31,  3]],
    [[31, 19, 27,  8],
     [20,  5, 29,  2],
     [24, 16, 22, 12],
     [25,  3, 10,  6],
     [17,  1,  7, 13],
     [ 4,  0, 14, 18],
     [23, 28, 26, 15],
     [11, 21,  9, 30]],
    [[31, 18,  1, 16],
     [23, 14, 21,  5],
     [ 8,  3, 11, 15],
     [26, 17,  9, 10],
     [30, 12, 25,  0],
     [22, 20,  7,  6],
     [27,  4, 29, 24],
     [13, 19, 28,  2]]
])
```
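To determine the best bye selection, I randomly pick one game from each round to be the bye. I then give each bye selection a score that rewards the number of players with exactly 1 bye, so that the schedule needs as few changes as possible.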
```python
def bincount2d(arr, bins=None):
    if bins is None:
        bins = np.max(arr) + 1
    count = np.zeros(shape=[len(arr), bins], dtype=np.int64)
    indexing = np.arange(len(arr))
    for col in arr.T:
        count[indexing, col] += 1
    return count

# randomly sample one game per round as byes
# repeat n times (here 10000)
times = 10000
idx1 = np.tile(np.arange(sched.shape[0]), times)
idx2 = np.random.randint(sched.shape[1], size=sched.shape[0] * times)
# one bye game per round, so the middle axis is the number of rounds (sched.shape[0])
population_byes = sched[idx1, idx2].reshape(times, sched.shape[0], sched.shape[2])

# get player counts for byes
# can reshape because interested in # of byes for entire schedule
# so no need to segment players by rounds for these counts
count_shape = (population_byes.shape[0], population_byes.shape[1] * population_byes.shape[2])
counts = bincount2d(population_byes.reshape(count_shape))

# fitness is the number of players with one bye
# the higher the value, the less we need to do to mess with the schedule
fitness = np.apply_along_axis(lambda x: (x == 1).sum(), 1, counts)
byes = population_byes[np.argmax(fitness)]
```
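My questions are as follows:
(1) Is there an efficient way to account for values with no counts (I know the indices should go from 0 to 31)? bincount2d has no entries for missing values in that range.
(2) Is there a vectorized / more efficient way than np.apply_along_axis to get the count of elements equal to 1?
(3) Ultimately, what I would like to do is have the application change the schedule to give everyone a bye by swapping player assignments. How do you swap elements in a 3D array?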
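Answer
> (1) Is there an efficient way to account for values with no counts (I know the indices should go from 0 to 31)? bincount2d has no entries for missing values in that range.
`bincount2d` is inefficient because of its memory access pattern. `arr.T` is only a lazily evaluated view, as usual in Numpy, so iterating over it reads strided, non-contiguous columns, which is expensive. The loop itself is also not very efficient, since it works on fairly large arrays with random memory accesses, which is bad for the CPU caches. That being said, Numpy is not well suited to this kind of computation. You can use Numba to perform the operation efficiently: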
```python
import numba as nb
import numpy as np

# You may need to tune the types on your machine
# Alternatively, you can use cache=True instead and let Numba find the types (which is slower the first time)
@nb.njit('int64[:,::1](int64[:,::1], optional(int64))')
def bincount2d_fast(arr, bins=None):
    if bins is None:
        nbins = np.max(arr) + 1
    else:
        nbins = np.int64(bins)
    count = np.zeros((arr.shape[0], nbins), dtype=np.int64)
    # Plain nested loops: rows are traversed contiguously
    for i in range(arr.shape[0]):
        for j in range(arr.shape[1]):
            count[i, arr[i, j]] += 1
    return count
```
The above code is about 10 times faster than the original `bincount2d` function on my machine.
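On the missing values themselves: when `bins` is left at `None`, the counts only cover 0 up to the largest value present, so passing the known number of players gives every player a column, with 0 for players that never appear. Below is a minimal NumPy-only sketch of that idea. It assumes player ids are non-negative integers below the bin count, and it uses an offset trick with `np.bincount` instead of the Numba function above. The names `bincount_rows` and `n_players` are hypothetical and do not come from the original post.

```python
import numpy as np

def bincount_rows(arr, n_players):
    """Per-row value counts with a fixed number of bins (hypothetical helper)."""
    n_rows = arr.shape[0]
    # Shift row i by i * n_players so every row gets its own block of bins
    offsets = np.arange(n_rows)[:, None] * n_players
    flat = (arr + offsets).ravel()
    # minlength guarantees bins even for values that never occur
    counts = np.bincount(flat, minlength=n_rows * n_players)
    return counts.reshape(n_rows, n_players)

# Toy data: 2 samples, players numbered 0..4; player 4 never appears
toy = np.array([[0, 1, 1, 3],
                [2, 2, 2, 0]])
toy_counts = bincount_rows(toy, n_players=5)
```

For the schedule in the question, the equivalent call would use `n_players=32`, so each row of the result has exactly 32 columns regardless of which players were sampled.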
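> (2) Is there a vectorized / more efficient way than np.apply_along_axis to get the count of elements equal to 1?
Yes. You can operate on the whole 2D array and perform a reduction along a given axis. Here is an example: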
```python
fitness = (counts == 1).sum(axis=1)
byes = population_byes[np.argmax(fitness)]
```
This is roughly 30 times faster on my machine.
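As a quick sanity check, the axis reduction can be compared against the `np.apply_along_axis` version on made-up data. This is only a sketch: `demo_counts` is a hypothetical stand-in for the real `counts` array, and `np.count_nonzero` is shown as one more possible spelling, not something the answer itself proposes.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the real `counts` array: 1000 samples x 32 players (made-up data)
demo_counts = rng.integers(0, 3, size=(1000, 32))

# Original approach: one Python-level call per row
slow = np.apply_along_axis(lambda x: (x == 1).sum(), 1, demo_counts)
# Vectorized reduction over the player axis
fast = (demo_counts == 1).sum(axis=1)
# Alternative spelling of the same reduction
alt = np.count_nonzero(demo_counts == 1, axis=1)

same = np.array_equal(slow, fast) and np.array_equal(fast, alt)
```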
> (3) Ultimately, what I would like to do is have the application change the schedule to give everyone a bye by swapping player assignments. How do you swap elements in a 3D array?
A straightforward solution is to use Numba again with plain loops. Another option, depending on your exact needs, is to store the values to swap in temporary arrays and use indirect (index-array) access, like what @WholeBrain proposed. Something like:
```python
# all_x1, all_y1, etc. are 1D Numpy arrays containing coordinates of the items to swap
arr[all_x2, all_y2], arr[all_x1, all_y1] = arr[all_x1, all_y1], arr[all_x2, all_y2]
```
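For a 3D schedule, swapping individual players takes three index arrays (round, game, seat); with only two, whole games would be exchanged. The following is a minimal sketch on made-up data, where all names (`toy_sched`, `r1`, `g1`, `s1`, and so on) are hypothetical. It assumes each position takes part in at most one swap.

```python
import numpy as np

# Toy schedule: 2 rounds x 3 games x 4 seats (made-up data)
toy_sched = np.arange(24).reshape(2, 3, 4)

# Hypothetical coordinates: element k of each array describes one swap
r1, g1, s1 = np.array([0, 0]), np.array([0, 1]), np.array([0, 1])
r2, g2, s2 = np.array([1, 0]), np.array([2, 1]), np.array([3, 2])

# Index-array lookups on the right-hand side return copies,
# so both sides are read before anything is overwritten
toy_sched[r1, g1, s1], toy_sched[r2, g2, s2] = (
    toy_sched[r2, g2, s2],
    toy_sched[r1, g1, s1],
)
```

The same one-liner with basic slices instead of index arrays would not be safe, because slices are views and the first assignment would overwrite data that the second one still needs.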