首页 > 解决方案 > 根据 ID 将 numpy 行转换为列

问题描述

假设我有一个numpy在两种项目类型的 ID 之间映射的数组:

[[1, 12],
 [1, 13],
 [1, 14],
 [2, 13],
 [2, 14],
 [3, 11]]

我想重新排列这个数组,使新数组中的每一行代表与原始数组中相同 ID 匹配的所有项目。在这里,每一列将代表原始数组中的一个映射,直到对新数组中的列数有一个指定的形状限制。如果我们想从上面的数组中获得这个结果,确保我们只有 2 列,我们将获得:

[[12, 13],  #Represents 1 - 14 was not kept as only 2 columns are allowed
 [13, 14],  #Represents 2
 [11,  0]]  #Represents 3 - 0 was used as padding since 3 did not have 2 mappings

这里最简单的方法是使用 for 循环来填充新数组,因为它遇到原始数组中的行。有没有更有效的方法来实现这个numpy功能?

标签: pythonarraysnumpyscipy

解决方案


这是一种通用且主要是 Numpythonic 的方法:

In [144]: def array_packer(arr):
     ...:     cols = arr.shape[1]
     ...:     ids = arr[:, 0]
     ...:     inds = np.where(np.diff(ids) != 0)[0] + 1
     ...:     sp = np.split(arr[:,1:], inds)
     ...:     result = [np.unique(a[: cols]) if a.shape[0] >= cols else
     ...:                    np.pad(np.unique(a), (0, (cols - 1) * (cols - a.shape[0])), 'constant')
     ...:                 for a in sp]
     ...:     return result
     ...:     
     ...:     

演示:

In [145]: a = np.array([[1, 12, 15, 45],
     ...:  [1, 13, 23, 9],
     ...:  [1, 14, 14, 11],
     ...:  [2, 13, 90, 34],
     ...:  [2, 14, 23, 43],
     ...:  [3, 11, 123, 53]])
     ...:  

In [146]: array_packer(a)
Out[146]: 
[array([ 9, 11, 12, 13, 14, 15, 23, 45,  0,  0,  0]),
 array([13, 14, 23, 34, 43, 90,  0,  0,  0,  0,  0,  0]),
 array([ 11,  53, 123,   0,   0,   0,   0,   0,   0,   0,   0,   0])]

In [147]: a = np.array([[1, 12, 15],
     ...:  [1, 13, 23],
     ...:  [1, 14, 14],
     ...:  [2, 13, 90],
     ...:  [2, 14, 23],
     ...:  [3, 11, 123]])
     ...: 
     ...:   
     ...:  

In [148]: array_packer(a)
Out[148]: 
[array([12, 13, 14, 15, 23]),
 array([13, 14, 23, 90,  0,  0]),
 array([ 11, 123,   0,   0,   0,   0])]

推荐阅读