Numpy aggregate function shims, typing, and np.sort() in Numba

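Problem description

I'm using Numba (0.44) with NumPy in nopython mode. Currently, Numba does not support NumPy aggregate functions across arbitrary axes; it only supports computing these aggregates over the entire array. Given that, I decided to try writing a few shims.

In code: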

np.min(array) # This works with Numba 0.44
np.min(array, axis = 0) # This does not work with Numba 0.44 (no axis argument allowed)

Here is an example of a shim, intended to reproduce np.min(array):

import numpy as np
import numba

@numba.jit(nopython = True)
def npmin (X, axis = -1):
    """
    Shim for broadcastable np.min(). 
    Allows np.min(array), np.min(array, axis = 0), and np.min(array, axis = 1)
    Note that the argument axis = -1 computes on the entire array.
    """
    if axis == 0:
        _min = np.sort(X.transpose())[:,0]
    elif axis == 1:
        _min = np.sort(X)[:,0]
    else:
        _min = np.sort(np.sort(X)[:,0])[0]
    return _min

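Without Numba, the shim works as expected and reproduces the behavior of np.min() for 2D arrays. Note that I use axis = -1 as a way to take the minimum over the entire array, similar to calling np.min(array) without the axis argument.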
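As a quick sanity check of that claim, the shim can be compared against np.min in plain NumPy, with no Numba involved. The sketch below repeats the shim without the decorator; the name `npmin_plain` and the sample array are made up for illustration.

```python
import numpy as np

def npmin_plain(X, axis=-1):
    """Undecorated copy of the shim, used only to compare against np.min."""
    if axis == 0:
        return np.sort(X.transpose())[:, 0]   # smallest value in each column
    elif axis == 1:
        return np.sort(X)[:, 0]               # smallest value in each row
    else:
        return np.sort(np.sort(X)[:, 0])[0]   # smallest value overall

# Hypothetical 2D test array
a = np.array([[3, 1, 2],
              [0, 5, 4]])

print(npmin_plain(a, axis=0), np.min(a, axis=0))
print(npmin_plain(a, axis=1), np.min(a, axis=1))
print(npmin_plain(a), np.min(a))
```

Each line should print the same value twice. Note that this only exercises 2D input; the `[:, 0]` indexing would fail on a 1D array.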

Unfortunately, as soon as I bring Numba into the picture, I get an error. Here is the traceback:

Traceback (most recent call last):
  File "shims.py", line 81, in <module>
    _min = npmin(a)
  File "/usr/local/lib/python3.7/site-packages/numba/dispatcher.py", line 348, in _compile_for_args
    error_rewrite(e, 'typing')
  File "/usr/local/lib/python3.7/site-packages/numba/dispatcher.py", line 315, in error_rewrite
    reraise(type(e), e, None)
  File "/usr/local/lib/python3.7/site-packages/numba/six.py", line 658, in reraise
    raise value.with_traceback(tb)
numba.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Invalid use of Function(<function sort at 0x10abd5ea0>) with argument(s) of type(s): (array(int64, 2d, F))
 * parameterized
In definition 0:
    All templates rejected
This error is usually caused by passing an argument of a type that is unsupported by the named function.
[1] During: resolving callee type: Function(<function sort at 0x10abd5ea0>)
[2] During: typing of call at shims.py (27)


File "shims.py", line 27:
def npmin (X, axis = -1):
    <source elided>
    if axis == 0:
        _min = np.sort(X.transpose())[:,0]
        ^

This is not usually a problem with Numba itself but instead often caused by
the use of unsupported features or an issue in resolving types.

To see Python/NumPy features supported by the latest release of Numba visit:
http://numba.pydata.org/numba-doc/dev/reference/pysupported.html
and
http://numba.pydata.org/numba-doc/dev/reference/numpysupported.html

For more information about typing errors and how to debug them visit:
http://numba.pydata.org/numba-doc/latest/user/troubleshoot.html#my-code-doesn-t-compile

If you think your code should work with Numba, please report the error message
and traceback, along with a minimal reproducer at:
https://github.com/numba/numba/issues/new

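I have verified that Numba 0.44 supports all of the functions I'm using, along with their respective arguments. The stack trace does point at my call to np.sort(array), but I suspect this may be a typing issue, since the function can return either a scalar (with no axis argument) or a 2D array (with an axis argument).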
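One detail that can be read off the error message: the rejected argument type is array(int64, 2d, F), that is, a 2D, Fortran-ordered int64 array. That matches what X.transpose() produces from a C-ordered input, as the NumPy-only sketch below shows. The traceback does not say whether the layout, the number of dimensions, or something else is what the typing step rejects, so this is a diagnostic aid rather than an explanation; the array `a` is made up for illustration.

```python
import numpy as np

a = np.arange(6, dtype=np.int64).reshape(2, 3)   # C-ordered 2D array
t = a.transpose()                                # a view on the same data

# Memory-layout flags before and after the transpose
print(a.flags['C_CONTIGUOUS'], a.flags['F_CONTIGUOUS'])
print(t.flags['C_CONTIGUOUS'], t.flags['F_CONTIGUOUS'])

# dtype and dimensionality, matching the "int64, 2d" part of the message
print(t.dtype, t.ndim)
```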

That said, I have a few questions:

Tags: python, numpy, sorting, types, numba

Solution


Here is an alternative shim for 2D arrays:

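@numba.jit(nopython=True)
def npmin2(X, axis=0):
    # Loop over the axis that is kept and call the whole-array np.min
    # (which Numba supports) on each 1D slice.
    # Note: np.empty defaults to float64, so integer input gives a float result.
    if axis == 0:
        # One minimum per column
        _min = np.empty(X.shape[1])
        for i in range(X.shape[1]):
            _min[i] = np.min(X[:,i])
    elif axis == 1:
        # One minimum per row
        _min = np.empty(X.shape[0])
        for i in range(X.shape[0]):
            _min[i] = np.min(X[i,:])

    return _min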

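You will still have to work out a workaround for the axis=-1 case, though, because that would return a scalar while the other axis values return arrays, and Numba will not be able to "unify" the return type into something consistent.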
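One possible workaround, sketched here under assumptions rather than taken from the answer, is to make every branch return a 1D array, so the whole-array case returns a length-1 array instead of a scalar. The name `npmin_unified` is hypothetical, and the `try`/`except` fallback is only there so the sketch also runs where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to plain Python when Numba is unavailable.
    def njit(func):
        return func

@njit
def npmin_unified(X, axis=-1):
    """Minimum of a 2D array; always returns a 1D array of X's dtype."""
    if axis == 0:
        out = np.empty(X.shape[1], dtype=X.dtype)
        for j in range(X.shape[1]):
            out[j] = np.min(X[:, j])      # minimum of column j
    elif axis == 1:
        out = np.empty(X.shape[0], dtype=X.dtype)
        for i in range(X.shape[0]):
            out[i] = np.min(X[i, :])      # minimum of row i
    else:
        out = np.empty(1, dtype=X.dtype)
        out[0] = np.min(X)                # whole-array minimum, wrapped in an array
    return out

a = np.array([[3, 1, 2],
              [0, 5, 4]])

print(npmin_unified(a, 0))
print(npmin_unified(a, 1))
print(npmin_unified(a)[0])   # index with [0] to get the scalar back
```

The cost of this design is that callers have to index the result with `[0]` in the whole-array case. Another option is to keep a separate function for the scalar case.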

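On my machine at least, performance seems roughly comparable to just calling the equivalent np.min; sometimes np.min is faster and sometimes npmin2 wins, depending on the size of the input array and the axis.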
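A comparison like this can be reproduced with a small timing harness. The sketch below is an assumption about how one might measure it, not the benchmark used above: `colmin` repeats the axis == 0 loop of npmin2, `best_time` is a hypothetical helper, and the array size is arbitrary. The warm-up call matters because, with Numba, the first call to a jitted function includes compilation time.

```python
import timeit
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to plain Python when Numba is unavailable.
    def njit(func):
        return func

@njit
def colmin(X):
    """Column-wise minimum, same loop as the axis == 0 branch of npmin2."""
    out = np.empty(X.shape[1])
    for i in range(X.shape[1]):
        out[i] = np.min(X[:, i])
    return out

def best_time(fn, *args, number=20, repeat=5):
    """Best per-call time in seconds over several repeats."""
    runs = timeit.repeat(lambda: fn(*args), number=number, repeat=repeat)
    return min(runs) / number

X = np.random.rand(500, 200)
colmin(X)  # warm-up call so compilation is not included in the timing

t_shim = best_time(colmin, X)
t_numpy = best_time(np.min, X, 0)   # np.min(X, 0) is np.min(X, axis=0)
print(f"shim: {t_shim:.2e} s   np.min(axis=0): {t_numpy:.2e} s")
```

Results will vary with array shape, axis, and machine, so it is worth trying several sizes before drawing conclusions.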

