python - Python: Slicing a 2D NumPy array with Numba to produce a C-ordered array
Question
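I have started using Numba and am now trying to speed up an algorithm with it. However, I ran into a problem with the numpy.dot operation. When I slice a 2D string array column by column, the result has the type array([unichr x 100], 1d, A). I need this type to be array([unichr x 100], 1d, C), so that numpy.where produces an array of type array(float64, 1d, C). That array is then used in a numpy.dot operation together with another array of the same type. Numba complains that the two arrays have different layouts, A and C. Without Numba, the algorithm works fine.
Here is a short example that illustrates the problem.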
import numpy as np
import numba as nb

data_X = [['a1','b2','c1'],
          ['a1','b2','c2'],
          ['a2','b1','c3'],
          ['a1','b2','c1'],
          ['a2','b1','c3']]
data_Y = [1.0, 2.0, 3.0, 4.0, 5.0]
X = np.array(data_X, dtype='<U100')
Y = np.array(data_Y, dtype=np.float64)

@nb.jit(
    nopython=True,
    locals={
        'X': nb.types.Array(nb.types.UnicodeCharSeq(100), 2, 'C'),
        'Y': nb.types.Array(nb.float64, 1, 'C'),
    }
)
def func(X, Y):
    results = []
    for i in range(X.shape[1]):
        uniqs = np.unique(X[:,i])
        for u in uniqs:
            # 1.0 where the column equals u, 0.0 elsewhere
            X_vars = np.where(X[:,i] == np.full_like(X[:,i], u), 1.0, 0.0)
            results.append(np.dot(X_vars, Y))
    return results

func(X, Y)
Without Numba, the answer I get is [7.0, 8.0, 8.0, 7.0, 5.0, 2.0, 8.0]. With Numba I get the following error:
<ipython-input-27-42fe2e73a7cd>:23: NumbaPerformanceWarning: np.dot() is faster on contiguous arrays, called on (array(float64, 1d, A), array(float64, 1d, C))
results.append(np.dot(X_vars, Y))
Traceback (most recent call last):
File "C:\DataScience\lib\site-packages\numba\core\errors.py", line 745, in new_error_context
yield
File "C:\DataScience\lib\site-packages\numba\core\lowering.py", line 273, in lower_block
self.lower_inst(inst)
File "C:\DataScience\lib\site-packages\numba\core\lowering.py", line 370, in lower_inst
val = self.lower_assign(ty, inst)
File "C:\DataScience\lib\site-packages\numba\core\lowering.py", line 544, in lower_assign
return self.lower_expr(ty, value)
File "C:\DataScience\lib\site-packages\numba\core\lowering.py", line 1266, in lower_expr
res = self.context.special_ops[expr.op](self, expr)
File "C:\DataScience\lib\site-packages\numba\np\ufunc\array_exprs.py", line 397, in _lower_array_expr
context, builder, outer_sig, args, ExprKernel, explicit_output=False)
File "C:\DataScience\lib\site-packages\numba\np\npyimpl.py", line 327, in numpy_ufunc_kernel
output = _build_array(context, builder, ret_ty, sig.args, arguments)
File "C:\DataScience\lib\site-packages\numba\np\npyimpl.py", line 281, in _build_array
dest_shape_tup)
File "C:\DataScience\lib\site-packages\numba\np\arrayobj.py", line 3385, in _empty_nd_impl
arrtype.layout))
NotImplementedError: Don't know how to allocate array with layout 'A'.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<ipython-input-27-42fe2e73a7cd>", line 26, in <module>
func(X, Y)
File "C:\DataScience\lib\site-packages\numba\core\dispatcher.py", line 434, in _compile_for_args
raise e
File "C:\DataScience\lib\site-packages\numba\core\dispatcher.py", line 367, in _compile_for_args
return self.compile(tuple(argtypes))
File "C:\DataScience\lib\site-packages\numba\core\compiler_lock.py", line 32, in _acquire_compile_lock
return func(*args, **kwargs)
File "C:\DataScience\lib\site-packages\numba\core\dispatcher.py", line 808, in compile
cres = self._compiler.compile(args, return_type)
File "C:\DataScience\lib\site-packages\numba\core\dispatcher.py", line 78, in compile
status, retval = self._compile_cached(args, return_type)
File "C:\DataScience\lib\site-packages\numba\core\dispatcher.py", line 92, in _compile_cached
retval = self._compile_core(args, return_type)
File "C:\DataScience\lib\site-packages\numba\core\dispatcher.py", line 110, in _compile_core
pipeline_class=self.pipeline_class)
File "C:\DataScience\lib\site-packages\numba\core\compiler.py", line 603, in compile_extra
return pipeline.compile_extra(func)
File "C:\DataScience\lib\site-packages\numba\core\compiler.py", line 339, in compile_extra
return self._compile_bytecode()
File "C:\DataScience\lib\site-packages\numba\core\compiler.py", line 401, in _compile_bytecode
return self._compile_core()
File "C:\DataScience\lib\site-packages\numba\core\compiler.py", line 381, in _compile_core
raise e
File "C:\DataScience\lib\site-packages\numba\core\compiler.py", line 372, in _compile_core
pm.run(self.state)
File "C:\DataScience\lib\site-packages\numba\core\compiler_machinery.py", line 341, in run
raise patched_exception
File "C:\DataScience\lib\site-packages\numba\core\compiler_machinery.py", line 332, in run
self._runPass(idx, pass_inst, state)
File "C:\DataScience\lib\site-packages\numba\core\compiler_lock.py", line 32, in _acquire_compile_lock
return func(*args, **kwargs)
File "C:\DataScience\lib\site-packages\numba\core\compiler_machinery.py", line 291, in _runPass
mutated |= check(pss.run_pass, internal_state)
File "C:\DataScience\lib\site-packages\numba\core\compiler_machinery.py", line 264, in check
mangled = func(compiler_state)
File "C:\DataScience\lib\site-packages\numba\core\typed_passes.py", line 442, in run_pass
NativeLowering().run_pass(state)
File "C:\DataScience\lib\site-packages\numba\core\typed_passes.py", line 370, in run_pass
lower.lower()
File "C:\DataScience\lib\site-packages\numba\core\lowering.py", line 179, in lower
self.lower_normal_function(self.fndesc)
File "C:\DataScience\lib\site-packages\numba\core\lowering.py", line 233, in lower_normal_function
entry_block_tail = self.lower_function_body()
File "C:\DataScience\lib\site-packages\numba\core\lowering.py", line 259, in lower_function_body
self.lower_block(block)
File "C:\DataScience\lib\site-packages\numba\core\lowering.py", line 273, in lower_block
self.lower_inst(inst)
File "C:\DataScience\lib\contextlib.py", line 130, in __exit__
self.gen.throw(type, value, traceback)
File "C:\DataScience\lib\site-packages\numba\core\errors.py", line 752, in new_error_context
reraise(type(newerr), newerr, tb)
File "C:\DataScience\lib\site-packages\numba\core\utils.py", line 81, in reraise
raise value
LoweringError: Don't know how to allocate array with layout 'A'.
Answer
First, let's find out where exactly numba runs into trouble. If we reduce the operation to:
def func(X, Y):
    #results = []
    for i in range(X.shape[1]):
        uniqs = np.unique(X[:,i])
        for u in uniqs:
            X_vars = (X[:,i] == np.full_like(X[:,i], u))
            #X_vars = np.where(X[:,i] == np.full_like(X[:,i], u), 1.0, 0.0)
            #results.append(np.dot(X_vars, Y))
    #return results
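then numba still raises the same error: NotImplementedError: Don't know how to allocate array with layout 'A'. So the comparison is where the problem lies. In fact, you can reproduce the same error with an even simpler operation: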
def func(X, Y):
    for i in range(X.shape[1]):
        X[:,i] == X[:,i]
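The layout of a column slice like the one in the reduced example can be inspected in plain NumPy, outside of Numba. The following is a minimal sketch, assuming a small 3x3 array with the same dtype as in the question; the names col_view and col_copy are only illustrative.

```python
import numpy as np

# Small C-ordered 2D string array with the same dtype as in the question.
X = np.array([['a1', 'b2', 'c1'],
              ['a1', 'b2', 'c2'],
              ['a2', 'b1', 'c3']], dtype='<U100')

col_view = X[:, 0]                          # strided view into X, no copy
col_copy = np.ascontiguousarray(col_view)   # packed copy of the same values

print(X.flags['C_CONTIGUOUS'])              # True
print(col_view.flags['C_CONTIGUOUS'])       # False
print(col_copy.flags['C_CONTIGUOUS'])       # True
print(col_view.strides, col_copy.strides)   # (1200,) (400,)
```

Each '<U100' element takes 400 bytes, so stepping down a column of the three-column array skips a whole 1200-byte row, while the copy packs its elements back to back.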
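The stack trace gives a hint: the problem is contiguity. Here X[:,i] is a view, so it has neither 'C' nor 'F' contiguity and is typed with layout 'A', which numba cannot handle when it has to allocate the result of the comparison. A simple fix is therefore to add one extra line that applies np.ascontiguousarray to your view. A deep copy also works.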
@nb.jit(
    nopython=True,
    locals={
        'X': nb.types.Array(nb.types.UnicodeCharSeq(100), 2, 'C'),
        'Y': nb.types.Array(nb.float64, 1, 'C'),
    }
)
def func(X, Y):
    results = []
    for i in range(X.shape[1]):
        X_i = np.ascontiguousarray(X[:,i])
        uniqs = np.unique(X_i)
        for u in uniqs:
            X_vars = np.where( X_i == np.full_like(X_i, u), 1.0, 0.0)
            results.append(np.dot(X_vars, Y))
    return results
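To check the output of the fixed function, it can help to have reference values computed without Numba. The sketch below is a plain NumPy version of the same loop, run on the data from the question; the name reference is hypothetical and not part of the original answer, and the direct comparison col == u is used here in place of np.full_like for brevity.

```python
import numpy as np

data_X = [['a1', 'b2', 'c1'],
          ['a1', 'b2', 'c2'],
          ['a2', 'b1', 'c3'],
          ['a1', 'b2', 'c1'],
          ['a2', 'b1', 'c3']]
data_Y = [1.0, 2.0, 3.0, 4.0, 5.0]
X = np.array(data_X, dtype='<U100')
Y = np.array(data_Y, dtype=np.float64)

def reference(X, Y):
    """Plain NumPy version of the loop: sum Y over the rows matching each unique value."""
    results = []
    for i in range(X.shape[1]):
        col = np.ascontiguousarray(X[:, i])      # packed copy of column i
        for u in np.unique(col):                 # unique values in sorted order
            indicator = np.where(col == u, 1.0, 0.0)
            results.append(float(np.dot(indicator, Y)))
    return results

expected = reference(X, Y)
print(expected)  # [7.0, 8.0, 8.0, 7.0, 5.0, 2.0, 8.0]
```

This matches the result the question reports without Numba, so the compiled func(X, Y) can be compared against expected element by element.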