Why does this tiny Numba CUDA kernel fail to run?

Question

I have a small kernel that demonstrates the problem I am running into:

import numpy as np
from numba import cuda, types


@cuda.jit(device=True, debug=True)
def mutate_genome(instruction_positions):
    return 0


@cuda.jit
def generate_mutants():
    instruction_positions = cuda.local.array(500, np.int64)

    mutate_genome(instruction_positions)


if __name__ == "__main__":
    generate_mutants[1, 1]()

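Essentially, all it does is allocate a local memory array of type int64 and call a device function that takes that array as its argument.

But when I run this code under cuda-memcheck: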

cuda-memcheck python xtests.py

it fails:

========= CUDA-MEMCHECK
Traceback (most recent call last):
  File "xtests.py", line 18, in <module>
    generate_mutants[1, 1]()
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/cuda/compiler.py", line 804, in __call__
    kernel = self.specialize(*args)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/cuda/compiler.py", line 815, in specialize
    kernel = self.compile(argtypes)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/cuda/compiler.py", line 831, in compile
    **self.targetoptions)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler_lock.py", line 32, in _acquire_compile_lock
    return func(*args, **kwargs)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/cuda/compiler.py", line 61, in compile_kernel
    cres = compile_cuda(pyfunc, types.void, args, debug=debug, inline=inline)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler_lock.py", line 32, in _acquire_compile_lock
    return func(*args, **kwargs)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/cuda/compiler.py", line 50, in compile_cuda
    locals={})
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 551, in compile_extra
    return pipeline.compile_extra(func)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 331, in compile_extra
    return self._compile_bytecode()
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 393, in _compile_bytecode
    return self._compile_core()
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 373, in _compile_core
    raise e
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler.py", line 364, in _compile_core
    pm.run(self.state)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler_machinery.py", line 347, in run
    raise patched_exception
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler_machinery.py", line 338, in run
    self._runPass(idx, pass_inst, state)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler_lock.py", line 32, in _acquire_compile_lock
    return func(*args, **kwargs)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler_machinery.py", line 302, in _runPass
    mutated |= check(pss.run_pass, internal_state)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/compiler_machinery.py", line 275, in check
    mangled = func(compiler_state)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/typed_passes.py", line 95, in run_pass
    raise_errors=self._raise_errors)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/typed_passes.py", line 67, in type_inference_stage
    infer.propagate(raise_errors=raise_errors)
  File "/home/stark/anaconda3/lib/python3.7/site-packages/numba/typeinfer.py", line 985, in propagate
    raise errors[0]
numba.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Internal error at <numba.typeinfer.CallConstraint object at 0x7f77bf6bb850>.
type object 'numpy.int64' has no attribute 'is_precise'
[1] During: resolving callee type: Function(<numba.cuda.compiler.DeviceFunctionTemplate object at 0x7f772a8e0210>)
[2] During: typing of call at xtests.py (14)

Enable logging at debug level for details.

File "xtests.py", line 14:
def generate_mutants():
    <source elided>

    mutate_genome(instruction_positions)
    ^

I am on Linux Mint, Python 3.8, Numba 0.50.

Can anyone spot what I am doing wrong?

Tags: python, python-3.x, cuda, gpgpu, numba

Answer


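I found that everything works if I use numba.types.int64 instead of np.int64 when creating the local memory allocation.

My guess is that NumPy types are not supported there. The error message seems consistent with that: is_precise is something Numba's own type objects provide, and the plain numpy.int64 class does not have it.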

