How do I initialize a GPU array from a memory address in pycuda?

Problem description

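I have C++ code that produces an image array as output in GPU memory. I want to use pycuda for further processing and image analysis, so I am trying to create a GPUArray directly on top of that device memory.

For testing purposes, I created an array in C++ as follows: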

        const int arraySize = 5;
        const int a[arraySize] = {  1,  2,  3,  4,  5 };
        int* dev_a = nullptr;
        cudaMalloc((void**)&dev_a, arraySize * sizeof(int));
        cudaMemcpy(dev_a, a, arraySize * sizeof(int), cudaMemcpyHostToDevice);
        printf("dev_a : %p\n", dev_a);

Suppose this prints the GPU memory address "0x7f3454800000". I then use that address to create a GPUArray:

from pycuda.gpuarray import GPUArray
import numpy as np
import pycuda.autoinit
import pycuda.driver as drv
from pycuda.compiler import SourceModule
from pycuda.driver import PointerHolderBase
drv.init()


class Holder(PointerHolderBase):

    def __init__(self):
        super().__init__()
        self.gpudata = '0x7f5954800000'

    def get_pointer(self):
        return self.gpudata

    def __int__(self):
        return self.__index__()

    # without an __index__ method, arithmetic calls to the GPUArray backed by this pointer fail
    # not sure why, this needs to return some integer, apparently
    def __index__(self):
        return self.gpudata


array = GPUArray((1,5), dtype=np.int32, gpudata=Holder())

print(array.get())

When I run the code, I get the following error:

File "array_test.py", line 43, in <module>
    print(array.get())
  File "/home/govindam/anaconda3/envs/tf_c2/lib/python3.6/site-packages/pycuda/gpuarray.py", line 305, in get
    _memcpy_discontig(ary, self, async_=async_, stream=stream)
  File "/home/govindam/anaconda3/envs/tf_c2/lib/python3.6/site-packages/pycuda/gpuarray.py", line 1309, in _memcpy_discontig
    drv.memcpy_dtoh(dst, src.gpudata)
TypeError: No registered converter was able to produce a C++ rvalue of type unsigned long long from this Python object of type st

How can I supply a memory address when creating the GPUArray, so that I avoid copying from the GPU to the CPU and back to the GPU?

Tags: python, c++, pointers, memory-address, pycuda

Solution
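
The traceback points at a type problem: pycuda tries to convert the value returned by `get_pointer()` into a C++ `unsigned long long`, but the `Holder` class stores the address as a Python string. A minimal sketch of one possible fix is to keep the address as an integer. The helper names `parse_device_address`, `wrap_device_pointer`, and `demo` below are hypothetical, not part of pycuda. The sketch also assumes the device pointer was allocated in the same process and CUDA context as the Python code; an address printed by a separate C++ program is generally not usable from another process.

```python
def parse_device_address(address):
    """Return a device address as a plain int.

    Accepts an int, or a hex string such as '0x7f3454800000'.
    """
    if isinstance(address, str):
        return int(address, 16)
    return int(address)


def wrap_device_pointer(address, shape, dtype):
    """Wrap an existing device allocation in a GPUArray without copying.

    Assumes a CUDA context is already active and that `address` is valid
    in that context. pycuda is imported lazily so that the helper above
    can be used on a machine without a GPU.
    """
    import pycuda.driver as drv
    from pycuda.gpuarray import GPUArray

    class Holder(drv.PointerHolderBase):
        def __init__(self, ptr):
            super().__init__()
            self.ptr = ptr  # stored as an int, not a string

        def get_pointer(self):
            return self.ptr

        # Kept from the question's code: lets the holder act as an integer.
        def __int__(self):
            return self.ptr

        def __index__(self):
            return self.ptr

    holder = Holder(parse_device_address(address))
    return GPUArray(shape, dtype=dtype, gpudata=holder)


def demo():
    """Requires a CUDA-capable GPU with pycuda and numpy installed."""
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates the CUDA context)
    import pycuda.driver as drv

    host = np.array([[1, 2, 3, 4, 5]], dtype=np.int32)
    dev = drv.mem_alloc(host.nbytes)  # stands in for the C++ cudaMalloc
    drv.memcpy_htod(dev, host)

    # int(dev) is the raw device address, as the C++ side would hand it over.
    array = wrap_device_pointer(int(dev), host.shape, host.dtype)
    print(array.get())
```

Calling `demo()` on a machine with a GPU exercises the whole path inside a single process. To use memory allocated by real C++ code, the C++ side would have to run in the same process as the Python interpreter (for example as an extension module) and hand over the pointer as an integer.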

