Call to cuMemcpyDtoH results in UNKNOWN_CUDA_ERROR when using the Numba guvectorize decorator

Problem description

I am trying to get a bilateral filter that I wrote in Python to run on my GPU, but I keep hitting an error that is very cryptic to me. When I run the code, I get:

Call to cuMemcpyDtoH results in UNKNOWN_CUDA_ERROR

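According to other posts, this seems to be related to a memory problem. But since I am not writing CUDA code or managing memory myself (I only added the decorator so the function runs on the GPU), I am not sure what the best way to fix this is. Did I convert the code to run on the GPU incorrectly?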

import numpy as np
import cv2
import sys
import math
import cmath
import tqdm
from numba import jit, cuda, vectorize, guvectorize, float64, int64

sIntesity = 12.0
sSpace = 16.0
diameter = 100

@guvectorize([(float64[:,:], float64[:,:])],  '(n,m)->(n,m)',target='cuda',nopython =True)
def apply_filter(img, filteredImage):

    #imh, imw = img.shape[:2]
    imh = 600
    imw = 600
    hd = int((diameter - 1) / 2)

    for h in range(hd, imh - hd):
        for w in range(hd, imw - hd):
            Wp = 0
            filteredPixel = 0
            radius = diameter // 2
            for x in range(0, diameter):
                for y in range(0, diameter):

                    currentX = w - (radius - x)
                    cureentY = h - (radius - y)

                    intensityDifferent = img[currentX][cureentY] - img[w][h]
                    intensity = (1.0/ (2 * math.pi * (sIntesity ** 2))* math.exp(-(intensityDifferent ** 2) / (2 * sIntesity ** 2)))
                    foo = (currentX - w) ** 2 + (cureentY - h) ** 2
                    distance = cmath.sqrt(foo)
                    smoothing = (1.0 / (2 * math.pi * (sSpace ** 2))) * math.exp( -(distance.real ** 2) / (2 * sSpace ** 2))
                    weight = intensity * smoothing
                    filteredPixel += img[currentX][cureentY] * weight
                    Wp += weight

            filteredImage[h][w] = int(round(filteredPixel / Wp))


if __name__ == "__main__":
    src = cv2.imread("messy2.png", cv2.IMREAD_GRAYSCALE)
    src = src.astype(float)
    filtered_image_own = np.zeros(src.shape)
    print(type(src),type(filtered_image_own))
    apply_filter(src, filtered_image_own)
    filtered_image_own = filtered_image_own.astype(np.uint8) 
    cv2.imwrite("filtered_image4.png", filtered_image_own)

Tags: python, gpu, numba

Solution


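Switching the target from CUDA to CPU let me see that the code has a bug: it tries to access an invalid index. The UNKNOWN_CUDA_ERROR was simply the way the GPU run told me something was wrong.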
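
To make the invalid index concrete, here is a minimal pure-Python sketch (no GPU or Numba needed) that works out which indices the loops in the question would touch and compares them with the real image shape. The helper names touched_index_range and check_bounds are hypothetical, and the example shapes are assumptions for illustration; the answer above does not say which index was actually invalid.

```python
def touched_index_range(loop_extent, diameter):
    """Smallest and largest index produced by `centre - (radius - offset)`.

    Mirrors the question's loops: centre runs over range(hd, loop_extent - hd)
    and offset runs over range(0, diameter). Assumes the centre range is
    non-empty.
    """
    hd = int((diameter - 1) / 2)   # same border width as in the question
    radius = diameter // 2
    lowest = hd - radius                                   # first centre, offset 0
    highest = (loop_extent - hd - 1) - radius + (diameter - 1)  # last centre, last offset
    return lowest, highest


def check_bounds(shape, imh=600, imw=600, diameter=100):
    """Report, per axis, the touched range and whether it stays in 0..size-1."""
    # In the question, currentX (derived from w, bounded by imw) indexes axis 0
    # and cureentY (derived from h, bounded by imh) indexes axis 1.
    report = {}
    for axis, extent in ((0, imw), (1, imh)):
        lowest, highest = touched_index_range(extent, diameter)
        report[axis] = {
            "lowest": lowest,
            "highest": highest,
            "in_bounds": lowest >= 0 and highest < shape[axis],
        }
    return report


if __name__ == "__main__":
    # Hypothetical image shapes, chosen only to illustrate the check.
    for shape in [(600, 600), (400, 600)]:
        print(shape, check_bounds(shape))
```

With the hard-coded values from the question (600 and a diameter of 100), the touched range on each axis is -1 to 599, so the lowest index already falls below zero, and any image with fewer than 600 rows or columns is read past its end. Taking imh and imw from img.shape, as in the commented-out line of the question, and using the same half-width for the border and the window would be one way to keep the accesses inside the array.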

