How to pass arguments to a kernel using PyOpenCL

Problem description

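I wrote Python code that generates the Mandelbrot set as a PPM file, and I am now trying to use PyOpenCL to speed it up and compare run times. However, I don't really understand how PyOpenCL works in some respects, and none of the research I have done has helped in this case. My kernel function looks like this: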

__kernel void mandelbrot(__global const float* real, __global const float* imaginary,
                        __global const float* max_iterations, __global int* output) 
{
    int gid = get_global_id(0);

    float rx = *real;
    float iy = *imaginary;

    float x = 0.0f;
    float y = 0.0f;
    int iterations = 0;

    while( (iterations < max_iterations) &&  ((x*x) + (y*y) < 4.0f)) {
        float temp = x*x - y*y + real;
        y = 2.0 * x * y + imaginary;
        x = temp;
        iterations++;
    }
}

My input variables look like this:

real_gpu = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf = np.float32(realVal))
imag_gpu = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf = np.float32(imagVal))
output = cl.Buffer(ctx, mf.WRITE_ONLY, width * height * np.dtype(np.float64).itemsize)

prg = cl.Program(ctx, string_parallelism).build()
mandelbrot = prg.mandelbrot
mandelbrot.set_scalar_arg_dtypes([np.float64, np.float64, np.float64, None])

globalrange = (width, height)
localrange = None
mandelbrot(queue, globalrange, localrange, real_gpu, imag_gpu, maxN, output)

Running my code produces the following error:

CompilerWarning: From-source build succeeded, but resulted in non-empty logs:
Build on <pyopencl.Device 'Pitcairn' on 'AMD Accelerated Parallel Processing' at 0x56229da25400> succeeded, but said:

"/tmp/OCL3291941T1.cl", line 13: warning: operand types are incompatible ("int"
          and "const __global float *")
      while( (iterations < max_iterations) &&  ((x*x) + (y*y) < 4.0f)) {
                         ^


  warn(text, CompilerWarning)
Traceback (most recent call last):
  File "/home/tei/tei2020/rodrigues17193tei/hpc2/pyopencl_mandelbrot/paralell_mandelbrot.py", line 93, in <module>
    main()
  File "/home/tei/tei2020/rodrigues17193tei/hpc2/pyopencl_mandelbrot/paralell_mandelbrot.py", line 71, in main
    mandelbrot(queue, globalrange, localrange, real_gpu, imag_gpu, maxN, output)
  File "<generated code>", line 12, in enqueue_knl_mandelbrot
RuntimeError: when processing arg#1 (1-based): Unable to cast Python instance to C++ type (compile in debug mode for details)

What changes do I need to make to my variables so that my kernel executes correctly?

Tags: python, opencl, pyopencl

Solution


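The question gives no information about the type of maxN, but I assume it is an int, because a float would make no sense for an iteration count.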

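The problem is that the kernel parameter max_iterations is declared as __global const float*, so it would need a buffer created on the host side. Passing a single iteration limit as a buffer makes no sense either.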
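To make the pointer-versus-scalar distinction concrete, here is a small sketch of how the list given to set_scalar_arg_dtypes lines up with a kernel signature: pointer parameters are backed by buffers and get None, while by-value parameters get a NumPy dtype. The helper scalar_arg_dtypes and the table OPENCL_SCALARS are hypothetical names made up for illustration; they are not part of PyOpenCL. Note that the question's list marks the first three arguments as np.float64 scalars while buffers are passed in, which is probably what the "arg#1" cast error is complaining about.

```python
import numpy as np

# Hypothetical lookup table (not part of PyOpenCL): OpenCL C scalar types
# and the NumPy dtypes with the same size and meaning.
OPENCL_SCALARS = {
    "int": np.int32,
    "uint": np.uint32,
    "float": np.float32,
    "double": np.float64,
}

def scalar_arg_dtypes(param_types):
    """Build the list expected by Kernel.set_scalar_arg_dtypes().

    Pointer parameters (anything containing '*') map to None because they
    are passed as buffers; by-value parameters map to a NumPy dtype.
    """
    dtypes = []
    for param in param_types:
        if "*" in param:
            dtypes.append(None)
        else:
            base = param.replace("const", "").strip()
            dtypes.append(OPENCL_SCALARS[base])
    return dtypes

# The question's signature: every parameter is a pointer, so no scalars at all.
as_posted = scalar_arg_dtypes([
    "__global const float*", "__global const float*",
    "__global const float*", "__global int*",
])

# The same signature with the iteration limit passed by value as an int.
with_int_limit = scalar_arg_dtypes([
    "__global const float*", "__global const float*",
    "int", "__global int*",
])
```

With the posted signature the list would be all None; once the iteration limit is a by-value int, only that position carries a dtype (np.int32).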

So I suggest changing the type of max_iterations to int, like this:

kernel void mandelbrot(__global const float* real, __global const float* imaginary,
                        int max_iterations, __global int* output)
{
 .....
}

Then pass it to the kernel like this:

mandelbrot(queue, globalrange, localrange, real_gpu, imag_gpu, np.int32(maxN), output)
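For completeness, here is a rough end-to-end sketch of how the host side might look after that change. It rests on several assumptions that the question does not confirm: a 1-D launch with one work-item per pixel, per-pixel coordinate arrays indexed by gid (the posted kernel reads only *real and *imaginary, so every work-item would compute the same point), an added write to output[gid], and an int32 output buffer to match __global int*. The names KERNEL_SRC, run_mandelbrot and mandelbrot_reference are made up for illustration. The import is guarded so the CPU reference can still be used on a machine without PyOpenCL.

```python
import numpy as np

try:
    import pyopencl as cl
except ImportError:  # PyOpenCL not installed; only the CPU reference works
    cl = None

# Assumed kernel body: indexes the inputs per work-item and stores the count.
KERNEL_SRC = """
__kernel void mandelbrot(__global const float* real,
                         __global const float* imaginary,
                         int max_iterations,
                         __global int* output)
{
    int gid = get_global_id(0);
    float cx = real[gid];
    float cy = imaginary[gid];
    float x = 0.0f, y = 0.0f;
    int iterations = 0;
    while (iterations < max_iterations && x*x + y*y < 4.0f) {
        float temp = x*x - y*y + cx;
        y = 2.0f * x * y + cy;
        x = temp;
        iterations++;
    }
    output[gid] = iterations;
}
"""

def mandelbrot_reference(real, imag, max_iterations):
    """Plain-Python version of the same loop, useful for checking results."""
    out = np.zeros(real.shape, dtype=np.int32)
    for i in range(real.size):
        cx, cy = float(real[i]), float(imag[i])
        x = y = 0.0
        n = 0
        while n < max_iterations and x * x + y * y < 4.0:
            x, y = x * x - y * y + cx, 2.0 * x * y + cy
            n += 1
        out[i] = n
    return out

def run_mandelbrot(real, imag, max_iterations):
    """Run the kernel on 1-D coordinate arrays (needs PyOpenCL and a device)."""
    if cl is None:
        raise RuntimeError("PyOpenCL is not available")
    real = np.ascontiguousarray(real, dtype=np.float32)
    imag = np.ascontiguousarray(imag, dtype=np.float32)
    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    mf = cl.mem_flags
    real_gpu = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=real)
    imag_gpu = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=imag)
    result = np.empty(real.shape, dtype=np.int32)
    output = cl.Buffer(ctx, mf.WRITE_ONLY, result.nbytes)

    knl = cl.Program(ctx, KERNEL_SRC).build().mandelbrot
    # Buffers get None; the by-value iteration limit is a 32-bit int.
    knl.set_scalar_arg_dtypes([None, None, np.int32, None])
    knl(queue, (real.size,), None, real_gpu, imag_gpu, max_iterations, output)
    cl.enqueue_copy(queue, result, output)  # copy device results back to host
    return result

# Small CPU-only check on four points of the real axis.
points_real = np.array([0.0, 2.0, 1.0, -1.0], dtype=np.float32)
points_imag = np.zeros(4, dtype=np.float32)
reference = mandelbrot_reference(points_real, points_imag, 50)
```

Because the device works in float32 and the reference in Python floats, iteration counts may differ slightly for points very close to the set's boundary, so treat the reference as a sanity check rather than a bit-exact comparison.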
