cuda nvcc make __device__ conditional

Question

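I am trying to add a CUDA backend to a 20k LOC C++ expression template library. So far it works fine, but I am drowning in completely bogus "warning: calling a __host__ function from a __host__ __device__ function is not allowed" warnings.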

Most of the code can be summarized like this:

template<class Impl>
struct Wrapper{
    Impl impl;
    // lots and lots of decorator code
    __host__ __device__ void call(){ impl.call();};
};


//Guaranteed to never ever be used on gpu.
struct ImplCPU{
    void call();
};
//Guaranteed to never ever be used on cpu.
struct ImplGPU{
    __host__ __device__ void call();//Actually only __device__, but needed to shut up the compiler as well
};

Wrapper<ImplCPU> wrapCPU;
Wrapper<ImplGPU> wrapGPU;

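In all cases, call() in Wrapper is trivial, while the wrapper itself is a rather complicated beast (only host functions containing meta-information). Conditional compilation is not an option, because both paths are intended to be used side by side.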

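I am one step short of "--disable-warnings", because honestly the cost of copying and maintaining 10k LOC of horrible template magic outweighs the benefit of the warnings.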

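I would be very happy to have a way to mark call() as __host__ __device__ conditionally, depending on whether the implementation is for the GPU or the CPU (Impl knows what it is for).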

Just to show how bad it is, here is a single warning:

/home/user/Remora/include/remora/detail/matrix_expression_classes.hpp(859): warning: calling a __host__ function from a __host__ __device__ function is not allowed
          detected during:
            instantiation of "remora::matrix_matrix_prod<MatA, MatB>::size_type remora::matrix_matrix_prod<MatA, MatB>::size1() const [with MatA=remora::dense_triangular_proxy<const float, remora::row_major, remora::lower, remora::hip_tag>, MatB=remora::matrix<float, remora::column_major, remora::hip_tag>]" 
/home/user/Remora/include/remora/cpu/../assignment.hpp(258): here
            instantiation of "MatA &remora::assign(remora::matrix_expression<MatA, Device> &, const remora::matrix_expression<MatB, Device> &) [with MatA=remora::dense_matrix_adaptor<float, remora::row_major, remora::continuous_dense_tag, remora::hip_tag>, MatB=remora::matrix_matrix_prod<remora::dense_triangular_proxy<const float, remora::row_major, remora::lower, remora::hip_tag>, remora::matrix<float, remora::column_major, remora::hip_tag>>, Device=remora::hip_tag]" 
/home/user/Remora/include/remora/cpu/../assignment.hpp(646): here
            instantiation of "remora::noalias_proxy<C>::closure_type &remora::noalias_proxy<C>::operator=(const E &) [with C=remora::matrix<float, remora::row_major, remora::hip_tag>, E=remora::matrix_matrix_prod<remora::dense_triangular_proxy<const float, remora::row_major, remora::lower, remora::hip_tag>, remora::matrix<float, remora::column_major, remora::hip_tag>>]" 
/home/user/Remora/Test/hip_triangular_prod.cpp(325): here
            instantiation of "void Remora_hip_triangular_prod::triangular_prod_matrix_matrix_test(Orientation) [with Orientation=remora::row_major]" 
/home/user/Remora/Test/hip_triangular_prod.cpp(527): here

Tags: cuda, nvcc

Answer


This problem is actually a rather unfortunate deficiency in the CUDA language extensions.

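The standard way to deal with these warnings (in Thrust and similar templated CUDA libraries) is to disable the warning for the function/method that triggers it, using #pragma hd_warning_disable or, in newer CUDA (9.0 or later), #pragma nv_exec_check_disable.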

So in your case it would be:

template<class Impl>
struct Wrapper{
    Impl impl;
    // lots and lots of decorator code

    #pragma nv_exec_check_disable
    __host__ __device__ void call(){ impl.call(); }
};
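If the same header also has to build with a plain host compiler, or with a CUDA toolkit older than 9.0, one option is to hide both the execution-space annotation and the pragma behind macros. The following is only a sketch: the macro names HD_FUNC and HD_EXEC_CHECK_DISABLE, the calls counter in ImplCPU, and demo_host_path() are made up for illustration, and it assumes the host compiler accepts the C++11 _Pragma operator.

```cpp
#include <cassert>

// Hypothetical macro names; only the two pragma spellings come from the answer.
#if defined(__CUDACC__)
  #define HD_FUNC __host__ __device__
  #if defined(__CUDACC_VER_MAJOR__) && (__CUDACC_VER_MAJOR__ >= 9)
    #define HD_EXEC_CHECK_DISABLE _Pragma("nv_exec_check_disable")
  #else
    #define HD_EXEC_CHECK_DISABLE _Pragma("hd_warning_disable")
  #endif
#else
  // Plain host compiler: the annotation and the pragma expand to nothing.
  #define HD_FUNC
  #define HD_EXEC_CHECK_DISABLE
#endif

template <class Impl>
struct Wrapper {
    Impl impl;
    // lots and lots of decorator code

    // Suppress the host/device call check for this one member only.
    HD_EXEC_CHECK_DISABLE
    HD_FUNC void call() { impl.call(); }
};

// Stand-in for the question's host-only implementation; the counter is
// an assumption added so the effect of call() can be observed.
struct ImplCPU {
    int calls = 0;
    void call() { ++calls; }
};

// Host-side usage: forwards through the wrapper once and returns the count.
inline int demo_host_path() {
    Wrapper<ImplCPU> wrapCPU;
    wrapCPU.call();
    assert(wrapCPU.impl.calls == 1);
    return wrapCPU.impl.calls;
}
```

The pragma only silences the check for the annotated member; instantiating Wrapper with a host-only Impl and actually calling it from device code is still an error on the user's side, so the "never used on GPU" guarantee from the question remains the author's responsibility.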

A similar question has already been asked.

