PyTorch error: Could not run 'aten::slow_conv3d_forward' with arguments from the 'CUDATensorId' backend

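Problem description

I am training a CNN on a CUDA GPU. It takes 3D medical images as input and outputs a classification. I suspect there may be a bug in PyTorch. I am running PyTorch 1.4.0, and the GPU is a "Tesla P100-PCIE-16GB". When I run the model on CUDA, I get this error: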

Traceback (most recent call last):
  File "/home/ub/miniconda3/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3331, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-55-cc0dd3d9cbb7>", line 1, in <module>
    net(cc)
  File "/home/ub/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "<ipython-input-2-19e11966d1cd>", line 181, in forward
    out = self.layer1(x)
  File "/home/ub/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ub/miniconda3/lib/python3.7/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/home/ub/miniconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ub/miniconda3/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 480, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: Could not run 'aten::slow_conv3d_forward' with arguments from the 'CUDATensorId' backend. 'aten::slow_conv3d_forward' is only available for these backends: [CPUTensorId, VariableTensorId].

To reproduce the problem:

import torch
from torch import nn

# Input is a batch of 64x64x64 3D images with 2 channels
class ConvNet(nn.Module):
    def __init__(self):
        super(ConvNet, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv3d(2, 32, kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=2, stride=2))
        self.layer2 = nn.Sequential(
            nn.Conv3d(32, 64, kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=2, stride=2))
        self.drop_out = nn.Dropout()
        self.fc1 = nn.Linear(16 * 16*16 * 64, 1000)
        self.fc2 = nn.Linear(1000, 2)
        # self.softmax =  nn.LogSoftmax(dim=1)

    def forward(self, x):
        # print(x.shape)
        out = self.layer1(x)
        # print(out.shape)
        out = self.layer2(out)
        # print(out.shape)
        out = out.reshape(out.size(0), -1)
        # print(out.shape)
        out = self.drop_out(out)
        # print(out.shape)
        out = self.fc1(out)
        # print(out.shape)
        out = self.fc2(out)
        # out = self.softmax(out)
        # print(out.shape)
        return out


net = ConvNet()
input = torch.randn(16, 2, 64, 64, 64)
net(input)

Tags: debugging, pytorch

Solution


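At first, I thought the error message meant that 'aten::slow_conv3d_forward' is not implemented for the GPU (CUDA). But after looking at your network, that did not make sense to me: Conv3d is a very basic operation, and the PyTorch team would surely have implemented it for CUDA.

I then dug into the source code a little and found that the problem is caused by the input not being a CUDA tensor while the model is on the GPU.

Here is a working example: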
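One way to confirm this kind of mismatch before calling the model is to compare the devices of the parameters with the device of the input. The following is only a minimal sketch: the helper name `devices_of` and the small `Conv3d` layer are hypothetical and used purely for illustration. It runs on the CPU, so no mismatch is reported here; with a model moved to the GPU and a CPU input, the two sets would differ.

```python
import torch
from torch import nn


def devices_of(module: nn.Module, *tensors: torch.Tensor):
    """Return the set of devices holding the module's parameters
    and the set of devices holding the given tensors."""
    param_devices = {p.device for p in module.parameters()}
    input_devices = {t.device for t in tensors}
    return param_devices, input_devices


# Hypothetical small layer and input, for illustration only.
layer = nn.Conv3d(2, 4, kernel_size=3, padding=1)
x = torch.randn(1, 2, 8, 8, 8)

param_devices, input_devices = devices_of(layer, x)
mismatch = param_devices != input_devices

print("parameters on:", param_devices)
print("inputs on:", input_devices)
print("device mismatch:", mismatch)
```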

import torch
from torch import nn

# Input is a batch of 64x64x64 3D images with 2 channels
class ConvNet(nn.Module):
    def __init__(self):
        super(ConvNet, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv3d(2, 32, kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=2, stride=2))
        self.layer2 = nn.Sequential(
            nn.Conv3d(32, 64, kernel_size=5, stride=1, padding=2),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=2, stride=2))
        self.drop_out = nn.Dropout()
        self.fc1 = nn.Linear(16 * 16*16 * 64, 1000)
        self.fc2 = nn.Linear(1000, 2)
        # self.softmax =  nn.LogSoftmax(dim=1)

    def forward(self, x):
        # print(x.shape)
        out = self.layer1(x)
        # print(out.shape)
        out = self.layer2(out)
        # print(out.shape)
        out = out.reshape(out.size(0), -1)
        # print(out.shape)
        out = self.drop_out(out)
        # print(out.shape)
        out = self.fc1(out)
        # print(out.shape)
        out = self.fc2(out)
        # out = self.softmax(out)
        # print(out.shape)
        return out


net = ConvNet()
input = torch.randn(16, 2, 64, 64, 64)
net.cuda()
input = input.cuda()  # IMPORTANT: reassign, because .cuda() returns a new tensor
net(input)

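Keep in mind that when you move a model from the CPU to the GPU, you can simply call .cuda() on it, because the module's parameters are moved in place. When you move a tensor from the CPU to the GPU, however, .cuda() returns a copy, so you need to reassign it, e.g. tensor = tensor.cuda(), rather than just calling tensor.cuda(). Hope that helps.

Output: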

tensor([[-0.1588,  0.0680],
        [ 0.1514,  0.2078],
        [-0.2272, -0.2835],
        [-0.1105,  0.0585],
        [-0.2300,  0.2517],
        [-0.2497, -0.1019],
        [ 0.1357, -0.0475],
        [-0.0341, -0.3267],
        [-0.0207, -0.0451],
        [-0.4821, -0.0107],
        [-0.1779,  0.1247],
        [ 0.1281,  0.1830],
        [-0.0595, -0.1259],
        [-0.0545,  0.1838],
        [-0.0033, -0.1353],
        [ 0.0098, -0.0957]], device='cuda:0', grad_fn=<AddmmBackward>)
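The same rule (modules move in place, tensors must be reassigned) can also be written in a device-agnostic way with `.to(device)`. This is a sketch under assumptions, not part of the original answer: the tiny `nn.Linear` model is a hypothetical stand-in for the network, and the code falls back to the CPU when no CUDA device is available.

```python
import torch
from torch import nn

# Assumption: use the GPU when one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2)  # hypothetical stand-in for the real network
x = torch.randn(3, 4)

model.to(device)  # a module's parameters are moved in place
x = x.to(device)  # a tensor is not moved in place: keep the returned tensor

out = model(x)
print(out.shape, out.device)
```

Writing the code this way keeps the model and its input on the same device whether or not a GPU is present.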

