Error when running a CNN algorithm with parallel processing on the GPU

Problem description

> Let's use 1 GPUs! Training for 5 epochs
> /opt/conda/conda-bld/pytorch_1579040055865/work/aten/src/THCUNN/ClassNLLCriterion.cu:106: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [18,0,0] Assertion `t >= 0 && t < n_classes` failed.
> THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1579040055865/work/aten/src/THCUNN/generic/ClassNLLCriterion.cu line=110 error=59 : device-side assert triggered
> Traceback (most recent call last):
>   File "CNN_hair_prediction.py", line 321, in <module>
>     (train_model,loss_tr)=train()
>   File "CNN_hair_prediction.py", line 237, in train
>     loss = criterion(output, labels)
>   File "/home/jain.he/.conda/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
>     result = self.forward(*input, **kwargs)
>   File "/home/jain.he/.conda/envs/py37/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 916, in forward
>     ignore_index=self.ignore_index, reduction=self.reduction)
>   File "/home/jain.he/.conda/envs/py37/lib/python3.7/site-packages/torch/nn/functional.py", line 2021, in cross_entropy
>     return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)
>   File "/home/jain.he/.conda/envs/py37/lib/python3.7/site-packages/torch/nn/functional.py", line 1838, in nll_loss
>     ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)

RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1579040055865/work/aten/src/THCUNN/generic/ClassNLLCriterion.cu:110

I got the error above while running my Python script, CNN_hair_prediction. I have tried several different fixes, but none of them worked.

Can anyone help me solve this problem?

Tags: python, algorithm, parallel-processing, conv-neural-network, torch

Solution
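
The assertion in the log, `t >= 0 && t < n_classes`, indicates that at least one target value passed to the loss lies outside the range `[0, n_classes)`. The script itself is not shown, so the cause can only be guessed at: two common ones are labels numbered from 1 instead of 0, and a final layer with fewer outputs than there are classes. Below is a minimal sketch of a check that could be run on the labels before calling `criterion(output, labels)`. The helper name `find_out_of_range_labels` and the sample values are hypothetical, not taken from the original script.

```python
def find_out_of_range_labels(labels, n_classes):
    """Return the sorted distinct label values outside [0, n_classes).

    `labels` is any iterable of ints; for a PyTorch tensor, pass
    `labels.tolist()` (assumes a 1-D tensor of class indices).
    """
    return sorted({int(t) for t in labels if not 0 <= int(t) < n_classes})


# Hypothetical example: 3 classes, but the labels were numbered 1..3.
labels = [1, 2, 3, 1, 3]
n_classes = 3  # should equal the size of the model's output dimension

bad = find_out_of_range_labels(labels, n_classes)
print(bad)  # [3]

# If the labels are simply 1-based, shifting them restores the valid range.
shifted = [t - 1 for t in labels]
print(find_out_of_range_labels(shifted, n_classes))  # []
```

If the check reports nothing, it may help to run the script on the CPU or with the environment variable `CUDA_LAUNCH_BLOCKING=1` set. CUDA kernels report errors asynchronously, so the Python line named in the traceback is not always the one that actually failed.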

