PyTorch CUDA error (CUBLAS_STATUS_EXECUTION_FAILED)

Problem description

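I am running a Python training script.

I am using an RTX 3080, WSL2, and Docker.

When I check whether Docker can see the GPU, it does find it.

The code requires PyTorch 1.2.0 and CUDA 10.0.

Is there any way to run this code or fix the error below?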

Traceback (most recent call last):
  File "oscar/run_captioning.py", line 884, in <module>
    main()
  File "oscar/run_captioning.py", line 863, in main
    global_step, avg_loss = train(args, train_dataset, val_dataset, model, tokenizer)
  File "oscar/run_captioning.py", line 434, in train
    outputs = model(**inputs)
  File "/home/apple/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/d/oscar/oscar/modeling/modeling_bert.py", line 440, in forward
    return self.encode_forward(*args, **kwargs)
  File "/mnt/d/oscar/oscar/modeling/modeling_bert.py", line 448, in encode_forward
    encoder_history_states=encoder_history_states)
  File "/home/apple/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/d/oscar/oscar/modeling/modeling_bert.py", line 271, in forward
    encoder_history_states=encoder_history_states)
  File "/home/apple/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/d/oscar/oscar/modeling/modeling_bert.py", line 109, in forward
    history_state)
  File "/home/apple/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/d/oscar/oscar/modeling/modeling_bert.py", line 140, in forward
    head_mask, history_state)
  File "/home/apple/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/d/oscar/oscar/modeling/modeling_bert.py", line 82, in forward
    self_outputs = self.self(input_tensor, attention_mask, head_mask, history_state)
  File "/home/apple/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/mnt/d/oscar/oscar/modeling/modeling_bert.py", line 36, in forward
    mixed_query_layer = self.query(hidden_states)
  File "/home/apple/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/apple/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/apple/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/functional.py", line 1371, in linear
    output = input.matmul(weight.t())
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`

Tags: pytorch

Solution
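One possible cause, offered as an assumption rather than a confirmed diagnosis, is a mismatch between the GPU and the CUDA toolkit the PyTorch build was compiled against. The RTX 3080 is an Ampere card with compute capability 8.6, which NVIDIA's toolkits target starting with CUDA 11.1, while the question asks for PyTorch 1.2.0 with CUDA 10.0. The sketch below compares the two. The table `MIN_CUDA_FOR_CAPABILITY` and the helpers `parse_version`, `build_supports_gpu`, and `describe_local_setup` are hypothetical names introduced for illustration; only `torch.cuda.is_available`, `torch.cuda.get_device_capability`, and `torch.version.cuda` are real PyTorch calls.

```python
# Minimum CUDA toolkit version able to target each GPU compute capability.
# Assumption: values follow NVIDIA's public release notes; verify for your GPU.
MIN_CUDA_FOR_CAPABILITY = {
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing
    (8, 0): (11, 0),  # Ampere (A100)
    (8, 6): (11, 1),  # Ampere (e.g. RTX 3080)
}


def parse_version(text):
    """Turn a version string such as '10.0' into a (major, minor) tuple."""
    return tuple(int(part) for part in text.split(".")[:2])


def build_supports_gpu(capability, cuda_version):
    """Return True/False if the toolkit can target the GPU, None if unknown."""
    required = MIN_CUDA_FOR_CAPABILITY.get(tuple(capability))
    if required is None:
        return None  # capability not listed in the table above
    return parse_version(cuda_version) >= required


def describe_local_setup():
    """Report the GPU capability and the CUDA version of the installed torch."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    if torch.version.cuda is None or not torch.cuda.is_available():
        return "this torch build cannot use CUDA here"
    capability = torch.cuda.get_device_capability(0)
    cuda_version = torch.version.cuda
    verdict = build_supports_gpu(capability, cuda_version)
    return "capability={} cuda={} supported={}".format(
        capability, cuda_version, verdict
    )


if __name__ == "__main__":
    # The combination from the question: RTX 3080 with a CUDA 10.0 build.
    print(build_supports_gpu((8, 6), "10.0"))
    print(describe_local_setup())
```

If the check reports a mismatch, the usual direction is to install a PyTorch build compiled against CUDA 11.1 or newer. Whether the Oscar code runs unchanged on a newer PyTorch is not something the question confirms, so some porting may be needed.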

