pytorch - PyTorch CUDA error (CUBLAS_STATUS_EXECUTION_FAILED)
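Problem description
I am running Python training code.
I am using an RTX 3080 with WSL2 and Docker.
When I check whether Docker can see the GPU, it does.
The code requires PyTorch 1.2.0 and CUDA 10.0.
Is there any way to run this code or fix the error?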
Traceback (most recent call last):
File "oscar/run_captioning.py", line 884, in <module>
main()
File "oscar/run_captioning.py", line 863, in main
global_step, avg_loss = train(args, train_dataset, val_dataset, model, tokenizer)
File "oscar/run_captioning.py", line 434, in train
outputs = model(**inputs)
File "/home/apple/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/mnt/d/oscar/oscar/modeling/modeling_bert.py", line 440, in forward
return self.encode_forward(*args, **kwargs)
File "/mnt/d/oscar/oscar/modeling/modeling_bert.py", line 448, in encode_forward
encoder_history_states=encoder_history_states)
File "/home/apple/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/mnt/d/oscar/oscar/modeling/modeling_bert.py", line 271, in forward
encoder_history_states=encoder_history_states)
File "/home/apple/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/mnt/d/oscar/oscar/modeling/modeling_bert.py", line 109, in forward
history_state)
File "/home/apple/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/mnt/d/oscar/oscar/modeling/modeling_bert.py", line 140, in forward
head_mask, history_state)
File "/home/apple/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/mnt/d/oscar/oscar/modeling/modeling_bert.py", line 82, in forward
self_outputs = self.self(input_tensor, attention_mask, head_mask, history_state)
File "/home/apple/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/mnt/d/oscar/oscar/modeling/modeling_bert.py", line 36, in forward
mixed_query_layer = self.query(hidden_states)
File "/home/apple/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
result = self.forward(*input, **kwargs)
File "/home/apple/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 87, in forward
return F.linear(input, self.weight, self.bias)
File "/home/apple/anaconda3/envs/py37/lib/python3.7/site-packages/torch/nn/functional.py", line 1371, in linear
output = input.matmul(weight.t())
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)`
Solution
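One likely cause, offered as a hypothesis rather than a confirmed diagnosis: the RTX 3080 is an Ampere GPU (compute capability 8.6), and a CUDA 10.0 toolkit predates Ampere, so the cuBLAS shipped with a PyTorch 1.2.0 / CUDA 10.0 build has no kernels for it. The sketch below checks that pairing. The table `MIN_CUDA_FOR_MAJOR` and the helpers `parse_version`, `cuda_build_supports`, and `check_installed_torch` are hypothetical names introduced here for illustration; the table is a small subset and assumes that binaries built for one major compute capability run on later minor revisions of the same major.

```python
# Earliest CUDA toolkit release that can generate native code for GPUs of a
# given major compute capability. Illustrative subset, not a complete table.
MIN_CUDA_FOR_MAJOR = {
    6: (8, 0),   # Pascal
    7: (9, 0),   # Volta / Turing
    8: (11, 0),  # Ampere (the RTX 3080 reports capability 8.6)
}


def parse_version(text):
    """Turn a version string such as '10.0' or '10.0.130' into (major, minor)."""
    parts = text.split(".")
    return int(parts[0]), int(parts[1])


def cuda_build_supports(capability, cuda_version):
    """Return True if a build against `cuda_version` can target `capability`.

    `capability` is a (major, minor) tuple as returned by
    torch.cuda.get_device_capability(); `cuda_version` is a string such as
    torch.version.cuda. Raises KeyError for architectures not in the table.
    """
    required = MIN_CUDA_FOR_MAJOR[capability[0]]
    return parse_version(cuda_version) >= required


def check_installed_torch():
    """Run the check against the local PyTorch install, if there is one.

    Returns True/False, or None when the check cannot be performed.
    """
    try:
        import torch
    except ImportError:
        return None
    if torch.version.cuda is None or not torch.cuda.is_available():
        return None
    capability = torch.cuda.get_device_capability(0)
    try:
        return cuda_build_supports(capability, torch.version.cuda)
    except KeyError:
        return None


if __name__ == "__main__":
    # The setup described in the question: RTX 3080 with a CUDA 10.0 build.
    print("RTX 3080 + CUDA 10.0:", cuda_build_supports((8, 6), "10.0"))
    print("Local install:", check_installed_torch())
```

If the check returns False for the local install, the usual route is to install a PyTorch build compiled against CUDA 11 or later, which means a PyTorch release newer than the 1.2.0 the code asks for. Whether the Oscar code runs unchanged on a newer release is not something the traceback shows, so some adaptation may be needed.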