pytorch - PyTorch RuntimeError: CUDA out of memory. Tried to allocate 14.12 GiB
Problem description
I am running into a CUDA out-of-memory error with a simple fully connected model. I have tried torch.cuda.empty_cache() and gc.collect(), deleted unnecessary variables with del, and tried reducing the batch size, but the error persists. Moreover, the error only appears on the SUN dataset, which is evaluated with 1440 test images; the same code runs fine on the AWA2 dataset, whose test set has 7913 images. I am using Google Colab here, and I have also tried an RTX 2060. This is the code snippet where the error occurs:
def euclidean_dist(x, y):
    # x: N x D
    # y: M x D
    torch.cuda.empty_cache()
    n = x.size(0)
    m = y.size(0)
    d = x.size(1)
    assert d == y.size(1)
    x = x.unsqueeze(1).expand(n, m, d)
    y = y.unsqueeze(0).expand(n, m, d)
    del n, m, d
    return torch.pow(x - y, 2).sum(2)
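As an aside on this function: the two expand calls followed by `(x - y)` materialize an N x M x D temporary, which is exactly the huge allocation the traceback below complains about. A sketch of a lower-memory variant, using the identity ||x - y||² = ||x||² + ||y||² - 2·x·y so that only an N x M result is ever allocated (the name euclidean_dist_lowmem is made up for illustration):

```python
import torch

def euclidean_dist_lowmem(x, y):
    # x: N x D, y: M x D -> N x M matrix of squared distances.
    # Expands ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y, so no N x M x D
    # intermediate tensor is ever created.
    x_sq = x.pow(2).sum(1, keepdim=True)      # N x 1
    y_sq = y.pow(2).sum(1, keepdim=True).t()  # 1 x M
    # clamp_min guards against tiny negative values from floating-point error
    return (x_sq + y_sq - 2.0 * x @ y.t()).clamp_min(0.0)
```

PyTorch's built-in torch.cdist(x, y) ** 2 computes the same quantity and is another option if exact parity with the original function is not required.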
def compute_accuracy(test_att, test_visual, test_id, test_label):
    global s2v
    s2v.eval()
    with torch.no_grad():
        test_att = Variable(torch.from_numpy(test_att).float().to(device))
        test_visual = Variable(torch.from_numpy(test_visual).float().to(device))
        outpre = s2v(test_att, test_visual)
        del test_att, test_visual
        outpre = torch.argmax(torch.softmax(outpre, dim=1), dim=1)
    outpre = test_id[outpre.cpu().data.numpy()]
    # compute averaged per-class accuracy
    test_label = np.squeeze(np.asarray(test_label))
    test_label = test_label.astype("float32")
    unique_labels = np.unique(test_label)
    acc = 0
    for l in unique_labels:
        idx = np.nonzero(test_label == l)[0]
        acc += accuracy_score(test_label[idx], outpre[idx])
    acc = acc / unique_labels.shape[0]
    return acc
The error is:
Traceback (most recent call last):
  File "GBU_new_v2.py", line 234, in <module>
    acc_seen_gzsl = compute_accuracy(attribute, x_test_seen, np.arange(len(attribute)), test_label_seen)
  File "GBU_new_v2.py", line 111, in compute_accuracy
    outpre = s2v(test_att, test_visual)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "GBU_new_v2.py", line 80, in forward
    a1 = euclidean_dist(feat, a1)
  File "GBU_new_v2.py", line 62, in euclidean_dist
    return torch.pow(x - y, 2).sum(2)#.sqrt() # return: N x M
RuntimeError: CUDA out of memory. Tried to allocate 14.12 GiB (GPU 0; 15.90 GiB total capacity; 14.19 GiB already allocated; 669.88 MiB free; 14.55 GiB reserved in total by PyTorch)
Solution
It seems you only defined batches for training, while during testing you try to push the entire test set through the model at once.
You should split your test set into smaller "batches", evaluate one batch at a time, and finally combine the per-batch scores into a single score for the model.
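The advice above can be sketched as follows. This is a minimal illustration, not the asker's actual code: the helper name evaluate_in_batches, the batch_size value, and the assumption that the model takes (attributes, visual features) and returns one score row per visual sample all mirror the question's snippet but are otherwise hypothetical:

```python
import torch

@torch.no_grad()  # no gradients are needed at test time, saving memory
def evaluate_in_batches(model, test_att, test_visual, batch_size=64):
    # Feed the test visuals through the model one mini-batch at a time,
    # so the huge pairwise-distance tensor is never built for the whole
    # test set at once; predictions are concatenated at the end.
    model.eval()
    preds = []
    for start in range(0, test_visual.size(0), batch_size):
        chunk = test_visual[start:start + batch_size]
        out = model(test_att, chunk)           # scores for this chunk only
        preds.append(out.argmax(dim=1).cpu())  # move results off the GPU
    return torch.cat(preds)
```

The per-class accuracy loop at the end of compute_accuracy can then run unchanged on the concatenated predictions, since batching only changes how the forward passes are scheduled, not their results.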