tf.keras.Model.load_weights() raises ResourceExhaustedError

Problem description

I have two ipynb files: train.ipynb and predict.ipynb. I trained a model with fit_generator (batch size 64), and I get a ResourceExhaustedError when I try to load the weights in predict.ipynb. I am using Keras with TensorFlow v1.9 in the TensorFlow Docker image.

# train.ipynb
# Import assumed; the original post does not show it.
from tensorflow.keras.backend import clear_session

def network():
    # [A normal model]
    return model

model = network()
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# seq is the generator used for training (batch size 64)
model.fit_generator(seq, shuffle=True, epochs=10, verbose=1)

# Save the architecture and the weights after training
with open('model.json', 'w') as json_file:
    json_file.write(model.to_json())
model.save_weights('model.h5')
clear_session()  # tried to clear the session here
# Both files were saved successfully
# model.h5 (131 MB)

After saving successfully, I can load the model back inside train.ipynb. However, when I do the same thing in predict.ipynb, I get an error.

# train.ipynb
# Import assumed; the original post does not show it.
from tensorflow.keras.models import model_from_json

with open('model.json', 'r') as json_file:
    test_model = model_from_json(json_file.read())
test_model.load_weights('model.h5')
# No error here

# predict.ipynb
from tensorflow.keras.models import model_from_json

with open('model.json', 'r') as json_file:
    test_model = model_from_json(json_file.read())
test_model.load_weights('model.h5')
# Got the following error
ResourceExhaustedError: OOM when allocating tensor with shape[28224,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc

Any help is appreciated!

Tags: python-3.x, tensorflow, keras, resources

Solution


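Are you running both notebooks at the same time? Your GPU is out of memory. Run nvidia-smi on the command line to check the GPU's resource usage, but note that by default TensorFlow claims all available GPU memory, so the kernel behind train.ipynb can hold nearly the whole card even after training has finished. keras.backend.clear_session() may also help.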
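As a rough sketch of both suggestions: the helpers below read per-GPU memory usage from nvidia-smi and ask TensorFlow 1.x to allocate GPU memory on demand instead of claiming it all up front. The function names are made up for illustration, and the sketch assumes nvidia-smi is on the PATH inside the container and that the model uses tf.keras (with standalone Keras, keras.backend.set_session would be the counterpart). For scale, the tensor in the error message, shape [28224, 1024] of type float (32-bit), needs 28224 × 1024 × 4 bytes, about 110 MiB, so the failure suggests almost no GPU memory was free.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' rows (in MiB) from nvidia-smi CSV output.

    Expects the output of:
      nvidia-smi --query-gpu=memory.used,memory.total --format=csv,noheader,nounits
    Returns one (used_mib, total_mib) tuple per GPU.
    """
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Run nvidia-smi and return (used_mib, total_mib) for each GPU."""
    output = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=memory.used,memory.total",
        "--format=csv,noheader,nounits",
    ])
    return parse_gpu_memory(output.decode())


def enable_gpu_memory_growth():
    """Make the TF 1.x session grow its GPU allocation as needed.

    Uses the TF 1.x API (tf.ConfigProto, tf.Session); in TF 2.x these
    names live under tf.compat.v1. Call this before building or loading
    any model in the notebook.
    """
    import tensorflow as tf
    from tensorflow.keras import backend as K

    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True  # allocate on demand
    K.set_session(tf.Session(config=config))


# Hypothetical use at the top of predict.ipynb:
#   print(query_gpu_memory())    # how much memory is already taken?
#   enable_gpu_memory_growth()   # then load model.json and model.h5
```

If the first call shows the card nearly full before predict.ipynb has loaded anything, shutting down or restarting the train.ipynb kernel should release the memory; allow_growth only helps when it is set in every notebook that shares the GPU.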

