TensorFlow: loaded model does not run on the GPU

Problem description

The problem I am facing is that a loaded model is not assigned to the GPU. I am using TensorFlow 2.0 on Ubuntu, and this happens on multiple platforms: 1. an Intel platform, 2. a Jetson TX2.

Here is the sample code.

import tensorflow as tf

model = tf.keras.Sequential([...])
model.compile(...)
model.fit(...)

model.save("./mod.h5")

loaded_model = tf.keras.models.load_model("./mod.h5")

with tf.device("/GPU:0"):
    model.predict(...)  ## This will run on GPU

print("Start loaded model")
with tf.device("/GPU:0"):
    loaded_model.predict(...)   ## This version loads all the corresponding CUDA lib but runs on 
                                ## CPU with one thread

print("End loaded model")

Here are some of the logs I got with debug info enabled:

Start loaded model

... (lots of GPU-related log lines similar to the following)
2020-03-19 11:54:35.173142: I tensorflow/core/common_runtime/placer.cc:54] sequential/bidirectional/backward_lstm/while/body/_62/add_2/y: (Const): /job:localhost/replica:0/task:0/device:GPU:0
sequential/bidirectional/backward_lstm/while/body/_62/add_3/y: (Const): /job:localhost/replica:0/task:0/device:GPU:0
2020-03-19 11:54:35.173217: I tensorflow/core/common_runtime/placer.cc:54] sequential/bidirectional/backward_lstm/while/body/_62/add_3/y: (Const): /job:localhost/replica:0/task:0/device:GPU:0

(My comment: according to the placer log, everything above is placed on the GPU, but the prediction actually runs on the CPU. The GPU version takes 2 minutes, whereas the CPU version takes 20 minutes.)

2020-03-19 12:11:09.433786: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Out of range: End of sequence
         [[{{node IteratorGetNext}}]]
         [[sequential/embedding/embedding_lookup/_5]]
2020-03-19 12:11:09.434200: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Out of range: End of sequence
         [[{{node IteratorGetNext}}]]

End loaded model

May I know how to make loaded_model run on the GPU? Is there some option I am missing when saving or loading the model? (A sketch of the kind of options I mean follows below.)
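
By "some option" I mean, for example, saving in the TensorFlow SavedModel format instead of HDF5, or loading the model inside the device scope. Here is a rough sketch of what I have in mind (the directory name is a placeholder, and I do not know whether either variant actually changes the placement):

import tensorflow as tf

# Variant 1 (assumption): save/load as a SavedModel directory instead of an HDF5 file
model.save("./mod_savedmodel", save_format="tf")   # placeholder path
loaded_model = tf.keras.models.load_model("./mod_savedmodel")

# Variant 2 (assumption): perform the load itself inside the device scope
with tf.device("/GPU:0"):
    loaded_model = tf.keras.models.load_model("./mod.h5")
    loaded_model.predict(...)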

Thanks for your time and help!

Tags: tensorflow
