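python - Running TensorFlow with one graph in multiple processes
Problem description
I am trying to train a classifier built as an ensemble of 5 networks. I decided to train each network on a different batch, so I want to create multiple processes to save time.
Here is my algorithm design: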
import multiprocessing as mp
import tensorflow as tf

# create() returns 5 optimizers for 5 networks, i.e. len(opt_list) == 5
opt_list = create()

def sub_process(sess, opt, feed_batch):
    sess.run(opt, feed_dict=feed_batch)

batch_list = []
for i in range(5):
    batch = generate_batch(batch_size=100)
    batch_list.append(batch)

# sess is the tf.Session created in the parent process (its creation is not shown)
processes = []
for i in range(5):
    p = mp.Process(target=sub_process, args=(sess, opt_list[i], batch_list[i]))
    p.start()
    processes.append(p)

# Join every process, not only the last one started
for p in processes:
    p.join()
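First, I build the graph and place each network on one of 5 different devices (I have 5 GPUs in total).
Then I draw samples from the dataset (for example, if I want to feed 100 images to each network, I generate 500 samples).
Next, I use the Python 3 multiprocessing package to create 5 processes. Each process runs the sub_process function with the given arguments.
However, when I run the code, I get the following errors: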
2018-08-14 18:13:56.776853: E tensorflow/core/grappler/clusters/utils.cc:82] Failed to get device properties, error code: 3
2018-08-14 18:13:56.776940: E tensorflow/core/grappler/clusters/utils.cc:82] Failed to get device properties, error code: 3
2018-08-14 18:13:56.776978: E tensorflow/core/grappler/clusters/utils.cc:82] Failed to get device properties, error code: 3
2018-08-14 18:13:56.777004: E tensorflow/core/grappler/clusters/utils.cc:82] Failed to get device properties, error code: 3
2018-08-14 18:13:56.830762: E tensorflow/core/grappler/clusters/utils.cc:82] Failed to get device properties, error code: 3
2018-08-14 18:13:56.831239: E tensorflow/core/grappler/clusters/utils.cc:82] Failed to get device properties, error code: 3
2018-08-14 18:13:56.831262: E tensorflow/core/grappler/clusters/utils.cc:82] Failed to get device properties, error code: 3
2018-08-14 18:13:56.831285: E tensorflow/core/grappler/clusters/utils.cc:82] Failed to get device properties, error code: 3
2018-08-14 18:13:56.902612: E tensorflow/core/grappler/clusters/utils.cc:82] Failed to get device properties, error code: 3
2018-08-14 18:13:57.654653: E tensorflow/stream_executor/cuda/cuda_driver.cc:1227] failed to enqueue async memcpy from host to device: CUDA_ERROR_NOT_INITIALIZED; GPU dst: 0x1085d87f000; host src: 0x1083783f700; size: 4=0x4
2018-08-14 18:13:57.660200: E tensorflow/stream_executor/cuda/cuda_driver.cc:1227] failed to enqueue async memcpy from host to device: CUDA_ERROR_NOT_INITIALIZED; GPU dst: 0x1085d87f000; host src: 0x1083783f700; size: 4=0x4
2018-08-14 18:13:57.758658: E tensorflow/stream_executor/cuda/cuda_driver.cc:1227] failed to enqueue async memcpy from host to device: CUDA_ERROR_NOT_INITIALIZED; GPU dst: 0x1085d87f000; host src: 0x1083783f700; size: 4=0x4
2018-08-14 18:13:57.808281: E tensorflow/stream_executor/cuda/cuda_driver.cc:1227] failed to enqueue async memcpy from host to device: CUDA_ERROR_NOT_INITIALIZED; GPU dst: 0x1085d87f000; host src: 0x1083783f700; size: 4=0x4
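Can anyone tell me why this error occurs? What should I change in my code to get what I want?
Thanks!
Solution
I suggest taking a look at tf.contrib.distribute. It has a nice API for getting good performance out of multiple GPUs.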
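If you would rather keep the one-process-per-network design, here is a rough sketch of an alternative; it is not part of the suggestion above. A frequently reported cause of CUDA_ERROR_NOT_INITIALIZED in this kind of setup is that the parent process has already initialized CUDA (by creating the session) before the children are started, and a tf.Session is not designed to be handed to another process. A common workaround is to pin each process to its own GPU through the CUDA_VISIBLE_DEVICES environment variable and to import TensorFlow and build the graph and session only inside the child. The code below shows just that process layout using the standard library. The names run_ensemble and train_worker and the start_method parameter are hypothetical, and the TensorFlow calls are left as comments because create() and generate_batch() are not shown in the question.

```python
import multiprocessing as mp
import os


def train_worker(gpu_index, result_queue):
    """Runs in a child process: claim one GPU, then train one network."""
    # Restrict this process to a single GPU *before* TensorFlow is imported,
    # so that CUDA is initialized only inside the child.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_index)

    # Assumed placement of the question's own code (not runnable here):
    #   import tensorflow as tf
    #   build one network and its optimizer `opt`
    #   with tf.Session() as sess:
    #       sess.run(opt, feed_dict=generate_batch(batch_size=100))

    # Report which device this worker was pinned to.
    result_queue.put((gpu_index, os.environ["CUDA_VISIBLE_DEVICES"]))


def run_ensemble(num_workers=5, start_method="spawn"):
    """Starts one process per network and returns {gpu_index: visible_devices}."""
    ctx = mp.get_context(start_method)
    result_queue = ctx.Queue()
    processes = [
        ctx.Process(target=train_worker, args=(i, result_queue))
        for i in range(num_workers)
    ]
    for p in processes:
        p.start()
    # Drain the queue before joining so no child blocks on a full pipe.
    results = dict(result_queue.get(timeout=60) for _ in processes)
    for p in processes:
        p.join()
    return results
```

With the "spawn" start method, the call (for example results = run_ensemble(num_workers=5)) has to sit under an if __name__ == "__main__": guard. Each child then starts from a fresh interpreter with no inherited CUDA state, at the cost of every process loading its own copy of the model.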