TensorFlow check failed: work_element_count > 0

Problem description

Does anyone know how to handle the TensorFlow 'work_element_count' error?

F ./tensorflow/core/util/cuda_launch_config.h:127] Check failed: work_element_count > 0 (0 vs. 0)
Aborted (core dumped)

Here is part of my source code:

import sys

import numpy as np
import tensorflow as tf

class DiscriminatorModel:
    def __init__(self, session, some_parameters):
        self.sess = session
        self.parameters = some_parameters

    def build_feed_dict(self, input_frames, gt_output_frames, generator):
        feed_dict = {}
        batch_size = np.shape(gt_output_frames)[0]
        print(batch_size) # 1

        print(np.shape(generator.input_frames_train))   # (?,7,32,32,32,1)
        print(np.shape(input_frames))                   # (1,7,32,32,32,1)
        print(np.shape(generator.gt_frames_train))      # (?,7,32,32,32,1)
        print(np.shape(gt_output_frames))               # (1,7,32,32,32,1)

        g_feed_dict={generator.input_frames_train:input_frames,
                     generator.gt_frames_train:gt_output_frames}

        def getshape(d):
            if isinstance(d, dict):
                return {k:getshape(d[k]) for k in d}
            else:
                return None
        print("g_feed_dict shape :", getshape(g_feed_dict),"\n")
        # {<tf.Tensor 'generator/data/Placeholder:0' shape=(?, 32, 32, 32, 1) dtype=float32>: None, <tf.Tensor 'generator/data/Placeholder_1:0' shape=(?, 32, 32, 32, 1) dtype=float32>: None}

        print(sys.getsizeof(generator.scale_preds_train))    # 96
        print(sys.getsizeof(g_feed_dict))                    # 288


        # error occurs here.
        g_scale_preds = self.sess.run(generator.scale_preds_train, feed_dict=g_feed_dict)
        # F ./tensorflow/core/util/cuda_launch_config.h:127] Check failed: work_element_count > 0 (0 vs. 0)
        # Aborted (core dumped)

    def train_step(self, batch, generator):
        print(np.shape(batch))    # [1, 7, 32, 32, 32, 2]
        input_frames = batch[:, :, :, :, :, :-1]
        gt_output_frames = batch[:, :, :, :, :, -1:]

        feed_dict = self.build_feed_dict(input_frames, gt_output_frames, generator)

class GeneratorModel:
    def __init__(self, session, some_parameters):
        self.sess = session
        self.parameters = some_parameters

        self.input_frames_train = tf.placeholder(
            tf.float32, shape=[None, 7, 32, 32, 32, 1])
        self.gt_frames_train = tf.placeholder(
            tf.float32, shape=[None, 7, 32, 32, 32, 1])

        self.input_frames_test = tf.placeholder(
            tf.float32, shape=[None, 7, 32, 32, 32, 1])
        self.gt_frames_test = tf.placeholder(
            tf.float32, shape=[None, 7, 32, 32, 32, 1])

        self.scale_preds_train = []
        for p in range(4):
            # scale size, 4 --> 8 --> 16 --> 32
            sc = 4*(2**p)
            # this passes tf.Tensor array of shape (1,7,sc,sc,sc,1)
            train_preds = calculate(self.width_train,
                                    self.height_train,
                                    self.depth_train,
                                    ...)
            self.scale_preds_train.append(train_preds)

        # [ <..Tensor shape=(1,7,4,4,4,1) ....>,
        #   <..Tensor shape=(1,7,8,8,8,1) ....>,
        #   <..Tensor shape=(1,7,16,16,16,1)..>,
        #   <..Tensor shape=(1,7,32,32,32,1)..> ]
        print(self.scale_preds_train)

sess = tf.Session()
d_model = DiscriminatorModel(sess, some_parameters)
g_model = GeneratorModel(sess, some_parameters)
sess.run(tf.global_variables_initializer())

# this returns numpy array of shape [1,7,32,32,32,2]
batch = get_batch()

# trouble here.
d_model.train_step(batch, g_model)

I have looked at a few existing suggestions about this error.

I am using a single 11 GB GPU out of the five available, selected with

~$ CUDA_VISIBLE_DEVICES=2 python3 foo.py

The batch size is 1. Can anyone tell me what I am missing or doing wrong?

Edit 1.

I found a case that avoids this error. If I modify what is passed to sess.run like this:

# ... previous code does not change
print(sys.getsizeof(g_feed_dict))                    # 288
temp_index = 0
temp_input = [generator.scale_preds_train[temp_index],
              generator.scale_preds_train[temp_index],
              generator.scale_preds_train[temp_index],
              generator.scale_preds_train[temp_index]]
# this <temp_input> does not raise the error here.
# however, temp_index > 0 does not work.
g_scale_preds = self.sess.run(temp_input, feed_dict=g_feed_dict)

This makes the fetches passed to sess.run have shapes like

[(1,7,4,4,4,1), (1,7,4,4,4,1), (1,7,4,4,4,1), (1,7,4,4,4,1)]

Originally this was supposed to be a list of scaled shapes, e.g. [(1,7,4,4,4,1), (1,7,8,8,8,1), (1,7,16,16,16,1), (1,7,32,32,32,1)]. Also, the arrays in the feed_dict dictionary have shape (1,7,32,32,32,1).
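To make the expected fetch shapes explicit, here is a minimal sketch that derives them from the scale rule sc = 4*(2**p) used in GeneratorModel. The helper name expected_scale_shapes and its default arguments are assumptions for illustration and are not part of the original code.

```python
def expected_scale_shapes(num_scales=4, batch=1, frames=7, channels=1):
    """List the per-scale prediction shapes implied by sc = 4 * (2 ** p)."""
    shapes = []
    for p in range(num_scales):
        sc = 4 * (2 ** p)  # scale size: 4 -> 8 -> 16 -> 32
        shapes.append((batch, frames, sc, sc, sc, channels))
    return shapes


# Print one expected shape per scale, smallest first.
for shape in expected_scale_shapes():
    print(shape)
```

Comparing this list with the shapes actually printed for scale_preds_train is a quick way to confirm that each scale is built with the intended size.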

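It seems the error comes from tensorflow-gpu trying to access a wrong array index (memory that was never actually allocated), hence "work element count is 0" (but I am not sure yet).

I don't understand why temp_index > 0 (e.g. 1, 2, 3) raises the same Check failed error, while index 0 is the only one that does not.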

Edit 2.

After I changed my GPU from a TITAN Xp to a GeForce GTX, the error log says

Floating point exception (core dumped)

at the same line of code (sess.run).

Tags: python-3.x, tensorflow, deep-learning, gpu

Solution

In my case, one of the convolutional layers had 0 output feature maps, which caused this problem.
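One way to look for such a layer is to scan the static shapes of the layer outputs for a dimension of size 0 before calling sess.run. The sketch below is a hedged illustration in plain Python: the helper find_empty_shapes and the layer names and shapes are hypothetical, and in TensorFlow 1.x the shapes could presumably be collected with tensor.shape.as_list() for each layer output.

```python
def find_empty_shapes(named_shapes):
    """Return the names whose static shape contains a 0-sized dimension.

    `named_shapes` maps a layer or tensor name to a shape list or tuple.
    Unknown dimensions (such as the batch axis) are given as None.
    """
    empty = []
    for name, shape in named_shapes.items():
        # None means "unknown until run time", so only literal zeros count.
        if any(dim == 0 for dim in shape):
            empty.append(name)
    return empty


# Hypothetical shapes for illustration only (not taken from the question).
shapes = {
    "conv1": [None, 32, 32, 32, 16],
    "conv2": [None, 32, 32, 32, 0],  # 0 output feature maps
    "conv3": [None, 32, 32, 32, 1],
}
print(find_empty_shapes(shapes))
```

Any name reported here points at a layer whose filter count (or another size parameter) evaluated to 0 when the graph was built, which gives the GPU kernel nothing to launch on.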

