ValueError: Inconsistent shapes: saw (1152, 10, 1, 10, 16) but expected (1152, 10, 1, 16)

Problem description

I am currently learning CapsNet and tried to move my code from my local machine to Colab. The code runs fine locally, but when I run it on Colab I get this error:

ValueError: Inconsistent shapes: saw (1152, 10, 1, 10, 16) but expected (1152, 10, 1, 16).

When I try other axes such as [3, 1], I get the following error instead. In that case the rank of x is back to 4 and x.shape[3] == y.shape[2]:

ValueError: Cannot do batch_dot on inputs with shapes (1152, 10, 1, 8) and (1152, 10, 8, 16) with axes=[3, 1]. x.shape[3] != y.shape[1] (8 != 10).

I traced the error to the tf.scan call. On my own machine I have TensorFlow 1.13 installed, but I don't know how to fix this. Please help.
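For reference, the shape mismatch can be reproduced with K.batch_dot alone, outside of tf.scan. Below is a minimal sketch using the shapes from the error message; the commented output assumes the Keras version difference identified in the answer further down:

import numpy as np
from keras import backend as K

# x stands for one element scanned out of inputs_tiled,
# W for the capsule layer's transform matrix.
x = K.constant(np.zeros((1152, 10, 1, 8)))
W = K.constant(np.zeros((1152, 10, 8, 16)))

out = K.batch_dot(x, W, axes=[3, 2])
print(K.int_shape(out))
# Keras <= 2.2.4: (1152, 10, 1, 16)     -- matches tf.scan's initializer
# Keras 2.3:      (1152, 10, 1, 10, 16) -- only axis 0 is treated as the batch
#                                          axis, hence the "Inconsistent shapes" error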

Here is the code.

import tensorflow as tf
from keras import layers, initializers
from keras import backend as K


class CapsuleLayer(layers.Layer):

    def __init__(self, num_capsule, dim_vector, num_routing=3,
                 kernel_initializer='glorot_uniform',
                 bias_initializer='zeros',
                 **kwargs):
        super(CapsuleLayer, self).__init__(**kwargs)
        self.num_capsule = num_capsule
        self.dim_vector = dim_vector
        self.num_routing = num_routing
        self.kernel_initializer = initializers.get(kernel_initializer)
        self.bias_initializer = initializers.get(bias_initializer)

    def build(self, input_shape):
        assert len(input_shape) >= 3, "The input Tensor should have shape=[None, input_num_capsule, input_dim_vector]"
        self.input_num_capsule = input_shape[1]
        self.input_dim_vector = input_shape[2]

        # Transform matrix
        self.W = self.add_weight(shape=[self.input_num_capsule, self.num_capsule, self.input_dim_vector, self.dim_vector],
                                 initializer=self.kernel_initializer,
                                 name='W')
        print("the weight size in capsule layer", self.W)

        # Coupling coefficient. The redundant dimensions are just to facilitate subsequent matrix calculation.
        self.bias = self.add_weight(shape=[1, self.input_num_capsule, self.num_capsule, 1, 1],
                                    initializer=self.bias_initializer,
                                    name='bias',
                                    trainable=False)
        self.built = True

    def call(self, inputs, training=None):
        inputs_expand = K.expand_dims(K.expand_dims(inputs, 2), 2)

        inputs_tiled = K.tile(inputs_expand, [1, 1, self.num_capsule, 1, 1])
        print("call size inputs_tiled", inputs_tiled)

        # Compute `inputs * W` by scanning inputs_tiled on dimension 0. This is faster but requires Tensorflow.
        # inputs_hat.shape = [None, input_num_capsule, num_capsule, 1, dim_vector]
        # (axes pairs tried: [3, 2] and [4, 3])
        inputs_hat = tf.scan(lambda ac, x: K.batch_dot(x, self.W, axes=[3,2]),
                             elems=inputs_tiled,
                             initializer=K.zeros([self.input_num_capsule, self.num_capsule, 1, self.dim_vector]))
        print("result of inputs_hat", inputs_hat)

        # Routing algorithm V2. Use iteration. V2 and V1 both work without much difference on performance
        assert self.num_routing > 0, 'The num_routing should be > 0.'
        for i in range(self.num_routing):
            c = tf.nn.softmax(self.bias, dim=2)  # dim=2 is the num_capsule dimension
            # outputs.shape=[None, 1, num_capsule, 1, dim_vector]
            outputs = squash(K.sum(c * inputs_hat, 1, keepdims=True))
            print("size after squash:", outputs)

            # The last iteration does not need to update the bias, since it is not used again.
            if i != self.num_routing - 1:
                # self.bias = K.update_add(self.bias, K.sum(inputs_hat * outputs, [0, -1], keepdims=True))
                self.bias = tf.assign_add(self.bias, K.sum(inputs_hat * outputs, -1, keepdims=True))
                # self.bias = self.bias + K.sum(inputs_hat * outputs, -1, keepdims=True)
            # tf.summary.histogram('BigBee', self.bias)  # for debugging
        return K.reshape(outputs, [-1, self.num_capsule, self.dim_vector])

    def compute_output_shape(self, input_shape):
        print("the output shape of capslayer is:", tuple([None, self.num_capsule, self.dim_vector]))
        return tuple([None, self.num_capsule, self.dim_vector])
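
Note: call() uses a squash function that is not defined in the snippet above. Assuming the code follows the original CapsNet-Keras implementation, this is the squashing nonlinearity from Sabour et al. (2017), which scales a vector to a length between 0 and 1 while preserving its direction:

def squash(vectors, axis=-1):
    # v * ||v||^2 / ((1 + ||v||^2) * ||v||); K.epsilon() guards against division by zero
    s_squared_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_squared_norm / (1 + s_squared_norm) / K.sqrt(s_squared_norm + K.epsilon())
    return scale * vectors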

Tags: python, tensorflow, keras, conv-neural-network

Solution


I ran into the same problem on one machine but not on another. After comparing the environments I found two differences: the failing machine had TensorFlow 2.1 and Keras 2.3, while the working environment had TensorFlow 1.15.0 and Keras 2.2.4.

First, I downgraded TensorFlow, which did not help.

Then I downgraded Keras, and the problem was solved. So my conclusion is that Keras 2.3 broke this functionality.
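The most direct fix is therefore to pin the working versions, e.g. pip install tensorflow==1.15.0 keras==2.2.4, matching the environment above. If downgrading is not an option, a version-independent workaround (my suggestion, not part of the original answer) is to bypass K.batch_dot and call tf.matmul directly, which is effectively what Keras <= 2.2.4 compiled this batch_dot call into:

# Drop-in replacement for the tf.scan line in CapsuleLayer.call().
# tf.matmul batches over all leading dimensions, so each step computes
# (1, 8) @ (8, 16) -> (1, 16), giving (1152, 10, 1, 16) on every Keras version.
inputs_hat = tf.scan(lambda ac, x: tf.matmul(x, self.W),
                     elems=inputs_tiled,
                     initializer=K.zeros([self.input_num_capsule, self.num_capsule, 1, self.dim_vector]))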

