BatchNormalization nodes erroneously linked to each other

Problem description

I am training a Keras network that uses BatchNormalization layers, and I noticed something strange when looking at the graph in TensorBoard. My network consists of a stack of 1D convolution and BatchNormalization layers. Most of the graph looks fine, but according to TensorBoard the first BatchNormalization layer is sending information to every other BatchNormalization layer. Is this normal?

Here is the network as reported by Keras' model.summary():

| Layer (type)                    | Output Shape      | Param # | Connected to        |
|---------------------------------|-------------------|---------|---------------------|
| pt_cloud_0 (InputLayer)         | (None, None, 39)  | 0       |                     |
| pt_cloud_1 (InputLayer)         | (None, None, 39)  | 0       |                     |
| conv1d_0_0 (Conv1D)             | (None, None, 64)  | 2560    | pt_cloud_0[0][0]    |
| conv1d_1_0 (Conv1D)             | (None, None, 64)  | 2560    | pt_cloud_1[0][0]    |
| batchnorm_0_0 (BatchNormalizati | (None, None, 64)  | 256     | conv1d_0_0[0][0]    |
| batchnorm_1_0 (BatchNormalizati | (None, None, 64)  | 256     | conv1d_1_0[0][0]    |
| conv1d_0_1 (Conv1D)             | (None, None, 64)  | 4160    | batchnorm_0_0[0][0] |
| conv1d_1_1 (Conv1D)             | (None, None, 64)  | 4160    | batchnorm_1_0[0][0] |
| batchnorm_0_1 (BatchNormalizati | (None, None, 64)  | 256     | conv1d_0_1[0][0]    |
| batchnorm_1_1 (BatchNormalizati | (None, None, 64)  | 256     | conv1d_1_1[0][0]    |
| conv1d_0_2 (Conv1D)             | (None, None, 316) | 20540   | batchnorm_0_1[0][0] |
| conv1d_1_2 (Conv1D)             | (None, None, 316) | 20540   | batchnorm_1_1[0][0] |
| batchnorm_0_2 (BatchNormalizati | (None, None, 316) | 1264    | conv1d_0_2[0][0]    |
| batchnorm_1_2 (BatchNormalizati | (None, None, 316) | 1264    | conv1d_1_2[0][0]    |
| conv1d_0_3 (Conv1D)             | (None, None, 316) | 100172  | batchnorm_0_2[0][0] |
| conv1d_1_3 (Conv1D)             | (None, None, 316) | 100172  | batchnorm_1_2[0][0] |
| aux_in (InputLayer)             | (None, 46)        | 0       |                     |
| batchnorm_0_3 (BatchNormalizati | (None, None, 316) | 1264    | conv1d_0_3[0][0]    |
| batchnorm_1_3 (BatchNormalizati | (None, None, 316) | 1264    | conv1d_1_3[0][0]    |
| aux_dense_0 (Dense)             | (None, 384)       | 18048   | aux_in[0][0]        |
| global_max_0 (GlobalMaxPooling1 | (None, 316)       | 0       | batchnorm_0_3[0][0] |
| global_max_1 (GlobalMaxPooling1 | (None, 316)       | 0       | batchnorm_1_3[0][0] |
| aux_dense_1 (Dense)             | (None, 384)       | 147840  | aux_dense_0[0][0]   |
| concatenate_1 (Concatenate)     | (None, 1016)      | 0       | global_max_0[0][0]  |
|                                 |                   |         | global_max_1[0][0]  |
|                                 |                   |         | aux_dense_1[0][0]   |
| dense_0 (Dense)                 | (None, 384)       | 390528  | concatenate_1[0][0] |
| dropout_0 (Dropout)             | (None, 384)       | 0       | dense_0[0][0]       |
| dense_1 (Dense)                 | (None, 384)       | 147840  | dropout_0[0][0]     |
| prediction (Dense)              | (None, 101)       | 38885   | dense_1[0][0]       |

Here is (part of) the graph as TensorBoard displays it (if the image is not visible, see this link: https://imgur.com/a/G74uIWE). A zoomed-in version is at this link: https://imgur.com/a/vtF3VWb

The layer outlined in red is the first batch normalization layer I create in the network (batchnorm_0_0). I do not know much about the inner workings of batch normalization, but I find it strange that this layer is linked to all the other BN layers, while the other BN layers are not (they are only connected to the inputs/outputs I assigned to them). I am wondering whether this is a bug in my code, in Keras, or in TensorBoard.
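One way to double-check, independently of how TensorBoard renders the graph, is to print the connectivity that Keras itself records for the model. A minimal sketch, assuming `model` is the built functional model and the nested-list `inbound_nodes` layout used by Keras 2.x `get_config()`:

```python
# Sketch: for every layer, list the inbound layers Keras has actually wired to it.
# Assumes Keras 2.x, where each entry of 'inbound_nodes' is a list of
# [layer_name, node_index, tensor_index, kwargs] items.
for layer_conf in model.get_config()['layers']:
    inbound = [item[0]  # item[0] is the inbound layer's name
               for node in layer_conf['inbound_nodes']
               for item in node]
    print('{:<35} <- {}'.format(layer_conf['name'], inbound))
```

If no batchnorm_* layer lists another BN layer as inbound here, the extra edges exist only in TensorBoard's drawing of the graph.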

Update: the model code is below. It is written so that I can easily experiment with the number of convolution layers/filters and so on, but it should be fairly self-explanatory.

# Required imports (module level), assuming the standalone Keras package:
#   from keras.models import Model
#   from keras.layers import (Input, Convolution1D, BatchNormalization,
#                             GlobalMaxPooling1D, Dense, Dropout, Concatenate)
def _build(self, conv_filter_counts, dense_counts, dense_dropout_rates=None):
    """
    Builds the model. The model will have the following architecture:
      (1) [Per pointcloud] N 1D convolution layers (with possibly different depths) followed by BatchNormalization
                           layers.
      (2) [Per pointcloud] A global max pooling layer (calculating a 'global feature' of the point cloud).
      (3) [Once] M dense layers (with possibly different numbers of neurons), optionally followed by Dropout layers.
      (4) [Once] A final dense layer with `self.class_count` neurons and softmax activation.

    Arguments:
      conv_filter_counts: A list (length N) containing the successive 1D convolution filter depths in (1)
      dense_counts: A list (length M) containing the number of neurons in each successive dense layer in (3)
      dense_dropout_rates: Optional. If specified, must be a list of length M containing the dropout rates
                           for each corresponding dense layer specified by `dense_counts`. Individual entries
                           can be set to None to disable dropout.
                           If not specified, dropout is applied nowhere.
    """
    inputs = [Input(shape=(None, self.pt_dim), name='pt_cloud_{}'.format(i)) for i in range(self.input_count)]
    if self.aux_input_count > 0:
        aux_input = Input(shape=(self.aux_input_count,), name='aux_in')

    if self.spatial_subnet:
        # Predict and apply spatial transform for each pointcloud.
        spatial_transforms = [transform_subnet(i, [64, 128, 256], [256, 64]) for i in inputs]
        inputs_tr = [apply_transform_layer(i, tr, self.pt_dim) for i, tr in zip(inputs, spatial_transforms)]
    else:
        inputs_tr = inputs

    global_feats = []
    for i, input_pts in enumerate(inputs_tr):
        x = input_pts

        # Convolution stack
        for j, c in enumerate(conv_filter_counts):
            x = Convolution1D(c, 1, activation='relu', name='conv1d_{}_{}'.format(i, j))(x)
            x = BatchNormalization(name='batchnorm_{}_{}'.format(i, j))(x)

        global_feats += [GlobalMaxPooling1D(name='global_max_{}'.format(i))(x)]

    # Concatenate features and possibly auxiliary input
    if self.aux_input_count > 0:
        x = aux_input

        # Create a dense subnetwork just for the auxiliary input.
        # Iterate over dense_counts directly: dense_dropout_rates may still be
        # None at this point (its default is filled in below), and this
        # subnetwork does not apply dropout anyway.
        for i, c in enumerate(dense_counts):
            x = Dense(c, activation='relu', name='aux_dense_{}'.format(i))(x)

        x = Concatenate()(global_feats + [x])
    elif len(global_feats) > 1:
        x = Concatenate()(global_feats)
    else:
        x = global_feats[0]

    # Dense stack with optional dropout
    if dense_dropout_rates is None:
        dense_dropout_rates = [None] * len(dense_counts)

    for i, (c, d) in enumerate(zip(dense_counts, dense_dropout_rates)):
        x = Dense(c, activation='relu', name='dense_{}'.format(i))(x)
        if d is not None:
            x = Dropout(rate=d, name='dropout_{}'.format(i))(x)

    # Final prediction
    prediction = Dense(self.class_count, activation='softmax', name='prediction')(x)

    # Link all up in a model
    if self.aux_input_count > 0:
        inputs.append(aux_input)

    if len(inputs) == 1:
        inputs = inputs[0]

    return Model(inputs=inputs, outputs=prediction)
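For reference, the arguments that would reproduce the summary above can be read off the layer names and parameter counts. A hypothetical reconstruction (the `PointNetClassifier` wrapper class and its constructor are assumptions, since only `_build()` is shown in the question, and the actual dropout rate is not visible in the summary):

```python
# Hypothetical: values inferred from the printed summary:
#   pt_dim=39, input_count=2, aux_input_count=46, class_count=101,
#   spatial_subnet=False (no transform layers appear in the summary).
net = PointNetClassifier(pt_dim=39, input_count=2, aux_input_count=46,
                         class_count=101, spatial_subnet=False)
model = net._build(
    conv_filter_counts=[64, 64, 316, 316],  # conv1d_*_0 .. conv1d_*_3
    dense_counts=[384, 384],                # dense_0/dense_1 and aux_dense_0/1
    dense_dropout_rates=[0.5, None],        # dropout_0 only; 0.5 is a placeholder
)
model.summary()
```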

Kind regards,

Steven

Tags: keras, tensorboard

Solution


To answer my own question, echoing @Mike's cautious answer: I think (hope?) this is indeed a bug on the TensorBoard side, as I cannot explain it any other way.

I also plotted the architecture using keras.utils.plot_model, which likewise does not show any links between the BatchNormalization layers.
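For completeness, a minimal sketch of that check (the output filename is an assumption):

```python
from keras.utils import plot_model

# Render the model's layer graph to an image file; show_shapes adds the
# input/output shapes so the wiring between layers is easy to verify.
plot_model(model, to_file='model.png', show_shapes=True)
```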

