python - Understanding the weights of a convolutional layer
Problem Description
I am trying to perform semantic segmentation on magnetic resonance images, which are single-channel images.
To get the encoder of a U-Net network, I use this function:
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D

def get_encoder_unet(img_shape, k_init='glorot_uniform', bias_init='zeros'):
    """Build the U-Net encoder and return its feature maps (deepest first) plus the input tensor."""
    inp = Input(shape=img_shape)
    conv1 = Conv2D(64, (5, 5), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv1_1')(inp)
    conv1 = Conv2D(64, (5, 5), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv1_2')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2), data_format="channels_last", name='pool1')(conv1)
    conv2 = Conv2D(96, (3, 3), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv2_1')(pool1)
    conv2 = Conv2D(96, (3, 3), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv2_2')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2), data_format="channels_last", name='pool2')(conv2)
    conv3 = Conv2D(128, (3, 3), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv3_1')(pool2)
    conv3 = Conv2D(128, (3, 3), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv3_2')(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2), data_format="channels_last", name='pool3')(conv3)
    conv4 = Conv2D(256, (3, 3), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv4_1')(pool3)
    conv4 = Conv2D(256, (4, 4), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv4_2')(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2), data_format="channels_last", name='pool4')(conv4)
    conv5 = Conv2D(512, (3, 3), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv5_1')(pool4)
    conv5 = Conv2D(512, (3, 3), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv5_2')(conv5)
    return conv5, conv4, conv3, conv2, conv1, inp
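The question doesn't show how the model itself is assembled; a minimal sketch of what it presumably looks like, assuming TensorFlow 2.x Keras, with the (200, 200, 1) input shape and the 'encoder' name taken from the summary below:

from tensorflow.keras.models import Model

# Wrap the encoder tensors into a Model; conv5 is the deepest feature map.
conv5, conv4, conv3, conv2, conv1, inp = get_encoder_unet((200, 200, 1))
model = Model(inputs=inp, outputs=conv5, name='encoder')
model.summary()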
Its summary is:
Model: "encoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_1 (InputLayer) [(None, 200, 200, 1)] 0
_________________________________________________________________
conv1_1 (Conv2D) (None, 200, 200, 64) 1664
_________________________________________________________________
conv1_2 (Conv2D) (None, 200, 200, 64) 102464
_________________________________________________________________
pool1 (MaxPooling2D) (None, 100, 100, 64) 0
_________________________________________________________________
conv2_1 (Conv2D) (None, 100, 100, 96) 55392
_________________________________________________________________
conv2_2 (Conv2D) (None, 100, 100, 96) 83040
_________________________________________________________________
pool2 (MaxPooling2D) (None, 50, 50, 96) 0
_________________________________________________________________
conv3_1 (Conv2D) (None, 50, 50, 128) 110720
_________________________________________________________________
conv3_2 (Conv2D) (None, 50, 50, 128) 147584
_________________________________________________________________
pool3 (MaxPooling2D) (None, 25, 25, 128) 0
_________________________________________________________________
conv4_1 (Conv2D) (None, 25, 25, 256) 295168
_________________________________________________________________
conv4_2 (Conv2D) (None, 25, 25, 256) 1048832
_________________________________________________________________
pool4 (MaxPooling2D) (None, 12, 12, 256) 0
_________________________________________________________________
conv5_1 (Conv2D) (None, 12, 12, 512) 1180160
_________________________________________________________________
conv5_2 (Conv2D) (None, 12, 12, 512) 2359808
=================================================================
Total params: 5,384,832
Trainable params: 5,384,832
Non-trainable params: 0
_________________________________________________________________
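Each "Param #" above follows the Conv2D formula kernel_h * kernel_w * in_channels * filters + filters (the last term being the bias vector). A quick sanity check against a few rows of the summary, using only numbers already present above:

def conv2d_params(kh, kw, in_ch, filters):
    # kh * kw * in_ch weights per filter, plus one bias per filter
    return kh * kw * in_ch * filters + filters

print(conv2d_params(5, 5, 1, 64))     # conv1_1 -> 1664
print(conv2d_params(4, 4, 256, 256))  # conv4_2 -> 1048832
print(conv2d_params(3, 3, 512, 512))  # conv5_2 -> 2359808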
I am trying to understand how the neural network works, and I have this code to display the shape of the weights and biases of the last layer.
layer_dict = dict([(layer.name, layer) for layer in model.layers])
layer_name = model.layers[-1].name
#layer_name = 'conv5_2'
filter_index = 0 # Which filter in this block would you like to visualise?
# Grab the filters and biases for that layer
filters, biases = layer_dict[layer_name].get_weights()
print("Filters")
print("\tType: ", type(filters))
print("\tShape: ", filters.shape)
print("Biases")
print("\tType: ", type(biases))
print("\tShape: ", biases.shape)
Which gives this output:
Filters
Type: <class 'numpy.ndarray'>
Shape: (3, 3, 512, 512)
Biases
Type: <class 'numpy.ndarray'>
Shape: (512,)
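To see the same pattern across the whole encoder, one can loop over every Conv2D layer and print its kernel shape; the third dimension is always the number of input channels of that layer and the fourth the number of filters. A short sketch, assuming model is the encoder model from above:

from tensorflow.keras.layers import Conv2D

for layer in model.layers:
    if isinstance(layer, Conv2D):
        kernel, bias = layer.get_weights()
        # kernel shape: (kernel_h, kernel_w, in_channels, filters)
        print(f"{layer.name}: kernel {kernel.shape}, bias {bias.shape}")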
I am trying to understand what the filters' shape (3, 3, 512, 512) means. I think the last 512 is the number of filters in this layer, but what does (3, 3, 512) mean? My image has only one channel, so I don't understand the 3, 3 in the filters' shape (img_shape is (200, 200, 1)).
Solution
"I think the last 512 is the number of filters in this layer, but what does (3, 3, 512) mean?"
It denotes the overall size of the filters: the filters themselves are 3D. As input, conv5_2 receives a [batch, height', width', channels] tensor. In your case, the filter size per channel is 3*3: you take each 3x3 region of the conv5_2 input, apply a 3x3 filter to it, and obtain 1 value as output (see the animation). But these 3x3 filters are different for each channel (512 in your case; see the figure for the 1-channel case). And since you want to perform the Conv2D operation number_of_filters times, you need 512 filters, each of size 3x3x512.
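To make this concrete, here is a hypothetical NumPy sketch (random data, not the real weights) of what one output position of conv5_2 computes: each 3x3x512 filter is multiplied element-wise with a 3x3x512 input patch and summed down to a single scalar, and the 512 filters together produce the 512 output channels:

import numpy as np

# One 3x3 spatial patch of the conv5_2 input, with all 512 input channels.
patch = np.random.rand(3, 3, 512)
# All 512 filters of conv5_2, each of size 3x3x512 -- the (3, 3, 512, 512) shape above.
filters = np.random.rand(3, 3, 512, 512)
biases = np.random.rand(512)

# Each filter collapses the whole 3x3x512 patch to one number.
out = np.tensordot(patch, filters, axes=([0, 1, 2], [0, 1, 2])) + biases
print(out.shape)  # (512,) -- one value per filter at this spatial position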
A good article for digging into the intuition behind CNN architectures, and Conv2D in particular (see Part 2).