Understanding the weights of a convolutional layer



I am trying to do semantic segmentation of magnetic resonance images, which are single-channel images.

To get the encoder from a U-Net network, I use the following function:

from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D

def get_encoder_unet(img_shape, k_init='glorot_uniform', bias_init='zeros'):
    inp = Input(shape=img_shape)
    conv1 = Conv2D(64, (5, 5), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv1_1')(inp)
    conv1 = Conv2D(64, (5, 5), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv1_2')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2), data_format="channels_last", name='pool1')(conv1)

    conv2 = Conv2D(96, (3, 3), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv2_1')(pool1)
    conv2 = Conv2D(96, (3, 3), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv2_2')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2), data_format="channels_last", name='pool2')(conv2)
    conv3 = Conv2D(128, (3, 3), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv3_1')(pool2)
    conv3 = Conv2D(128, (3, 3), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv3_2')(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2), data_format="channels_last", name='pool3')(conv3)
    conv4 = Conv2D(256, (3, 3), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv4_1')(pool3)
    conv4 = Conv2D(256, (4, 4), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv4_2')(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2), data_format="channels_last", name='pool4')(conv4)
    conv5 = Conv2D(512, (3, 3), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv5_1')(pool4)
    conv5 = Conv2D(512, (3, 3), activation='relu', padding='same', data_format="channels_last", kernel_initializer=k_init, bias_initializer=bias_init, name='conv5_2')(conv5)
    return conv5, conv4, conv3, conv2, conv1, inp

Its summary is:

Model: "encoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 200, 200, 1)]     0         
_________________________________________________________________
conv1_1 (Conv2D)             (None, 200, 200, 64)      1664      
_________________________________________________________________
conv1_2 (Conv2D)             (None, 200, 200, 64)      102464    
_________________________________________________________________
pool1 (MaxPooling2D)         (None, 100, 100, 64)      0         
_________________________________________________________________
conv2_1 (Conv2D)             (None, 100, 100, 96)      55392     
_________________________________________________________________
conv2_2 (Conv2D)             (None, 100, 100, 96)      83040     
_________________________________________________________________
pool2 (MaxPooling2D)         (None, 50, 50, 96)        0         
_________________________________________________________________
conv3_1 (Conv2D)             (None, 50, 50, 128)       110720    
_________________________________________________________________
conv3_2 (Conv2D)             (None, 50, 50, 128)       147584    
_________________________________________________________________
pool3 (MaxPooling2D)         (None, 25, 25, 128)       0         
_________________________________________________________________
conv4_1 (Conv2D)             (None, 25, 25, 256)       295168    
_________________________________________________________________
conv4_2 (Conv2D)             (None, 25, 25, 256)       1048832   
_________________________________________________________________
pool4 (MaxPooling2D)         (None, 12, 12, 256)       0         
_________________________________________________________________
conv5_1 (Conv2D)             (None, 12, 12, 512)       1180160   
_________________________________________________________________
conv5_2 (Conv2D)             (None, 12, 12, 512)       2359808   
=================================================================
Total params: 5,384,832
Trainable params: 5,384,832
Non-trainable params: 0
_________________________________________________________________
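As a sanity check, the parameter counts in the summary above can be reproduced by hand: a Conv2D layer with a (kh, kw) kernel, c_in input channels and c_out filters has kh*kw*c_in*c_out weights plus c_out biases. A minimal sketch in plain Python (the helper name `conv2d_params` is made up for illustration):

```python
def conv2d_params(kh, kw, c_in, c_out):
    # weights: one kh x kw filter slice per input channel, per output filter,
    # plus one bias per output filter
    return kh * kw * c_in * c_out + c_out

# conv1_1: 5x5 kernels, 1 input channel (grayscale MRI), 64 filters
print(conv2d_params(5, 5, 1, 64))     # 1664, matches the summary
# conv1_2: 5x5 kernels, 64 input channels, 64 filters
print(conv2d_params(5, 5, 64, 64))    # 102464
# conv5_2: 3x3 kernels, 512 input channels, 512 filters
print(conv2d_params(3, 3, 512, 512))  # 2359808
```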

I am trying to understand how the neural network works, and I have this code to display the shape of the last layer's weights and biases.

layer_dict = dict([(layer.name, layer) for layer in model.layers])
layer_name = model.layers[-1].name
#layer_name = 'conv5_2'
filter_index = 0  # Which filter in this block would you like to visualise?
# Grab the filters and biases for that layer
filters, biases = layer_dict[layer_name].get_weights()
print("Filters")
print("\tType: ", type(filters))
print("\tShape: ", filters.shape)
print("Biases")
print("\tType: ", type(biases))
print("\tShape: ", biases.shape)

With this output:

Filters
	Type:  <class 'numpy.ndarray'>
	Shape:  (3, 3, 512, 512)
Biases
	Type:  <class 'numpy.ndarray'>
	Shape:  (512,)

I am trying to understand what the filters' shape (3, 3, 512, 512) means. I think the last 512 is the number of filters in this layer, but I don't understand what (3, 3, 512) means. My image has a single channel (img_shape is (200, 200, 1)), so I don't understand the 3, 3 in the filter shape.

I think the last 512 is the number of filters in this layer, but what does (3, 3, 512) mean?

It refers to the overall size of the filters: they are themselves 3D. As input to conv5_2 you have a [batch, height', width', channels] tensor. In your case, the filter size per channel is 3*3: you take each 3x3 region of conv5_2's input, apply a 3x3 filter to it, and get 1 value as output (see the animation). But these 3x3 filters are different for each channel (512 in your case) (see the illustration for 1 channel). After all, you want to perform this Conv2D number_of_filters times, so you need 512 filters, each of size 3x3x512.
Here is a good article for building intuition about CNN architectures and Conv2D (see part 2).
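The per-position computation described above can be sketched in NumPy (random data; shapes chosen to match conv5_2's input, with the batch dimension omitted for brevity):

```python
import numpy as np

h, w, c_in, c_out = 12, 12, 512, 512
x = np.random.rand(h, w, c_in)               # conv5_2's input feature map
filters = np.random.rand(3, 3, c_in, c_out)  # Keras layout: (kh, kw, in_channels, out_channels)
biases = np.random.rand(c_out)

# One output value = element-wise product of a 3x3x512 input patch with
# one 3x3x512 filter, summed over all three axes, plus that filter's bias.
patch = x[0:3, 0:3, :]                       # top-left 3x3 region, all 512 channels
out_00_filter0 = np.sum(patch * filters[:, :, :, 0]) + biases[0]

# Repeating this for every spatial position and every one of the 512 filters
# yields the (12, 12, 512) output that padding='same' produces.
```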
