Why does my simple CNN have extra parameters?



Here is my problem: when I count the parameters in the first block, I see 36,928 parameters (which is what I expected). But when I use this block to build a model in another nn.Module class, there are 1,792 extra parameters and I can't figure out where they come from.

I've put some code below to illustrate.

import torch

class Conv2dBlock(torch.nn.Module):
    def __init__(self, in_filters, out_filters, kernel_size=3):
        super(Conv2dBlock, self).__init__()
        self.conv2d_seq = torch.nn.Sequential()
        for k in range(2):
            self.conv2d_seq.append(torch.nn.Conv2d(in_channels=in_filters, out_channels=out_filters, kernel_size=kernel_size, padding='same'))
            self.conv2d_seq.append(torch.nn.ReLU())
            in_filters = out_filters

    def forward(self, input):
        out = self.conv2d_seq(input)
        return out

Then I use this block inside another nn.Module:

class EncoderBlock(torch.nn.Module):
    def __init__(self):
        super(EncoderBlock, self).__init__()
        self.conv2d = Conv2dBlock(3, 64)
        self.maxpool = torch.nn.MaxPool2d(kernel_size=2)

    def forward(self, input):
        x = self.conv2d(input)
        p = self.maxpool(x)
        out = torch.nn.functional.dropout(p, 0.3)
        return x, out

Finally:

class UNet_model(torch.nn.Module):
    def __init__(self):
        super(UNet_model, self).__init__()
        self.encoder_block1 = EncoderBlock()

    def forward(self, input):
        p1 = self.encoder_block1(input)
        # I removed useless code

        return p1

model = UNet_model()
summary(model, (3, 128, 128))

This last class builds a model with 38,720 parameters instead of 36,928. It seems an extra convolutional layer ((3, 64, (3,3)) = 1,792 params) is being applied to the input twice... I don't understand it.

Could someone take a look?

Thanks!

First, torch.nn.Sequential did not support an append method until recent PyTorch releases; on older versions you should use add_module instead, like this:

for k in range(2):
    self.conv2d_seq.add_module(f"conv_{k}", torch.nn.Conv2d(in_channels=in_filters, out_channels=out_filters, kernel_size=kernel_size, padding='same'))
    self.conv2d_seq.add_module(f"relu_{k}", torch.nn.ReLU())
    in_filters = out_filters
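Alternatively, a version-agnostic sketch of the same loop: collect the layers in a plain Python list and unpack it into the nn.Sequential constructor (variable names as in the question's __init__):

# Inside Conv2dBlock.__init__, equivalent to the loop above
layers = []
for k in range(2):
    layers.append(torch.nn.Conv2d(in_channels=in_filters, out_channels=out_filters,
                                  kernel_size=kernel_size, padding='same'))
    layers.append(torch.nn.ReLU())
    in_filters = out_filters  # second iteration maps out_filters -> out_filters
self.conv2d_seq = torch.nn.Sequential(*layers)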

Second, if you run the torchinfo summary on the initial block, you will see:

==========================================================================================
Layer (type:depth-idx)                   Output Shape              Param #
==========================================================================================
Conv2dBlock                              [1, 64, 64, 64]           --
├─Sequential: 1-1                        [1, 64, 64, 64]           --
│    └─Conv2d: 2-1                       [1, 64, 64, 64]           1,792
│    └─ReLU: 2-2                         [1, 64, 64, 64]           --
│    └─Conv2d: 2-3                       [1, 64, 64, 64]           36,928
│    └─ReLU: 2-4                         [1, 64, 64, 64]           --
==========================================================================================
Total params: 38,720
Trainable params: 38,720
Non-trainable params: 0
Total mult-adds (M): 158.60
==========================================================================================
Input size (MB): 0.05
Forward/backward pass size (MB): 4.19
Params size (MB): 0.15
Estimated Total Size (MB): 4.40
==========================================================================================
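(A sketch of the call that produces a summary like the one above, assuming torchinfo is installed and using an arbitrary 64×64 input:)

from torchinfo import summary

summary(Conv2dBlock(3, 64), input_size=(1, 3, 64, 64))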

So you can see that you have two conv layers (1,792 + 36,928 = 38,720 params), because your for loop creates two Conv2d layers: for k in range(2). The first conv maps 3 → 64 channels, the second maps 64 → 64, and the first one is where the "extra" 1,792 parameters come from.
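To make the arithmetic explicit, here is a minimal sketch using the standard Conv2d parameter formula (out_channels × in_channels × k × k weights plus out_channels biases) and the Conv2dBlock class from the question:

block = Conv2dBlock(3, 64)

# First conv: 3 -> 64 channels, 3x3 kernel
first_conv = 64 * (3 * 3 * 3) + 64      # = 1,792 parameters
# Second conv: 64 -> 64 channels, 3x3 kernel
second_conv = 64 * (64 * 3 * 3) + 64    # = 36,928 parameters
print(first_conv, second_conv, first_conv + second_conv)  # 1792 36928 38720

# Cross-check against the actual module
print(sum(p.numel() for p in block.parameters()))  # 38720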
