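python - RuntimeError: Given groups=1, weight of size [64, 32, 3, 3], expected input[128, 64, 32, 32] to have 32 channels, but got 64 channels instead
Problem description
I am trying to understand why vanishing and exploding gradients occur, and why ResNet helps so much in avoiding both problems. So I decided to train a plain convolutional network with many layers, just to see why the model's loss increases when I train with many layers (for example, 20). At some point I ran into the error below. I can't figure out what the problem is, but I know it comes from my model architecture.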
images.shape: torch.Size([128, 3, 32, 32])
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-80-0ad7109b33c1> in <module>
1 for images, labels in train_dl:
2 print('images.shape:', images.shape)
----> 3 out = model(images)
4 print('out.shape:', out.shape)
5 print('out[0]:', out[0])
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
725 result = self._slow_forward(*input, **kwargs)
726 else:
--> 727 result = self.forward(*input, **kwargs)
728 for hook in itertools.chain(
729 _global_forward_hooks.values(),
<ipython-input-78-81b21c16ed79> in forward(self, xb)
31
32 def forward(self, xb):
---> 33 return self.network(xb)
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
725 result = self._slow_forward(*input, **kwargs)
726 else:
--> 727 result = self.forward(*input, **kwargs)
728 for hook in itertools.chain(
729 _global_forward_hooks.values(),
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/container.py in forward(self, input)
115 def forward(self, input):
116 for module in self:
--> 117 input = module(input)
118 return input
119
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
725 result = self._slow_forward(*input, **kwargs)
726 else:
--> 727 result = self.forward(*input, **kwargs)
728 for hook in itertools.chain(
729 _global_forward_hooks.values(),
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py in forward(self, input)
421
422 def forward(self, input: Tensor) -> Tensor:
--> 423 return self._conv_forward(input, self.weight)
424
425 class Conv3d(_ConvNd):
/opt/conda/lib/python3.7/site-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight)
418 _pair(0), self.dilation, self.groups)
419 return F.conv2d(input, weight, self.bias, self.stride,
--> 420 self.padding, self.dilation, self.groups)
421
422 def forward(self, input: Tensor) -> Tensor:
RuntimeError: Given groups=1, weight of size [64, 32, 3, 3], expected input[128, 64, 32, 32] to have 32 channels, but got 64 channels instead
My model architecture is:
class Cifar10CnnModel(ImageClassificationBase):
    def __init__(self):
        super().__init__()
        self.network = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.Conv2d(32, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
            nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),  # output: 64 x 16 x 16

            nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),  # output: 128 x 8 x 8

            nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),  # output: 256 x 4 x 4

            nn.Flatten(),
            nn.Linear(256*4*4, 1024),
            nn.ReLU(),
            nn.Linear(1024, 512),
            nn.ReLU(),
            nn.Linear(512, 10))

    def forward(self, xb):
        return self.network(xb)

for images, labels in train_dl:
    print('images.shape:', images.shape)
    out = model(images)
    print('out.shape:', out.shape)
    print('out[0]:', out[0])
    break
Solution
Looking through your model, it seems you made a typo in the 4th conv layer of the sequence. You have
nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
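However, the first of these two layers already converts the image to 64 channels, and you then pass that output to the next conv layer, which is declared to expect 32 input channels. That mismatch causes the error above.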
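The weight size in the error message, [64, 32, 3, 3], reads as [out_channels, in_channels, kernel_height, kernel_width], so the failing layer expects 32 input channels while the layer before it produces 64. A minimal sketch of checking that chain without running the network is shown here. The helper `find_channel_mismatches` and the lists of (in_channels, out_channels) pairs are hypothetical names introduced for illustration; in a real model the same numbers can be read from each `nn.Conv2d` layer's `in_channels` and `out_channels` attributes. ReLU and MaxPool2d are skipped because they do not change the channel count.

```python
# Hypothetical helper, not part of PyTorch: verifies that each conv layer's
# in_channels equals the out_channels of the conv layer before it.
def find_channel_mismatches(conv_specs):
    """Return (index, expected_in, declared_in) for every break in the chain.

    conv_specs is a list of (in_channels, out_channels) pairs in layer order;
    index is 0-based.
    """
    mismatches = []
    for i in range(1, len(conv_specs)):
        prev_out = conv_specs[i - 1][1]   # channels produced by the previous conv
        declared_in = conv_specs[i][0]    # channels this conv was declared to accept
        if declared_in != prev_out:
            mismatches.append((i, prev_out, declared_in))
    return mismatches


# (in_channels, out_channels) of the Conv2d layers from the question, in order.
question_convs = [(3, 32), (32, 32), (32, 64), (32, 64),
                  (64, 128), (128, 128), (128, 256), (256, 256)]

# The same list with the 4th layer's in_channels changed from 32 to 64.
fixed_convs = [(3, 32), (32, 32), (32, 64), (64, 64),
               (64, 128), (128, 128), (128, 256), (256, 256)]

print(find_channel_mismatches(question_convs))  # [(3, 64, 32)]
print(find_channel_mismatches(fixed_convs))     # []
```

The single reported entry, index 3 (the 4th conv layer), says the layer receives 64 channels but was declared with 32, which matches the numbers in the RuntimeError.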
Fix it like this:
self.network = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),
    nn.Conv2d(32, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
    # in_channels changed from 32 to 64 to match the previous layer's output.
    nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2, 2),  # output: 64 x 16 x 16
    # the remaining layers stay as in the question
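To confirm the fix, one option is to push a small dummy batch through the corrected layers one at a time and record each output shape. The following is a sketch under a few assumptions: `trace_shapes` is a hypothetical helper, the network is built as a bare `nn.Sequential` rather than inside `Cifar10CnnModel` (whose base class is not shown in the question), and the batch size of 2 is arbitrary.

```python
import torch
import torch.nn as nn

# The network from the question with the 4th conv layer corrected to 64 input channels.
network = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),
    nn.Conv2d(32, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
    nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2, 2),
    nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2, 2),
    nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2, 2),
    nn.Flatten(),
    nn.Linear(256 * 4 * 4, 1024),
    nn.ReLU(),
    nn.Linear(1024, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)


def trace_shapes(seq, x):
    """Hypothetical helper: run x through seq layer by layer, recording output shapes."""
    shapes = []
    with torch.no_grad():  # no gradients needed for a shape check
        for layer in seq:
            x = layer(x)
            shapes.append((layer.__class__.__name__, tuple(x.shape)))
    return shapes


# Dummy batch shaped like CIFAR-10 input: (batch, channels, height, width).
dummy = torch.zeros(2, 3, 32, 32)
shapes = trace_shapes(network, dummy)
for name, shape in shapes:
    print(name, shape)
```

If a layer's declared input channels did not match the incoming tensor, the loop would raise the same RuntimeError at that layer, and the shapes printed before it would show how far the input got.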
Sarthak Jain