In the model I'm building, I'm trying to improve performance by replacing the Flatten layer with global max pooling.
To check that the shapes line up, I ran a random sample through the layers:
import torch
import torch.nn as nn
import torch.nn.functional as F

test = torch.rand((1, 3, 224, 224))  # [N, C, H, W]

foo = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.BatchNorm2d(32),
    nn.Conv2d(32, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.BatchNorm2d(32),
    nn.MaxPool2d(2),
)
foo2 = nn.Sequential(
    nn.Conv2d(32, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.BatchNorm2d(64),
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.BatchNorm2d(64),
    nn.MaxPool2d(2),
)
foo3 = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.BatchNorm2d(128),
    nn.Conv2d(128, 128, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.BatchNorm2d(128),
    nn.MaxPool2d(2),
)
l1 = nn.Sequential(
    nn.Dropout(0.5),
    nn.Linear(128, 1024),
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(1024, 10),
)

r1 = foo(test)
print(r1.shape)  # torch.Size([1, 32, 112, 112])
r2 = foo2(r1)
print(r2.shape)  # torch.Size([1, 64, 56, 56])
r3 = foo3(r2)
print(r3.shape)  # torch.Size([1, 128, 28, 28])

# apply global max pooling and reshape to [N, C]
flat = F.adaptive_max_pool2d(r3, (1, 1))
ff = flat.reshape(flat.size(0), -1)
print(ff.shape)  # torch.Size([1, 128])

res = l1(ff)
print(res.shape)  # torch.Size([1, 10])
Everything here seems to work as expected.
My model class has these same layers, and its forward method looks like this:
def forward(self, batch: torch.Tensor) -> torch.Tensor:
    r1 = self.conv1(batch)
    r2 = self.conv2(r1)
    r3 = self.conv3(r2)
    tmp = F.adaptive_max_pool2d(r3, (1, 1))
    flat = r3.view(tmp.size(0), -1)
    out = self.linear(flat)
    return out
Unfortunately, when I try to run actual images through it (the Fashion MNIST dataset), I get the error: mat1 and mat2 shapes cannot be multiplied (128x2048 and 128x1024).
My batch size is 128, but I have no idea where the 2048 could be coming from; none of my layers should output anything of that shape.
The full error message is below:
RuntimeError Traceback (most recent call last)
/root/fashion_mnist.ipynb Cell 7 in <cell line: 1>()
----> 1 runner.train_model(epochs=80, batch_size=128, criterion=loss_fn, optimizer=optim)
/root/fashion_mnist.ipynb Cell 7 in RunModel.train_model(self, epochs, batch_size, criterion, optimizer, device)
113 t_ep = datetime.now()
115 # run train routine
--> 116 train_loss, train_acc = self._run_train(train_loader, criterion, optimizer)
117 self.train_losses[ep] = train_loss
118 self.train_accuracies[ep] = train_acc
/root/fashion_mnist.ipynb Cell 7 in RunModel._run_train(self, train_data, criterion, optimizer)
141 inputs, targets = inputs.cuda(), targets.cuda()
142 optimizer.zero_grad()
--> 144 outputs: torch.Tensor = self.model(inputs)
145 loss: torch.Tensor = criterion(outputs, targets)
147 loss.backward()
File /opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py:1186, in Module._call_impl(self, *input, **kwargs)
1182 # If we don't have any hooks, we want to skip the rest of the logic in
1183 # this function, and just call forward.
1184 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1185 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1186 return forward_call(*input, **kwargs)
...
File /opt/conda/lib/python3.8/site-packages/torch/nn/modules/linear.py:114, in Linear.forward(self, input)
113 def forward(self, input: Tensor) -> Tensor:
--> 114 return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (128x2048 and 128x1024)
Any idea what's going on here?
The notebook is available at: https://colab.research.google.com/drive/1QGpSpUCbuDz-dktmLCv_YpG6LZjYZ1TM?usp=sharing
1 Answer
Use Flatten() in the model instead of view(). The root cause is in your forward: flat = r3.view(tmp.size(0), -1) flattens r3, the full 128-channel feature map, rather than the pooled tmp, so the linear layer receives C*H*W features per sample. 2048 = 128 * 4 * 4, which suggests your Fashion MNIST images enter the network at 32 x 32 (three MaxPool2d(2) stages: 32 -> 16 -> 8 -> 4) rather than the 224 x 224 used in your shape test.
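A quick way to see it (a sketch; the 32 x 32 input size is an assumption deduced from the 2048):

import torch
import torch.nn.functional as F

# r3 as it would look for a 32x32 input after three MaxPool2d(2) stages: 32 -> 16 -> 8 -> 4
r3 = torch.rand(128, 128, 4, 4)          # [N, C, H, W]
tmp = F.adaptive_max_pool2d(r3, (1, 1))  # [128, 128, 1, 1]

print(r3.view(tmp.size(0), -1).shape)    # torch.Size([128, 2048])  <- what the buggy line feeds the linear layer
print(tmp.view(tmp.size(0), -1).shape)   # torch.Size([128, 128])   <- what the linear layer expects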
With Flatten() doing the reshape for you, the linear block should look like this:
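A minimal sketch, reusing the sizes from your snippet and the self.linear attribute name from your forward:

self.linear = nn.Sequential(
    nn.Flatten(),  # collapses [N, 128, 1, 1] -> [N, 128]
    nn.Dropout(0.5),
    nn.Linear(128, 1024),
    nn.ReLU(),
    nn.Dropout(0.2),
    nn.Linear(1024, 10),
)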
and the forward function like this:
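Again a sketch; the key change is that the pooled tensor, not r3, is what flows onward:

def forward(self, batch: torch.Tensor) -> torch.Tensor:
    r1 = self.conv1(batch)
    r2 = self.conv2(r1)
    r3 = self.conv3(r2)
    pooled = F.adaptive_max_pool2d(r3, (1, 1))  # [N, 128, 1, 1]
    return self.linear(pooled)                  # nn.Flatten() inside self.linear does the reshape

Since Flatten() now lives inside self.linear, there is no view() call left to grab the wrong tensor.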
I've tested this on colab and it works fine (verified with a model summary and a training run).