Global max pooling in PyTorch: RuntimeError: mat1 and mat2 shapes cannot be multiplied (128x2048 and 128x1024)

Posted by y1aodyip on 2022-11-09 in Other

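In a model I am building, I am trying to improve performance by replacing the Flatten layer with global max pooling.
To check that the shapes line up, I passed a random sample through the network: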

import torch
import torch.nn as nn
import torch.nn.functional as F

test = torch.rand((1, 3, 224, 224))     # [N, C, H, W]

foo = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.Conv2d(32, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.MaxPool2d(2)
        )

foo2 = nn.Sequential(
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(64),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(64),
            nn.MaxPool2d(2)
        )

foo3 = nn.Sequential(
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(128),
            nn.Conv2d(128, 128, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(128),
            nn.MaxPool2d(2)
        )

l1 = nn.Sequential(
            nn.Dropout(0.5),
            nn.Linear(128,  1024),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(1024, 10)
        )

r1 = foo(test)
print(r1.shape)    # torch.Size([1, 32, 112, 112])
r2 = foo2(r1)
print(r2.shape)    # torch.Size([1, 64, 56, 56])
r3 = foo3(r2)
print(r3.shape)    # torch.Size([1, 128, 28, 28])

# applying global max pooling and reshaping the layer to [N, C]

flat = F.adaptive_max_pool2d(r3, (1, 1))
ff = flat.reshape(flat.size(0), -1)

print(ff.shape)    # torch.Size([1, 128])
res = l1(ff)
print(res.shape)   # torch.Size([1, 10])

Everything here seems to work as expected.
My model class has these same layers, with a forward method as follows:

def forward(self, batch: torch.Tensor) -> torch.Tensor:
        r1 = self.conv1(batch)
        r2 = self.conv2(r1)
        r3 = self.conv3(r2)

        tmp = F.adaptive_max_pool2d(r3, (1, 1))
        flat = r3.view(tmp.size(0), -1)

        out = self.linear(flat)

        return out

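Unfortunately, when I try to run actual images (from the Fashion MNIST dataset) through the model, I get the error: mat1 and mat2 shapes cannot be multiplied (128x2048 and 128x1024)
My batch size is 128, but I have no idea where the 2048 could come from. None of my layers should output anything of that shape.
The full error message is as follows: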

RuntimeError                              Traceback (most recent call last)
/root/fashion_mnist.ipynb Cell 7 in <cell line: 1>()
----> 1 runner.train_model(epochs=80, batch_size=128, criterion=loss_fn, optimizer=optim)

/root/fashion_mnist.ipynb Cell 7 in RunModel.train_model(self, epochs, batch_size, criterion, optimizer, device)
    113 t_ep = datetime.now()
    115 # run train routine
--> 116 train_loss, train_acc = self._run_train(train_loader, criterion, optimizer)   
    117 self.train_losses[ep] = train_loss
    118 self.train_accuracies[ep] = train_acc

/root/fashion_mnist.ipynb Cell 7 in RunModel._run_train(self, train_data, criterion, optimizer)
    141 inputs, targets = inputs.cuda(), targets.cuda()
    142 optimizer.zero_grad()
--> 144 outputs: torch.Tensor = self.model(inputs)
    145 loss: torch.Tensor = criterion(outputs, targets)          
    147 loss.backward()

File /opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py:1186, in Module._call_impl(self, *input, **kwargs)
   1182 # If we don't have any hooks, we want to skip the rest of the logic in
   1183 # this function, and just call forward.
   1184 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1185         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1186     return forward_call(*input, **kwargs)
...
File /opt/conda/lib/python3.8/site-packages/torch/nn/modules/linear.py:114, in Linear.forward(self, input)
    113 def forward(self, input: Tensor) -> Tensor:
--> 114     return F.linear(input, self.weight, self.bias)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (128x2048 and 128x1024)

Do you know what is going on here?
The notebook is available at: https://colab.research.google.com/drive/1QGpSpUCbuDz-dktmLCv_YpG6LZjYZ1TM?usp=sharing

mf98qq94 #1

Use Flatten() inside the layer stack instead of view(). The linear block should then look like this:

self.linear = nn.Sequential(
        nn.Flatten(),
        nn.Dropout(0.5),
        nn.Linear(128,  1024),
        nn.ReLU(),
        nn.Dropout(0.2),
        nn.Linear(1024, 10)
    )

The forward function then looks like this:

def forward(self, batch: torch.Tensor) -> torch.Tensor:
    r1 = self.conv1(batch)
    r2 = self.conv2(r1)
    r3 = self.conv3(r2)

    tmp = F.adaptive_max_pool2d(r3, (1, 1))

    out = self.linear(tmp)

    return out
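
For reference, the 2048 in the error message can be reproduced in isolation. The sketch below assumes a batch of 128 and a conv3 output of shape [128, 128, 4, 4], which is what the summary output in this answer implies for 32x32 inputs. The tensor r3 is random stand-in data, and the names pooled, buggy, fixed, and error_message are illustrative only. In the question's forward, view() is called on r3 rather than on the pooled tmp, so the pooling result is never used and 128 * 4 * 4 = 2048 features reach the first Linear layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for the conv3 output on a batch of 128 images (assumed shape, random values).
r3 = torch.rand(128, 128, 4, 4)

# Global max pooling keeps one value per channel: [128, 128, 1, 1].
pooled = F.adaptive_max_pool2d(r3, (1, 1))

# As in the question: view() is applied to r3, so the pooled tensor is ignored.
buggy = r3.view(pooled.size(0), -1)
print(buggy.shape)    # torch.Size([128, 2048]), since 128 * 4 * 4 = 2048

# Flattening the pooled tensor instead gives the intended [N, C] shape.
fixed = nn.Flatten()(pooled)
print(fixed.shape)    # torch.Size([128, 128])

linear = nn.Linear(128, 1024)

# The un-pooled features do not match in_features=128, so the matmul fails.
error_message = None
try:
    linear(buggy)
except RuntimeError as err:
    error_message = str(err)
    print(error_message)

# The pooled and flattened features pass through without error.
out = linear(fixed)
print(out.shape)      # torch.Size([128, 1024])
```

Under these assumptions, calling view() or reshape() on the pooled tensor (for example tmp.view(tmp.size(0), -1)) should work just as well as nn.Flatten(). What matters is that the tensor being flattened is the pooled one.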

I have tested this on Colab and it works fine.
Here is the summary output:

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 32, 32, 32]             896
              ReLU-2           [-1, 32, 32, 32]               0
       BatchNorm2d-3           [-1, 32, 32, 32]              64
            Conv2d-4           [-1, 32, 32, 32]           9,248
              ReLU-5           [-1, 32, 32, 32]               0
       BatchNorm2d-6           [-1, 32, 32, 32]              64
         MaxPool2d-7           [-1, 32, 16, 16]               0
            Conv2d-8           [-1, 64, 16, 16]          18,496
              ReLU-9           [-1, 64, 16, 16]               0
      BatchNorm2d-10           [-1, 64, 16, 16]             128
           Conv2d-11           [-1, 64, 16, 16]          36,928
             ReLU-12           [-1, 64, 16, 16]               0
      BatchNorm2d-13           [-1, 64, 16, 16]             128
        MaxPool2d-14             [-1, 64, 8, 8]               0
           Conv2d-15            [-1, 128, 8, 8]          73,856
             ReLU-16            [-1, 128, 8, 8]               0
      BatchNorm2d-17            [-1, 128, 8, 8]             256
           Conv2d-18            [-1, 128, 8, 8]         147,584
             ReLU-19            [-1, 128, 8, 8]               0
      BatchNorm2d-20            [-1, 128, 8, 8]             256
        MaxPool2d-21            [-1, 128, 4, 4]               0
          Flatten-22                  [-1, 128]               0
          Dropout-23                  [-1, 128]               0
           Linear-24                 [-1, 1024]         132,096
             ReLU-25                 [-1, 1024]               0
          Dropout-26                 [-1, 1024]               0
           Linear-27                   [-1, 10]          10,250
================================================================
Total params: 430,250
Trainable params: 430,250
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 2.76
Params size (MB): 1.64
Estimated Total Size (MB): 4.41
----------------------------------------------------------------
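
The shapes and parameter counts in this summary can be cross-checked by hand. The sketch below is plain Python arithmetic, not PyTorch. The helper names conv_block and trace are hypothetical, and it assumes 3x3 convolutions with padding=1, 2x2 max pooling, and the 3x32x32 input that the first rows of the summary imply. Note that F.adaptive_max_pool2d is a functional call rather than a module, which is presumably why it has no row of its own between MaxPool2d-21 and Flatten-22.

```python
def conv_block(c_in, c_out, height, width):
    """Output shape and parameter count of one Conv-ReLU-BN-Conv-ReLU-BN-MaxPool block."""
    conv_a = c_in * c_out * 3 * 3 + c_out      # 3x3 kernel weights + bias
    conv_b = c_out * c_out * 3 * 3 + c_out
    batch_norms = 2 * (2 * c_out)              # weight + bias for each BatchNorm2d
    # padding=1 keeps height/width; MaxPool2d(2) halves them (floor division)
    return (c_out, height // 2, width // 2), conv_a + conv_b + batch_norms


def trace(height, width, channels=3):
    """Run the three blocks (32, 64, 128 output channels) over an input shape."""
    shape, params = (channels, height, width), 0
    for c_out in (32, 64, 128):
        shape, block_params = conv_block(shape[0], c_out, shape[1], shape[2])
        params += block_params
    return shape, params


shape, conv_params = trace(32, 32)
flattened = shape[0] * shape[1] * shape[2]          # what view() on r3 produces
head_params = (128 * 1024 + 1024) + (1024 * 10 + 10)

print(shape)                      # (128, 4, 4)
print(flattened)                  # 2048
print(conv_params + head_params)  # 430250
print(trace(224, 224)[0])         # (128, 28, 28)
```

The 32x32 case matches the MaxPool2d-21 row and the total of 430,250 parameters, and the 224x224 case matches the [1, 128, 28, 28] shape printed in the question. Flattening the 4x4 feature map directly yields the 2048 seen in the error, whereas global max pooling reduces it to 128.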

Training output:

Epoch 1/80 completed in 0:00:32.994402. Train_loss:  1.0680, train accuracy:  0.6225 Test loss:  1.0435, test accuracy:  0.6271
Epoch 2/80 completed in 0:00:32.939861. Train_loss:  0.9726, train accuracy:  0.6578 Test loss:  0.9616, test accuracy:  0.6662
Epoch 3/80 completed in 0:00:32.811203. Train_loss:  0.9015, train accuracy:  0.6851 Test loss:  0.9015, test accuracy:  0.6883
Epoch 4/80 completed in 0:00:32.836747. Train_loss:  0.8361, train accuracy:  0.7119 Test loss:  0.8336, test accuracy:  0.7173
