PyTorch model won't learn the identity function?

lc8prwob · asked on 2023-10-20

I have written several models in PyTorch that fail to learn anything, even after many epochs. To debug the problem, I built a minimal model that should learn the identity function on its input. Frustratingly, even after more than 50,000 epochs of training, this model learns nothing either:

import torch
import torch.nn as nn

torch.manual_seed(1)

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.input = nn.Linear(2,4)
        self.hidden = nn.Linear(4,4)
        self.output = nn.Linear(4,2)
        self.relu = nn.ReLU()
        self.softmax = nn.Softmax(dim=1)
        self.dropout = nn.Dropout(0.5)
    def forward(self,x):
        x = self.input(x)
        x = self.dropout(x)
        x = self.relu(x)
        x = self.hidden(x)
        x = self.dropout(x)
        x = self.relu(x)
        x = self.output(x)
        x = self.softmax(x)
        return x

X = torch.tensor([[1,0],[1,0],[0,1],[0,1]],dtype=torch.float)

net = Net()

criterion = nn.CrossEntropyLoss()

opt = torch.optim.Adam(net.parameters(), lr=0.001)

for i in range(100000):
    opt.zero_grad()
    y = net(X)
    loss = criterion(y,torch.argmax(X,dim=1))
    loss.backward()
    if i%500 ==0:
        print("Epoch: ",i)
        print(torch.argmax(y,dim=1).detach().numpy().tolist())
        print("Loss: ",loss.item())
        print()

Output

Epoch:  52500
[0, 0, 1, 0]
Loss:  0.6554909944534302

Epoch:  53000
[0, 0, 0, 0]
Loss:  0.7004914283752441

Epoch:  53500
[0, 0, 0, 0]
Loss:  0.7156486511230469

Epoch:  54000
[0, 0, 0, 0]
Loss:  0.7171240448951721

Epoch:  54500
[0, 0, 0, 0]
Loss:  0.691678524017334

Epoch:  55000
[0, 0, 0, 0]
Loss:  0.7301554679870605

Epoch:  55500
[0, 0, 0, 0]
Loss:  0.728650689125061

What is wrong with my implementation?

u3r8eeie #1

There are several mistakes:
1. Missing optimizer.step()
optimizer.step() updates the parameters based on the gradients gathered by backpropagation, together with any accumulated state such as momentum. Without it, loss.backward() fills in the gradients but the weights never change, so the loss merely fluctuates with the dropout noise.
2. Softmax combined with CrossEntropyLoss
PyTorch's CrossEntropyLoss criterion combines nn.LogSoftmax() and nn.NLLLoss() in a single class; that is, it applies a softmax itself and then takes the negative log. So in your case you end up computing softmax(softmax(output)). The correct approach is to keep a plain linear output layer for training, and apply a softmax layer (or simply argmax) only for prediction. The snippet after this list demonstrates both points.
3. High dropout rate on a small network
This leads to underfitting (see the note on train()/eval() mode at the end of this answer).
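
Two quick checks make mistakes 1 and 2 concrete. The following is a minimal sketch with toy tensors (not your network), verifying that backward() alone never changes the weights, and that CrossEntropyLoss really is LogSoftmax followed by NLLLoss:

import torch
import torch.nn as nn

torch.manual_seed(0)

# Mistake 1: backward() only fills in .grad; the weights stay untouched
lin = nn.Linear(2,2)
opt = torch.optim.Adam(lin.parameters(), lr=0.1)
before = lin.weight.detach().clone()
lin(torch.randn(4,2)).sum().backward()
print(torch.equal(lin.weight, before))   # True: nothing updated yet
opt.step()
print(torch.equal(lin.weight, before))   # False: step() applied the update

# Mistake 2: CrossEntropyLoss == LogSoftmax + NLLLoss
logits = torch.randn(4,2)
target = torch.tensor([0,0,1,1])
ce = nn.CrossEntropyLoss()(logits, target)
manual = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), target)
print(torch.allclose(ce, manual))        # True

# Feeding softmax probabilities instead of logits confines the inputs of the
# internal softmax to [0,1], so the model can never become confident and the
# loss can never approach 0
probs = nn.Softmax(dim=1)(logits)
print(nn.CrossEntropyLoss()(probs, target).item())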
Here is the corrected code:

import torch
import torch.nn as nn

torch.manual_seed(1)

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.input = nn.Linear(2,4)
        self.hidden = nn.Linear(4,4)
        self.output = nn.Linear(4,2)
        self.relu = nn.ReLU()
        self.softmax = nn.Softmax(dim=1)
        # self.dropout = nn.Dropout(0.0)
    def forward(self,x):
        x = self.input(x)
        # x = self.dropout(x)
        x = self.relu(x)
        x = self.hidden(x)
        # x = self.dropout(x)
        x = self.relu(x)
        x = self.output(x)
        # x = self.softmax(x)
        return x

    def predict(self, x):
        with torch.no_grad():
            out = self.forward(x)
        return self.softmax(out)

X = torch.tensor([[1,0],[1,0],[0,1],[0,1]],dtype=torch.float)

net = Net()

criterion = nn.CrossEntropyLoss()

opt = torch.optim.Adam(net.parameters(), lr=0.001)

for i in range(100000):
    opt.zero_grad()
    y = net(X)
    loss = criterion(y,torch.argmax(X,dim=1))
    loss.backward()
    # This was missing before
    opt.step()
    if i%500 ==0:
        print("Epoch: ",i)
        pred = net.predict(X)
        print(f'prediction: {torch.argmax(pred, dim=1).detach().numpy().tolist()}, actual: {torch.argmax(X,dim=1)}')
        print("Loss: ", loss.item())

Output:

Epoch:  0
prediction: [0, 0, 0, 0], actual: tensor([0, 0, 1, 1])
Loss:  0.7042869329452515
Epoch:  500
prediction: [0, 0, 1, 1], actual: tensor([0, 0, 1, 1])
Loss:  0.1166711300611496
Epoch:  1000
prediction: [0, 0, 1, 1], actual: tensor([0, 0, 1, 1])
Loss:  0.05215628445148468
Epoch:  1500
prediction: [0, 0, 1, 1], actual: tensor([0, 0, 1, 1])
Loss:  0.02993333339691162
Epoch:  2000
prediction: [0, 0, 1, 1], actual: tensor([0, 0, 1, 1])
Loss:  0.01916157826781273
Epoch:  2500
prediction: [0, 0, 1, 1], actual: tensor([0, 0, 1, 1])
Loss:  0.01306679006665945
Epoch:  3000
prediction: [0, 0, 1, 1], actual: tensor([0, 0, 1, 1])
Loss:  0.009280549362301826
.
.
.
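
As an aside on mistake 3: instead of deleting the dropout layers, you can keep them and rely on the train()/eval() mode switch, which is the usual pattern. A minimal sketch of the behavior (toy tensor, just for illustration):

import torch
import torch.nn as nn

torch.manual_seed(0)

drop = nn.Dropout(0.5)
x = torch.ones(2,4)

drop.train()    # training mode: each element zeroed with probability 0.5,
print(drop(x))  # survivors scaled by 1/(1-p)=2 to preserve the expectation

drop.eval()     # eval mode: dropout becomes the identity
print(drop(x))  # all ones, unchanged

With p=0.5 on a four-unit hidden layer, half the network disappears on every forward pass, which is why the original model underfits; for a problem this small, either lower p or remove the dropout entirely.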
