machine-learning - PyTorch model does not learn the identity function?

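Problem description

I wrote some models in PyTorch that fail to learn anything, even after many epochs. To debug the problem, I built a simple model that should learn the identity function of its input. The trouble is that this model does not learn anything either, even after training for 50k epochs: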

import torch
import torch.nn as nn

torch.manual_seed(1)

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.input = nn.Linear(2,4)
        self.hidden = nn.Linear(4,4)
        self.output = nn.Linear(4,2)
        self.relu = nn.ReLU()
        self.softmax = nn.Softmax(dim=1)
        self.dropout = nn.Dropout(0.5)
    def forward(self,x):
        x = self.input(x)
        x = self.dropout(x)
        x = self.relu(x)
        x = self.hidden(x)
        x = self.dropout(x)
        x = self.relu(x)
        x = self.output(x)
        x = self.softmax(x)
        return x


X = torch.tensor([[1,0],[1,0],[0,1],[0,1]],dtype=torch.float)

net = Net()

criterion = nn.CrossEntropyLoss()

opt = torch.optim.Adam(net.parameters(), lr=0.001)


for i in range(100000):
    opt.zero_grad()
    y = net(X)
    loss = criterion(y,torch.argmax(X,dim=1))
    loss.backward()
    if i%500 ==0:
        print("Epoch: ",i)
        print(torch.argmax(y,dim=1).detach().numpy().tolist())
        print("Loss: ",loss.item())
        print()

Output

Epoch:  52500
[0, 0, 1, 0]
Loss:  0.6554909944534302

Epoch:  53000
[0, 0, 0, 0]
Loss:  0.7004914283752441

Epoch:  53500
[0, 0, 0, 0]
Loss:  0.7156486511230469

Epoch:  54000
[0, 0, 0, 0]
Loss:  0.7171240448951721

Epoch:  54500
[0, 0, 0, 0]
Loss:  0.691678524017334

Epoch:  55000
[0, 0, 0, 0]
Loss:  0.7301554679870605

Epoch:  55500
[0, 0, 0, 0]
Loss:  0.728650689125061

What is wrong with my implementation?

Tags: machine-learning, deep-learning, neural-network, pytorch

Solution

There are several mistakes:

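Missing optimizer.step():

optimizer.step() updates the parameters using the backpropagated gradients and any other accumulated state, such as momentum. Without it, loss.backward() only computes gradients and the weights never change.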
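As a minimal sketch of this point, the snippet below checks the weights of a small layer before and after each call. The one-layer model and the variable names are hypothetical and are not part of the original question.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical one-layer model, used only to illustrate the point.
layer = nn.Linear(2, 2)
opt = torch.optim.Adam(layer.parameters(), lr=0.001)

x = torch.tensor([[1.0, 0.0], [0.0, 1.0]])
target = torch.tensor([0, 1])

# Snapshot of the weights before any training step.
before = layer.weight.detach().clone()

opt.zero_grad()
loss = nn.functional.cross_entropy(layer(x), target)
loss.backward()

# backward() fills .grad but leaves the weights untouched.
unchanged_after_backward = torch.equal(layer.weight.detach(), before)

# step() is what actually applies the update.
opt.step()
changed_after_step = not torch.equal(layer.weight.detach(), before)

print("unchanged after backward():", unchanged_after_backward)
print("changed after step():", changed_after_step)
```

In the question's loop, only the first half of this sequence ever runs, so the printed loss is always that of the initial weights (it fluctuates only because of dropout).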

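Using softmax with CrossEntropyLoss:

PyTorch's CrossEntropyLoss criterion combines nn.LogSoftmax() and nn.NLLLoss() in a single class, i.e. it applies softmax and then takes the negative log. So in your case you are effectively computing softmax(softmax(output)). The correct approach is to use a linear output layer for training, and to apply a softmax layer, or just argmax, for prediction.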
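The following sketch illustrates the equivalence and the effect of the double softmax. The logits and targets are made-up values chosen only for illustration.

```python
import torch
import torch.nn as nn

# Made-up raw scores (logits) for two samples and two classes.
logits = torch.tensor([[2.0, -1.0], [0.5, 3.0]])
target = torch.tensor([0, 1])

criterion = nn.CrossEntropyLoss()

# CrossEntropyLoss on logits equals NLLLoss on log-softmax of the logits.
ce = criterion(logits, target)
manual = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), target)

# Passing probabilities instead of logits applies softmax a second time.
double = criterion(torch.softmax(logits, dim=1), target)

# Even perfectly confident probabilities cannot push the loss near 0:
# the inputs are confined to [0, 1], so with two classes the loss
# stays above log(1 + exp(-1)), which is roughly 0.313.
perfect_probs = torch.tensor([[1.0, 0.0], [0.0, 1.0]])
floor = criterion(perfect_probs, target)

print("CE on logits:      ", ce.item())
print("NLL(log_softmax):  ", manual.item())
print("CE on softmax out: ", double.item())
print("floor with softmax:", floor.item())
```

Because the second softmax squashes the differences between classes, the gradients also become small, which slows learning even once the optimizer step is in place.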

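High dropout value for a small network:

With only four units per hidden layer, dropping half of them at random on every forward pass leaves too little capacity, which leads to underfitting.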
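To see what a dropout probability of 0.5 does to a layer of four units, here is a small sketch. The all-ones input is an assumption made so that the zeroed entries are easy to spot.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

dropout = nn.Dropout(0.5)
x = torch.ones(1, 4)  # stand-in for the four hidden activations

# Training mode: each element is zeroed with probability 0.5,
# and the survivors are scaled by 1 / (1 - 0.5) = 2.
dropout.train()
train_out = dropout(x)

# Evaluation mode: dropout is a no-op.
dropout.eval()
eval_out = dropout(x)

print("train mode:", train_out)
print("eval mode: ", eval_out)
```

On average only two of the four hidden units are active in each forward pass during training, and in some passes every unit in a layer may be dropped.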

Here is the corrected code:

import torch
import torch.nn as nn

torch.manual_seed(1)

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.input = nn.Linear(2,4)
        self.hidden = nn.Linear(4,4)
        self.output = nn.Linear(4,2)
        self.relu = nn.ReLU()
        self.softmax = nn.Softmax(dim=1)
        # self.dropout = nn.Dropout(0.0)
    def forward(self,x):
        x = self.input(x)
        # x = self.dropout(x)
        x = self.relu(x)
        x = self.hidden(x)
        # x = self.dropout(x)
        x = self.relu(x)
        x = self.output(x)
        # x = self.softmax(x)
        return x

    def predict(self, x):
        with torch.no_grad():
            out = self.forward(x)
        return self.softmax(out)


X = torch.tensor([[1,0],[1,0],[0,1],[0,1]],dtype=torch.float)

net = Net()

criterion = nn.CrossEntropyLoss()

opt = torch.optim.Adam(net.parameters(), lr=0.001)


for i in range(100000):
    opt.zero_grad()
    y = net(X)
    loss = criterion(y,torch.argmax(X,dim=1))
    loss.backward()
    # This was missing before
    opt.step()
    if i%500 ==0:
        print("Epoch: ",i)
        pred = net.predict(X)
        print(f'prediction: {torch.argmax(pred, dim=1).detach().numpy().tolist()}, actual: {torch.argmax(X,dim=1)}')
        print("Loss: ", loss.item())

Output:

Epoch:  0
prediction: [0, 0, 0, 0], actual: tensor([0, 0, 1, 1])
Loss:  0.7042869329452515
Epoch:  500
prediction: [0, 0, 1, 1], actual: tensor([0, 0, 1, 1])
Loss:  0.1166711300611496
Epoch:  1000
prediction: [0, 0, 1, 1], actual: tensor([0, 0, 1, 1])
Loss:  0.05215628445148468
Epoch:  1500
prediction: [0, 0, 1, 1], actual: tensor([0, 0, 1, 1])
Loss:  0.02993333339691162
Epoch:  2000
prediction: [0, 0, 1, 1], actual: tensor([0, 0, 1, 1])
Loss:  0.01916157826781273
Epoch:  2500
prediction: [0, 0, 1, 1], actual: tensor([0, 0, 1, 1])
Loss:  0.01306679006665945
Epoch:  3000
prediction: [0, 0, 1, 1], actual: tensor([0, 0, 1, 1])
Loss:  0.009280549362301826
.
.
.
