Model predictions in PyTorch are always 0 (model training doesn't work)

Problem description

I am trying to train an MLP for a classification task in PyTorch (two classes with labels 0 and 1). However, my model always predicts class label 0, which leads to low accuracy. Why is this happening? Here is my code:

import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader, TensorDataset

X_train = torch.rand((200, 3), dtype=torch.float32)
y_train = torch.randint(low=0, high=2, size=(200, 1))
y_train = y_train.float()  # BCELoss expects float targets; torch.tensor(tensor) would emit a copy warning

X_test = torch.rand((100, 3), dtype=torch.float32)
y_test = torch.randint(low=0, high=2, size=(100, 1))
y_test = y_test.float()


dataset_train = TensorDataset(X_train, y_train)
dataset_test = TensorDataset(X_test, y_test)

train_loader = DataLoader(dataset_train, batch_size=50, shuffle=True)
test_loader = DataLoader(dataset_test, batch_size=50, shuffle=True)

class MLP(nn.Module):
    def __init__(self):
        super(MLP, self).__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, 10),
            nn.ReLU(),
            nn.Linear(10, 10),
            nn.ReLU(),
            nn.Linear(10, 1),
            nn.Sigmoid()
        )

    def forward(self, x):
        out = self.mlp(x)
        return out

model = MLP()

optimizer = torch.optim.Adam(model.parameters())  # optionally pass a learning rate here, e.g. lr=0.001 (is that the initial learning rate?)

criterion = torch.nn.BCELoss()  # if you just want to use plain BCE loss

for current_epoch in range(100):
    for batch_num, input_data in enumerate(train_loader):
        optimizer.zero_grad()
        x, y = input_data
        x = x.float()
        output = model(x)
        loss = criterion(output, y)
        loss.backward()
        optimizer.step()

When I make predictions and print them, I can see that my model always predicts 0:

with torch.no_grad():
    predictions = model(X_train)
    predictions = predictions.to(torch.long)
    y_pred = predictions.numpy()
    y_true = y_train.numpy()

for i in range(len(y_true)):
    a = y_pred[i] == y_true[i]
    print(i, y_pred[i], y_true[i], a)

Why does this happen, and how can I fix it?

Tags: python, machine-learning, pytorch, classification

Solution


The way you generate your training and test data is likely the problem: you draw the feature tensors and the labels independently from uniform distributions, so there is essentially no real pattern/relationship between the labels and the feature vectors the MLP is trying to learn. By always outputting 0, the model is correct about half the time anyway.
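As a side note (not part of the original answer): the prediction code in the question casts the raw sigmoid outputs to `torch.long`, which truncates every probability in (0, 1) down to 0, so the printed predictions will be 0 even when the model outputs values near 0.5. A minimal sketch of the difference, with made-up probability values, thresholding at 0.5 as is conventional:

```python
import torch

# example sigmoid outputs; on pattern-free data they hover around 0.5
probs = torch.tensor([[0.49], [0.51], [0.73], [0.12]])

# casting to long truncates toward zero: everything in (0, 1) becomes 0
truncated = probs.to(torch.long)
print(truncated.flatten().tolist())  # [0, 0, 0, 0]

# thresholding at 0.5 produces actual class decisions
labels = (probs > 0.5).long()
print(labels.flatten().tolist())  # [0, 1, 1, 0]
```

With proper thresholding the accuracy on random labels would sit near 50% rather than pinned to whatever fraction of the labels happens to be 0.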

Try the same MLP with new test data, for example samples drawn from 2 multivariate Gaussian distributions:

from torch.distributions.multivariate_normal import MultivariateNormal
# create 2 multivariate gaussian distributions
m1 = MultivariateNormal(torch.tensor([3.,3.,3.]),torch.eye(3) * .9)
m2 = MultivariateNormal(torch.tensor([1.,1.,1.]),torch.eye(3) * .5)

x_train = torch.vstack([m1.sample((200,)), m2.sample((200,))])
y_train = torch.vstack([torch.zeros(200, 1), torch.ones(200, 1)])  # labels shaped (N, 1) to match the model's output

x_test = torch.vstack([m1.sample((100,)), m2.sample((100,))])
y_test = torch.vstack([torch.zeros(100, 1), torch.ones(100, 1)])

...

Check whether your network predicts the correct classes for this kind of test data. You can also probe its behaviour by adjusting the distance between the means of the two distributions and seeing whether the accuracy increases/decreases accordingly.
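To make that check concrete, here is a self-contained sketch that trains on the Gaussian data and prints the test accuracy. The architecture, optimizer, and loss are copied from the question (the `MLP` class is inlined as an `nn.Sequential`); the epoch count, batch size, and seed are arbitrary choices:

```python
import torch
import torch.nn as nn
from torch.distributions.multivariate_normal import MultivariateNormal
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# two well-separated Gaussian clusters, as in the answer above
m1 = MultivariateNormal(torch.tensor([3., 3., 3.]), torch.eye(3) * .9)
m2 = MultivariateNormal(torch.tensor([1., 1., 1.]), torch.eye(3) * .5)

x_train = torch.vstack([m1.sample((200,)), m2.sample((200,))])
y_train = torch.vstack([torch.zeros(200, 1), torch.ones(200, 1)])
x_test = torch.vstack([m1.sample((100,)), m2.sample((100,))])
y_test = torch.vstack([torch.zeros(100, 1), torch.ones(100, 1)])

# same architecture as the MLP class in the question
model = nn.Sequential(
    nn.Linear(3, 10), nn.ReLU(),
    nn.Linear(10, 10), nn.ReLU(),
    nn.Linear(10, 1), nn.Sigmoid(),
)
optimizer = torch.optim.Adam(model.parameters())
criterion = nn.BCELoss()

loader = DataLoader(TensorDataset(x_train, y_train), batch_size=50, shuffle=True)
for epoch in range(100):
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()

# threshold the sigmoid outputs at 0.5 to get class labels
with torch.no_grad():
    preds = (model(x_test) > 0.5).float()
    accuracy = (preds == y_test).float().mean().item()
print(f"test accuracy: {accuracy:.2f}")
```

On data this separable the accuracy should end up far above the ~50% seen with random labels, and it should degrade gracefully as you move the two means closer together.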
