Confused about stacked tensor input to a multilayer perceptron in PyTorch

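Question

I have an input sequence of shape sequence_len x C x H x W = [10, 3, 16, 16] (assume batch size = 1). These are 10 images stacked in a single torch tensor. I want to pass them to an MLP and get the next 10 images from it as the prediction. The MLP has one hidden layer with 32 units. If I flatten the input from dimension 1, I get [10, 768].

My current code looks like this: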

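import torch.nn as nn

class MLP3(nn.Module):

    def __init__(self, ip_layers):
        super().__init__()
        # ip_layers: number of input features per sample
        # (768 for a flattened 3 x 16 x 16 image)
        self.layers = nn.Sequential(
            nn.Linear(ip_layers, 32),
            nn.ReLU(),
            nn.Linear(32, 32),
            nn.ReLU(),
            nn.Linear(32, 10),
        )

    def forward(self, x):
        # x: [N, ip_layers] -> [N, 10]
        return self.layers(x)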

However, I cannot pass the whole tensor, and I am not sure how to get 10 outputs from the MLP. Any help would be appreciated. TIA

Tags: python, deep-learning, pytorch, mlp

Answer


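To end up with 10 outputs, you have to use a multi-output (multi-head) model. For that, define a base_model in the __init__ function, modify the dataset and the forward function accordingly, and compute a loss for each output. Here is some boilerplate: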

import torch

class SDataset(torch.utils.data.Dataset):
  def __init__(self):
    ...  # same as before
  def __getitem__(self, idx):
    ...  # same as before: build the input X
    # Everything up to here is unchanged.
    # What changes is the target: return the next 10 images
    # (outimg1 ... outimg10) instead of a single one.
    outimgs = tuple(self.image_seq[idx + k] for k in range(1, 11))

    return X, outimgs
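
To make the dataset boilerplate above concrete, here is a minimal runnable sketch. It assumes the whole image sequence is already in memory as one tensor of shape [T, C, H, W], and that the input is a window of 10 frames as in the question. The class name `NextFramesDataset`, the `window` parameter, and the random frames are hypothetical and only for illustration.

```python
import torch
from torch.utils.data import Dataset


class NextFramesDataset(Dataset):
    """Hypothetical dataset: input is a window of frames, target is the next window."""

    def __init__(self, image_seq, window=10):
        # image_seq: tensor of shape [T, C, H, W], assumed to fit in memory
        self.image_seq = image_seq
        self.window = window

    def __len__(self):
        # Each sample needs `window` input frames plus `window` target frames.
        return self.image_seq.shape[0] - 2 * self.window + 1

    def __getitem__(self, idx):
        w = self.window
        x = self.image_seq[idx:idx + w]                 # [w, C, H, W]
        targets = self.image_seq[idx + w:idx + 2 * w]   # the next w frames
        # Return the targets as a tuple of w separate images, one per output head.
        return x, tuple(targets.unbind(0))


frames = torch.randn(25, 3, 16, 16)  # random stand-in for a real image sequence
dataset = NextFramesDataset(frames)
x, targets = dataset[0]
print(len(dataset), x.shape, len(targets), targets[0].shape)
```

Returning the targets as a tuple keeps one target per output head, which matches the per-head losses in the training step.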

The model

import torch

class Predictor(torch.nn.Module):
  def __init__(self, base_model, heads):
    super().__init__()
    # Shared feature extractor; its architecture is left open here.
    self.base_model = base_model
    # The 10 output heads (out1 ... out10). Each one takes the
    # base_model features as input and outputs one image.
    self.heads = torch.nn.ModuleList(heads)

  def forward(self, x):
    features = self.base_model(x)
    # Returns (res1, res2, ..., res10)
    return tuple(head(features) for head in self.heads)
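
One possible way to fill in the model boilerplate above for the shapes in the question is sketched below. It is an assumption, not the only reasonable design: the whole sequence is flattened into a single vector (10 * 768 = 7680 features) so that every head sees all 10 input frames, the shared trunk has one hidden layer of 32 units, and each head is a single linear layer whose output is reshaped to [3, 16, 16]. The name `MultiHeadMLP` and its parameters are hypothetical.

```python
import torch
from torch import nn


class MultiHeadMLP(nn.Module):
    """Hypothetical sketch: shared MLP trunk plus one linear head per predicted frame."""

    def __init__(self, seq_len=10, channels=3, height=16, width=16,
                 hidden=32, n_outputs=10):
        super().__init__()
        self.image_shape = (channels, height, width)
        frame_size = channels * height * width           # 3 * 16 * 16 = 768
        self.base_model = nn.Sequential(
            nn.Flatten(start_dim=1),                      # [B, 10, 3, 16, 16] -> [B, 7680]
            nn.Linear(seq_len * frame_size, hidden),
            nn.ReLU(),
        )
        # ModuleList registers every head so its parameters are trained.
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, frame_size) for _ in range(n_outputs)]
        )

    def forward(self, x):
        features = self.base_model(x)                     # [B, hidden]
        # Each head predicts one flattened image, reshaped back to [B, C, H, W].
        return tuple(head(features).view(-1, *self.image_shape)
                     for head in self.heads)


model = MultiHeadMLP()
batch = torch.randn(1, 10, 3, 16, 16)  # batch size 1, as in the question
preds = model(batch)
print(len(preds), preds[0].shape)
```

Keeping the batch dimension in front and flattening from dimension 1 is what lets the full stacked tensor go through `nn.Linear`, which only acts on the last dimension.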

Training step

inputs, outputs = batch
optimizer.zero_grad()
preds = model(inputs)
# One loss per output: preds[k] is compared with outputs[k], k = 0..9
losses = [criterion(pred, target) for pred, target in zip(preds, outputs)]
total_loss = sum(losses)
total_loss.backward()
optimizer.step()
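
For reference, the training step can be run end to end on random data as in the sketch below. Everything concrete in it is an assumption made for illustration: the tiny trunk and heads, the batch size of 4, `nn.MSELoss` as the criterion, and plain SGD with a small learning rate.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins: a tiny shared trunk and 10 linear heads.
base_model = nn.Sequential(nn.Flatten(start_dim=1), nn.Linear(10 * 768, 32), nn.ReLU())
heads = nn.ModuleList([nn.Linear(32, 768) for _ in range(10)])
params = list(base_model.parameters()) + list(heads.parameters())
optimizer = torch.optim.SGD(params, lr=1e-3)
criterion = nn.MSELoss()

inputs = torch.randn(4, 10, 3, 16, 16)                     # random batch of 4 sequences
outputs = [torch.randn(4, 3, 16, 16) for _ in range(10)]   # 10 random target frames


def train_step():
    optimizer.zero_grad()
    features = base_model(inputs)
    preds = [head(features).view(-1, 3, 16, 16) for head in heads]
    # One loss term per head, summed so a single backward pass
    # sends gradients to every head and to the shared trunk.
    total_loss = sum(criterion(p, t) for p, t in zip(preds, outputs))
    total_loss.backward()
    optimizer.step()
    return total_loss.item()


losses = [train_step() for _ in range(20)]
print(losses[0], losses[-1])
```

Summing the per-head losses weights every predicted frame equally; a weighted sum is an alternative if some prediction steps matter more than others.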

That covers the main part.

