Python Catalyst data type conflict

Problem description

I have used the catalyst library before to fit image classification models without any problems, but now I want to use it for tabular data and it seems I am doing something wrong. Here is a snippet of the dataset:

      agent_1_feat_0  agent_1_feat_1  ...       E.T  target
0               58.8            85.1  ...  0.837398    True
1               44.8            71.1  ...  0.789474    True
2               46.3            70.8  ...  0.891566   False
3               50.2            77.5  ...  0.505263   False
4               44.9            75.0  ...  0.943396    True
...              ...             ...  ...       ...     ...
2275            55.1            82.8  ...  0.582090   False
2276            49.0            78.0  ...  0.943396    True
2277            46.4            76.5  ...  0.735714    True
2278            57.9            85.2  ...  0.837398    True
2279            43.9            70.0  ...  0.850877    True

[2280 rows x 98 columns]

Then I convert it into DataLoaders:

import torch
from torch.utils.data import DataLoader, TensorDataset
from sklearn.model_selection import train_test_split

train, val = train_test_split(a1, test_size=0.67)

tt = torch.tensor(train['target'].values.astype('float32'))
train = torch.tensor(train.drop(columns= 'target').values.astype('float32'))  
train_a1 = DataLoader(dataset = TensorDataset(train, tt), batch_size = 10, shuffle = True)

vt = torch.tensor(val['target'].values.astype('float32'))
val = torch.tensor(val.drop(columns= 'target').values.astype('float32'))  
val_a1 = DataLoader(dataset = TensorDataset(val, vt), batch_size = 10, shuffle = True)

len(train_a1), len(val_a1) # (76, 153)
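
For reference, a quick check of the tensors built above (a small sketch reusing the same train/tt/val/vt variables) shows that both the features and the targets come out as float32, which becomes relevant for the loss function later:

# dtype sanity check on the tensors created above
print(train.dtype, tt.dtype)  # torch.float32 torch.float32
print(val.dtype, vt.dtype)    # torch.float32 torch.float32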

Here is the model:

import torch.nn as nn
torch.manual_seed(69)

class Net(nn.Module):
    def __init__(self):
        super().__init__()

        self.lin = nn.Sequential(
            nn.Linear(in_features= train.shape[1], out_features= train.shape[1]),
            nn.Linear(in_features= train.shape[1], out_features= 50),
            nn.BatchNorm1d(50),
            nn.Linear(in_features= 50, out_features= 30),
            nn.Linear(in_features= 30, out_features= 30),
            nn.Linear(in_features= 30, out_features= 30),
            nn.Linear(in_features= 30, out_features= 15),
            nn.Linear(in_features= 15, out_features= 15),
            nn.Linear(in_features= 15, out_features= 5),
            nn.BatchNorm1d(5),
            nn.Dropout(p= .2),
            nn.ReLU(inplace=True), 
        )

        self.out = nn.Sequential(
            nn.Linear(in_features= 5, out_features= 2),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        # x = x.view(x.size(0), -1)
        print('new', x.shape) # torch.Size([10, 97])
        x = self.lin(x)
        print('lin', x.shape) # torch.Size([10, 5])
        x = self.out(x)
        print('out', x.shape) # torch.Size([10, 2])
        
        return x

model = Net()
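
As a quick sanity check of the architecture (a sketch using a random batch of the same shape the loaders yield, 10 rows by 97 feature columns), a dummy forward pass confirms that the layer dimensions line up with the shapes printed in forward:

import torch

dummy = torch.randn(10, 97)   # fake batch: 98 columns minus the target
model.eval()                  # BatchNorm1d uses running stats in eval mode
with torch.no_grad():
    logits = model(dummy)
print(logits.shape)           # torch.Size([10, 2])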

Now I create the runner:

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

from catalyst import dl

runner = dl.SupervisedRunner(input_key="features", output_key="logits", target_key="targets", loss_key="loss")
runner.train(
    model= model,
    criterion= criterion,
    optimizer= optimizer,
    loaders= {"train": train_a1, "valid": val_a1},
    num_epochs= 1,
    callbacks=[
        dl.AccuracyCallback(input_key="logits", target_key="targets", topk_args=(1, 2)),
        dl.PrecisionRecallF1SupportCallback(
            input_key="logits", target_key="targets", num_classes=2
        ),
    ],
    logdir="./logs",
    valid_loader="valid",
    valid_metric="loss",
    minimize_valid_metric=True,
    verbose=False,
    load_best_on_end=True,
    seed= 69,
)

So far everything works, but when I try to execute this last block of code I get the error: RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'target' in call to _thnn_nll_loss_forward. I tried to fix it by setting dtype=torch.long when calling torch.tensor, but then it complained that the data type was Long :). How can I fix this? I suspect the problem is CrossEntropyLoss, since it does not seem to be the best choice for one-dimensional targets, but I do not know how to work around it.

Tags: python, machine-learning, pytorch, catalyst

Solution
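
The error message points at the targets: nn.CrossEntropyLoss expects integer class indices of dtype torch.long with shape [batch_size], while the loaders above feed it float32 values. A minimal sketch of the fix is shown below; the names train_df and val_df are placeholders for the two DataFrame splits, since the original snippet reuses train and val for the tensors. Only the target column is cast to int64, the features stay float32:

import torch
from torch.utils.data import DataLoader, TensorDataset

# Targets become 0/1 class indices (torch.long); features stay float32.
tt = torch.tensor(train_df['target'].values.astype('int64'))
train_x = torch.tensor(train_df.drop(columns='target').values.astype('float32'))
train_a1 = DataLoader(TensorDataset(train_x, tt), batch_size=10, shuffle=True)

vt = torch.tensor(val_df['target'].values.astype('int64'))
val_x = torch.tensor(val_df.drop(columns='target').values.astype('float32'))
val_a1 = DataLoader(TensorDataset(val_x, vt), batch_size=10, shuffle=True)

Note that only the target tensor should be cast; if the feature tensor is also converted to long, the Linear layers will raise a dtype error of their own, which is probably what happened in the earlier attempt. The model and the runner.train call can stay as they are, since the two-logit output already matches what CrossEntropyLoss expects for binary classification.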

