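python - Data type conflict in Python Catalyst
Problem description
I have used the catalyst library before to fit image classification models without any problems. Now I want to use it on "tabular" data, but I seem to be doing something wrong. Here is a snippet of the dataset: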
      agent_1_feat_0  agent_1_feat_1  ...       E.T  target
0               58.8            85.1  ...  0.837398    True
1               44.8            71.1  ...  0.789474    True
2               46.3            70.8  ...  0.891566   False
3               50.2            77.5  ...  0.505263   False
4               44.9            75.0  ...  0.943396    True
...              ...             ...  ...       ...     ...
2275            55.1            82.8  ...  0.582090   False
2276            49.0            78.0  ...  0.943396    True
2277            46.4            76.5  ...  0.735714    True
2278            57.9            85.2  ...  0.837398    True
2279            43.9            70.0  ...  0.850877    True
[2280 rows x 98 columns]
Then I convert it into DataLoader objects:
import torch
from torch.utils.data import DataLoader, TensorDataset
from sklearn.model_selection import train_test_split

# a1 is the DataFrame shown above (97 feature columns plus 'target')
train, val = train_test_split(a1, test_size=0.67)

# Training split: targets and features are both cast to float32
tt = torch.tensor(train['target'].values.astype('float32'))
train = torch.tensor(train.drop(columns='target').values.astype('float32'))
train_a1 = DataLoader(dataset=TensorDataset(train, tt), batch_size=10, shuffle=True)

# Validation split, built the same way
vt = torch.tensor(val['target'].values.astype('float32'))
val = torch.tensor(val.drop(columns='target').values.astype('float32'))
val_a1 = DataLoader(dataset=TensorDataset(val, vt), batch_size=10, shuffle=True)

len(train_a1), len(val_a1)  # (76, 153)
Here is the model:
import torch.nn as nn

torch.manual_seed(69)

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin = nn.Sequential(
            nn.Linear(in_features=train.shape[1], out_features=train.shape[1]),
            nn.Linear(in_features=train.shape[1], out_features=50),
            nn.BatchNorm1d(50),
            nn.Linear(in_features=50, out_features=30),
            nn.Linear(in_features=30, out_features=30),
            nn.Linear(in_features=30, out_features=30),
            nn.Linear(in_features=30, out_features=15),
            nn.Linear(in_features=15, out_features=15),
            nn.Linear(in_features=15, out_features=5),
            nn.BatchNorm1d(5),
            nn.Dropout(p=.2),
            nn.ReLU(inplace=True),
        )
        self.out = nn.Sequential(
            nn.Linear(in_features=5, out_features=2),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        # x = x.view(x.size(0), -1)
        print('new', x.shape)  # torch.Size([10, 97])
        x = self.lin(x)
        print('lin', x.shape)  # torch.Size([10, 5])
        x = self.out(x)
        print('out', x.shape)  # torch.Size([10, 2])
        return x

model = Net()
Now I create the runner and start training:
import torch.optim as optim
from catalyst import dl

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

runner = dl.SupervisedRunner(input_key="features", output_key="logits", target_key="targets", loss_key="loss")
runner.train(
    model=model,
    criterion=criterion,
    optimizer=optimizer,
    loaders={"train": train_a1, "valid": val_a1},
    num_epochs=1,
    callbacks=[
        dl.AccuracyCallback(input_key="logits", target_key="targets", topk_args=(1, 2)),
        dl.PrecisionRecallF1SupportCallback(
            input_key="logits", target_key="targets", num_classes=2
        ),
    ],
    logdir="./logs",
    valid_loader="valid",
    valid_metric="loss",
    minimize_valid_metric=True,
    verbose=False,
    load_best_on_end=True,
    seed=69,
)
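Everything works up to this point, but when I run the last piece of code I get this error: RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'target' in call to _thnn_nll_loss_forward. I tried to fix it by passing dtype=torch.long to torch.tensor, but then it complains that the data type is Long :). How can I solve this? I suspect the cause is CrossEntropyLoss, since it does not seem to be the best choice for one-dimensional data, but I don't know how to fix it.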
Solution
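No answer was recorded here, so what follows is only a hedged sketch of the usual cause. nn.CrossEntropyLoss expects class-index targets of type torch.long (int64), while the features passed to nn.Linear must stay float32. In the question's code that would correspond to casting only the target column with astype('int64') and leaving the feature cast unchanged. The later complaint about Long may have come from converting the feature tensor as well, but that is a guess, since the question does not show that attempt. The tensors below (features, target_bool) are hypothetical stand-ins for the DataFrame columns, and a single nn.Linear layer stands in for Net.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for one batch of the question's data:
# 10 rows, 97 float features, and a boolean target column.
features = torch.randn(10, 97)                  # float32, as nn.Linear expects
target_bool = torch.tensor([True, False] * 5)   # plays the role of the 'target' column

# Assumed fix: CrossEntropyLoss wants class indices as int64 (torch.long).
targets = target_bool.to(torch.long)

# A single linear layer stands in for the question's Net (two output logits).
logits = nn.Linear(97, 2)(features)             # shape [10, 2]
loss = nn.CrossEntropyLoss()(logits, targets)   # scalar loss

# 1-D float targets are rejected, which matches the error in the question.
try:
    nn.CrossEntropyLoss()(logits, target_bool.to(torch.float32))
    float_targets_rejected = False
except RuntimeError:
    float_targets_rejected = True

# Alternative sketch for a binary target: one output logit with
# BCEWithLogitsLoss, which does take float targets.
single_logit = nn.Linear(97, 1)(features).squeeze(1)   # shape [10]
bce_loss = nn.BCEWithLogitsLoss()(single_logit, target_bool.to(torch.float32))
```

With the first variant the rest of the question's setup (two output logits, num_classes=2 in the callbacks) can stay as it is. The BCEWithLogitsLoss variant changes the output shape, so the metric callbacks in the question would likely need adjusting.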