python - Batch size keeps changing, throwing `PyTorch ValueError: Expected input batch size to match target batch size`
Problem description
I am working on a multi-label text classification task with BERT.
Below is the code that builds the iterable datasets.
from torch.utils.data import TensorDataset, DataLoader, RandomSampler, SequentialSampler

train_set = TensorDataset(X_train_id, X_train_attention, y_train)
test_set = TensorDataset(X_test_id, X_test_attention, y_test)

train_dataloader = DataLoader(
    train_set,
    sampler=RandomSampler(train_set),
    drop_last=True,
    batch_size=13
)
test_dataloader = DataLoader(
    test_set,
    sampler=SequentialSampler(test_set),
    drop_last=True,
    batch_size=13
)
Here are the dimensions of the training set:
In []:
print(X_train_id.shape)
print(X_train_attention.shape)
print(y_train.shape)
Out []:
torch.Size([262754, 512])
torch.Size([262754, 512])
torch.Size([262754, 34])
There should be 262754 rows with 512 columns each. The output should predict values from 34 possible labels. I split the data into batches of 13.
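A quick check on one batch from the train_dataloader above confirms the per-batch shapes:

batch = next(iter(train_dataloader))
print(batch[0].shape)   # input ids:      torch.Size([13, 512])
print(batch[1].shape)   # attention mask: torch.Size([13, 512])
print(batch[2].shape)   # labels:         torch.Size([13, 34])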
Training code
optimizer = AdamW(model.parameters(), lr=2e-5)

# Training
def train(model):
    model.train()
    train_loss = 0
    for batch in train_dataloader:
        b_input_ids = batch[0].to(device)
        b_input_mask = batch[1].to(device)
        b_labels = batch[2].to(device)
        optimizer.zero_grad()
        loss, logits = model(b_input_ids,
                             token_type_ids=None,
                             attention_mask=b_input_mask,
                             labels=b_labels)
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
        optimizer.step()
        train_loss += loss.item()
    return train_loss
# Testing
def test(model):
    model.eval()
    val_loss = 0
    with torch.no_grad():
        for batch in test_dataloader:
            b_input_ids = batch[0].to(device)
            b_input_mask = batch[1].to(device)
            b_labels = batch[2].to(device)
            loss, logits = model(b_input_ids,
                                 token_type_ids=None,
                                 attention_mask=b_input_mask,
                                 labels=b_labels)
            val_loss += loss.item()
    return val_loss
# Train task
max_epoch = 1
train_loss_ = []
test_loss_ = []
for epoch in range(max_epoch):
    train_ = train(model)
    test_ = test(model)
    train_loss_.append(train_)
    test_loss_.append(test_)
Out []:
Expected input batch_size (13) to match target batch_size (442).
Here is a description of my model:
from transformers import BertForSequenceClassification, AdamW, BertConfig

model = BertForSequenceClassification.from_pretrained(
    "cl-tohoku/bert-base-japanese-whole-word-masking",  # Japanese pretrained model
    num_labels = 34,
    output_attentions = False,
    output_hidden_states = False,
)
I explicitly set the batch size to 13, yet during training PyTorch throws the runtime error shown above. Where does the number 442 come from? I explicitly requested that each batch contain 13 rows.
I have confirmed that every batch has input_ids of size [13, 512], an attention tensor of size [13, 512], and labels of size [13, 34].
When initializing the DataLoader, I also tried a batch size of 442, but after one batch iteration it throws another "input batch size does not match target batch size" error, this time showing:
ValueError: Expected input batch_size (442) to match target batch_size (15028).
Why does the batch size keep changing? Where does the number 15028 come from?
Here are some answers I have looked through, with no luck applying them to my code:
Pytorch CNN error: Expected input batch_size (4) to match target batch_size (64)
Thanks in advance. Your support is much appreciated :)
Solution
According to the documentation, this model does not handle the multi-target scenario:

labels (torch.LongTensor of shape (batch_size,), optional) – Labels for computing the sequence classification/regression loss. Indices should be in [0, ..., config.num_labels - 1]. If config.num_labels == 1 a regression loss is computed (Mean-Square loss), if config.num_labels > 1 a classification loss is computed (Cross-Entropy).

So you need to prepare your labels with the shape torch.Size([batch_size]), each entry being a class index in the range [0, ..., config.num_labels - 1], just like for PyTorch's original CrossEntropyLoss (see its example section).

That is also where the mysterious numbers come from: the model flattens your [13, 34] label tensor before handing it to CrossEntropyLoss, and 13 × 34 = 442; likewise, with batch_size=442, the flattened target has 442 × 34 = 15028 entries.
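If each row of y_train is actually one-hot (exactly one true class per example), a minimal fix, assuming the tensors defined above, is to convert the one-hot rows to class indices before building the TensorDatasets:

# Assuming one-hot label rows: argmax recovers the class index
y_train = y_train.argmax(dim=1)   # torch.Size([262754]), values in [0, 33]
y_test = y_test.argmax(dim=1)

If the task is genuinely multi-label (several labels may be active at once), keep the [batch_size, 34] float labels and compute the loss yourself with BCEWithLogitsLoss instead of passing labels into the model. A sketch under that assumption, reusing the variables from the training loop (outputs[0] is the logits tensor whether the model returns a plain tuple or a ModelOutput):

import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()

# Call the model without labels so it returns logits instead of (loss, logits)
outputs = model(b_input_ids,
                token_type_ids=None,
                attention_mask=b_input_mask)
logits = outputs[0]                         # shape [13, 34]
loss = criterion(logits, b_labels.float())  # labels keep their [13, 34] shape

Recent versions of transformers can also handle this internally: passing problem_type="multi_label_classification" to from_pretrained makes the model apply BCEWithLogitsLoss itself, accepting float labels of shape (batch_size, num_labels).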