Batch size keeps changing, raising "PyTorch ValueError: Expected input batch_size to match target batch_size"



I am performing a multi-label text classification task with BERT.

Below is the code used to build the iterable datasets.

from torch.utils.data import TensorDataset, DataLoader, RandomSampler, SequentialSampler

train_set = TensorDataset(X_train_id, X_train_attention, y_train)
test_set = TensorDataset(X_test_id, X_test_attention, y_test)

train_dataloader = DataLoader(
    train_set,
    sampler=RandomSampler(train_set),
    drop_last=True,
    batch_size=13
)
test_dataloader = DataLoader(
    test_set,
    sampler=SequentialSampler(test_set),
    drop_last=True,
    batch_size=13
)

Here are the dimensions of the training set:

In []:

print(X_train_id.shape)
print(X_train_attention.shape)
print(y_train.shape)

Out []:

torch.Size([262754, 512])
torch.Size([262754, 512])
torch.Size([262754, 34])

There should be 262754 rows, each with 512 columns. The output should predict values from 34 possible labels. I split the data into batches of 13.

Training code

optimizer = AdamW(model.parameters(), lr=2e-5)

# Training
def train(model):
    model.train()
    train_loss = 0
    for batch in train_dataloader:
        b_input_ids = batch[0].to(device)
        b_input_mask = batch[1].to(device)
        b_labels = batch[2].to(device)
        optimizer.zero_grad()
        loss, logits = model(b_input_ids,
                             token_type_ids=None,
                             attention_mask=b_input_mask,
                             labels=b_labels)
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
        optimizer.step()
        train_loss += loss.item()
    return train_loss

# Testing
def test(model):
    model.eval()
    val_loss = 0
    with torch.no_grad():
        for batch in test_dataloader:
            b_input_ids = batch[0].to(device)
            b_input_mask = batch[1].to(device)
            b_labels = batch[2].to(device)
            (loss, logits) = model(b_input_ids,
                                   token_type_ids=None,
                                   attention_mask=b_input_mask,
                                   labels=b_labels)
            val_loss += loss.item()
    return val_loss

# Train task
max_epoch = 1
train_loss_ = []
test_loss_ = []
for epoch in range(max_epoch):
    train_ = train(model)
    test_ = test(model)
    train_loss_.append(train_)
    test_loss_.append(test_)

Out []:

Expected input batch_size (13) to match target batch_size (442).

Here is my model description:

from transformers import BertForSequenceClassification, AdamW, BertConfig

model = BertForSequenceClassification.from_pretrained(
    "cl-tohoku/bert-base-japanese-whole-word-masking",  # Japanese pretrained model
    num_labels = 34,
    output_attentions = False,
    output_hidden_states = False,
)

I have explicitly specified that I want a batch size of 13, yet during training PyTorch throws the runtime error shown above.

Where does the number 442 come from? I explicitly specified that each batch should have 13 rows.

I have confirmed that each batch contains input_ids of dimension [13, 512], attention tensors of dimension [13, 512], and labels of dimension [13, 34].
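A minimal sketch of such a shape check, reusing the train_dataloader defined above, might look like this:

# Sketch: pull one batch from the DataLoader and print the tensor shapes.
input_ids, attention_mask, labels = next(iter(train_dataloader))
print(input_ids.shape)       # torch.Size([13, 512])
print(attention_mask.shape)  # torch.Size([13, 512])
print(labels.shape)          # torch.Size([13, 34])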

I tried giving in and initializing the DataLoader with a batch size of 442, but after one batch iteration it threw another "PyTorch ValueError: Expected input batch size to match target batch size", this time showing:

ValueError: Expected input batch_size (442) to match target batch_size (15028).

Why does the batch size keep changing? Where does the number 15028 come from?

Here are some answers I have looked at, but I had no luck applying them to my code:

https://discuss.pytorch.org/t/valueerror-expected-input-batch-size-324-to-match-target-batch-size-4/24498

https://discuss.pytorch.org/t/valueerror-expected-input-batch-size-1-to-match-target-batch-size-64/43071

PyTorch CNN error: Expected input batch_size (4) to match target batch_size (64)

Thanks in advance. Your support is much appreciated :)

According to the documentation, the model does not appear to handle the multi-target scenario:

labels (torch.LongTensor of shape (batch_size,), optional) – Labels for computing the sequence classification/regression loss. Indices should be in [0, ..., config.num_labels - 1]. If config.num_labels == 1 a regression loss is computed (Mean-Square loss), if config.num_labels > 1 a classification loss is computed (Cross-Entropy).
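That description matches plain torch.nn.CrossEntropyLoss. Inside BertForSequenceClassification the loss is computed roughly as loss_fct(logits.view(-1, num_labels), labels.view(-1)), so a [13, 34] multi-hot label tensor is flattened into 13 * 34 = 442 targets while there are only 13 rows of logits. A minimal sketch reproducing the mismatch (the tensors here are dummies with the shapes from the post):

import torch
import torch.nn as nn

num_labels = 34
logits = torch.randn(13, num_labels)                     # one batch of model outputs
labels = torch.zeros(13, num_labels, dtype=torch.long)   # multi-hot labels, shape [13, 34]

loss_fct = nn.CrossEntropyLoss()
# labels.view(-1) has 13 * 34 = 442 entries, but there are only 13 logit rows:
# "Expected input batch_size (13) to match target batch_size (442)."
# (With a batch size of 442 the same flattening gives 442 * 34 = 15028.)
try:
    loss_fct(logits.view(-1, num_labels), labels.view(-1))
except ValueError as e:
    print(e)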

So you need to prepare labels of shape torch.Size([batch_size]), holding class indices in the range [0, ..., config.num_labels - 1], just as for the plain PyTorch CrossEntropyLoss (see the examples section there).
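If each example really has exactly one correct class, the conversion could look like the following sketch (this assumes y_train is one-hot encoded, so argmax recovers the class index per row; y_train_labels is a hypothetical name):

import torch
from torch.utils.data import TensorDataset

# Sketch only: recover class indices from one-hot rows of shape [262754, 34].
y_train_labels = y_train.argmax(dim=1)   # shape [262754], values in 0..33
train_set = TensorDataset(X_train_id, X_train_attention, y_train_labels)

If several labels can be active per example, this conversion does not apply, because the model's built-in cross-entropy loss is not designed for genuinely multi-label targets.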
