I noticed that beam_search() returns the probability scores of the generated tokens. According to the documentation, beam_search = generate(do_sample=False, num_beams>1). In the small snippet below, however, beam_search and generate are not consistent.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, BeamSearchScorer
tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")
model.resize_token_embeddings(len(tokenizer))
model.to("cuda")
seq1 = "summarize: beamsearch and generate does not give the same result"
encoding = tokenizer(
    [seq1],
    padding="longest",
    max_length=128,
    truncation=True,
    return_tensors="pt",
)
encoder_input_ids, attention_mask = encoding.input_ids.to("cuda"), encoding.attention_mask.to("cuda")
num_beams = 2
# start the decoder from decoder_start_token_id, one row per beam
input_ids = torch.ones((num_beams, 1), device=model.device, dtype=torch.long)
input_ids = input_ids * model.config.decoder_start_token_id
model_kwargs = {
    "encoder_outputs": model.get_encoder()(
        encoder_input_ids.repeat_interleave(num_beams, dim=0), return_dict=True
    )
}
beam_scorer = BeamSearchScorer(
    batch_size=1,
    do_early_stopping=True,
    num_beams=num_beams,
    device=model.device,
)
outputs = model.beam_search(
    input_ids,
    beam_scorer,
    logits_processor=None,
    early_stopping=True,
    no_repeat_ngram_size=4,
    max_length=64,
    output_scores=True,
    return_dict_in_generate=True,
    **model_kwargs,
)
# beam_search result:
out = tokenizer.batch_decode(outputs.sequences, skip_special_tokens=True)
print(" ".join(out))
>> beamsearch() and generate() does not give the same result. beamsearch does not give the same result
# generate result:
out = model.generate(
    encoder_input_ids,
    max_length=64,
    early_stopping=True,
    num_beams=2,
    do_sample=False,
    num_return_sequences=1,
)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
>> ['beamsearch and generate does not give the same result. beamsearch does not provide the same result as beamsearch.']
# Remark 1: generate() and beam_search() do not give the same result.
# Remark 2: if I understand correctly, outputs.sequences should be recoverable from outputs.scores:
idx = []
for x in outputs.scores:
    i = x[0].exp().argmax().item()  # take the first beam, assuming beams are sorted
    idx.append(i)
idx = torch.tensor([idx]).to("cuda")
print(idx)  # this does NOT match outputs.sequences!
What am I missing here? My end goal is to get the log-probabilities of the tokens in outputs.sequences.
If you step through the source code of the generate() function and compare it with your beam_search() call, you will find that the two do not use the same logits_processor. In your beam_search() call you pass logits_processor=None, so it falls back to the default, which contains little more than a MinLengthLogitsProcessor (if anything). In generate(), by contrast, the logits_processor holds three processors: NoRepeatNGramLogitsProcessor, ForcedBOSTokenLogitsProcessor, and ForcedEOSTokenLogitsProcessor. This is the main reason beam_search() and generate() produce different results.
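If you want beam_search() to behave like generate(), you can build the processor list yourself and pass it in instead of None. A minimal sketch, assuming the NoRepeatNGramLogitsProcessor is the processor responsible for the difference here and reusing the no_repeat_ngram_size=4 you intended; the exact list (including the forced BOS/EOS processors mentioned above) depends on your model config and transformers version:

from transformers import LogitsProcessorList, NoRepeatNGramLogitsProcessor

# Rebuild (approximately) the processor list that generate() assembles internally.
# Append MinLengthLogitsProcessor / ForcedBOSTokenLogitsProcessor /
# ForcedEOSTokenLogitsProcessor the same way if your config enables them.
logits_processor = LogitsProcessorList([
    NoRepeatNGramLogitsProcessor(4),  # the no_repeat_ngram_size you meant to apply
])

outputs = model.beam_search(
    input_ids,
    beam_scorer,
    logits_processor=logits_processor,  # instead of None
    max_length=64,
    output_scores=True,
    return_dict_in_generate=True,
    **model_kwargs,
)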
In your code you already set output_scores=True in the beam_search() call, so the generated sequences come back with their associated scores. In the generate() call, however, you did not pass output_scores=True, so by default the returned sequences carry no scores. To get the scores of the generated tokens, access the scores attribute of the output object (together with return_dict_in_generate=True); from these you can compute the log-probabilities of the tokens in the sequence.
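As a concrete illustration of that last point: in beam search the rows of outputs.scores are reordered at every step as beams get swapped, which is why the argmax trick in Remark 2 does not reproduce outputs.sequences. Below is a minimal sketch of getting per-token log-probabilities, assuming a recent transformers version that provides compute_transition_scores() and returns beam_indices from generate():

out = model.generate(
    encoder_input_ids,
    max_length=64,
    num_beams=2,
    do_sample=False,
    num_return_sequences=1,
    output_scores=True,            # keep the per-step beam scores
    return_dict_in_generate=True,  # return a BeamSearchEncoderDecoderOutput
)

# beam_indices maps every generated token back to the beam it was taken from,
# so the per-token (transition) scores line up with out.sequences.
transition_scores = model.compute_transition_scores(
    out.sequences, out.scores, beam_indices=out.beam_indices, normalize_logits=False
)
print(transition_scores)     # shape: (num_return_sequences, generated_length)
print(out.sequences_scores)  # final (length-penalized) score of each returned sequence

On older versions without compute_transition_scores(), you would have to follow beam_indices through outputs.scores manually; summing the per-token transition scores (and applying the length penalty) should recover out.sequences_scores.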