I am loading a BERT model with the following snippet:
from transformers import BertModel, BertTokenizer

name = "bert-base-uncased"
print("[ Using pretrained BERT embeddings ]")
self.bert_tokenizer = BertTokenizer.from_pretrained(name, do_lower_case=lower_case)
self.bert_model = BertModel.from_pretrained(name)
if fix_emb:
    # Use BERT as a frozen feature extractor: eval mode, no gradient updates.
    print("[ Fix BERT layers ]")
    self.bert_model.eval()
    for param in self.bert_model.parameters():
        param.requires_grad = False
else:
    # Fine-tune the BERT encoder together with the rest of the model.
    print("[ Finetune BERT layers ]")
    self.bert_model.train()
But I get the following error:
Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
What is going wrong here?
This might help.
You are loading the bert-base-uncased checkpoint (a checkpoint trained with an architecture similar to BertForPreTraining, i.e. with the masked-LM and next-sentence-prediction heads on top of the encoder) into a plain BertModel.
This means:
The layers that BertForPreTraining has, but BertModel does not have, will be discarded (these are the cls.* prediction heads listed in the warning).
The layers that BertModel has, but BertForPreTraining does not have, will be randomly initialized; in your case there are none, which is why the warning only mentions unused weights (see the sketch below for a way to verify this).
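If you want to check this yourself, here is a minimal sketch (assuming the same bert-base-uncased checkpoint; the variable names are just for illustration) that compares the parameter names of a plain BertModel with those of the full BertForPreTraining model:

from transformers import BertModel, BertForPreTraining

encoder = BertModel.from_pretrained("bert-base-uncased")
pretraining = BertForPreTraining.from_pretrained("bert-base-uncased")

# BertForPreTraining nests the encoder under the "bert." prefix, so prefix
# the plain encoder's keys before comparing the two state dicts.
encoder_keys = {"bert." + k for k in encoder.state_dict()}
extra_keys = set(pretraining.state_dict()) - encoder_keys
print(sorted(extra_keys))  # essentially the cls.* prediction heads from the warning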
This is expected: the pretrained encoder weights themselves are loaded correctly, and the message is only telling you that the model will not perform well on a downstream task before it has been fine-tuned.
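If you want to avoid the message altogether, two common approaches are to load the checkpoint into the architecture it was actually trained with, or to lower the transformers logging verbosity. A rough sketch, again assuming the bert-base-uncased checkpoint:

from transformers import BertForPreTraining, BertModel, logging

# Option 1: load the full pretraining architecture; every checkpoint weight
# is consumed, so the "weights were not used" message goes away.
# The plain encoder is available as the .bert attribute.
encoder = BertForPreTraining.from_pretrained("bert-base-uncased").bert

# Option 2: keep using BertModel and suppress info/warning output from transformers.
logging.set_verbosity_error()
encoder = BertModel.from_pretrained("bert-base-uncased")

Note that set_verbosity_error() silences all transformers warnings, not just this one, so the first option is the cleaner choice if the message itself is what bothers you.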