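How to freeze some layers of BERT when fine-tuning in tf2.keras

I am trying to fine-tune "bert-base-uncased" on a dataset for a text classification task. Here is how I download the model: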

import tensorflow as tf
from transformers import TFAutoModelForSequenceClassification, AutoTokenizer

num_labels = 13  # 13 classes: 768 * 13 + 13 = 9997, the classifier size in the summary below
model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=num_labels)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

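Since bert-base has 12 layers, I want to fine-tune only the last 2 layers to prevent overfitting. model.layers[i].trainable = False does not help, because model.layers[0] is the entire BERT base model: if I set its trainable attribute to False, all layers of BERT are frozen. Here is the architecture of the model: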

Model: "tf_bert_for_sequence_classification"
_________________________________________________________________
Layer (type)                Output Shape              Param #   
=================================================================
bert (TFBertMainLayer)      multiple                  109482240 

dropout_37 (Dropout)        multiple                  0         

classifier (Dense)          multiple                  9997      

=================================================================
Total params: 109,492,237
Trainable params: 109,492,237
Non-trainable params: 0
_________________________________________________________________

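I also wanted to use model.layers[0].weights[j]._trainable = False, but the weights list has 199 entries, so I could not figure out which weights belong to the last 2 layers. Can anyone help me solve this?

I found the answer and am sharing it here. Hope it helps someone. With the help of this article on fine-tuning BERT with PyTorch, the equivalent in tensorflow2.keras is: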

model.bert.encoder.layer[i].trainable = False

where i is the index of the layer to freeze.
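For the case in the question (train only the last 2 of the 12 encoder layers), a minimal sketch could look like the following. The helper name freeze_bert_layers and its parameters are made up for illustration. Freezing model.bert.embeddings is an extra step that is not part of the answer above, and it assumes the attribute exists as it does in the transformers TF BERT implementation.

```python
def freeze_bert_layers(model, num_trainable=2, freeze_embeddings=True):
    """Freeze all but the last `num_trainable` encoder layers of a TF BERT model.

    Hypothetical helper: assumes `model.bert.encoder.layer` is the list of
    encoder blocks and (if freeze_embeddings) that `model.bert.embeddings` exists.
    """
    encoder_layers = model.bert.encoder.layer
    if not 0 <= num_trainable <= len(encoder_layers):
        raise ValueError("num_trainable must be between 0 and the number of encoder layers")

    if freeze_embeddings:
        # Assumption: the embedding block is exposed as model.bert.embeddings.
        model.bert.embeddings.trainable = False

    # Layers with index below the cutoff are frozen, the rest stay trainable.
    cutoff = len(encoder_layers) - num_trainable
    for i, layer in enumerate(encoder_layers):
        layer.trainable = i >= cutoff
    return model
```

With the model from the question, this would be called as freeze_bert_layers(model, num_trainable=2) before model.compile(...). Afterwards model.summary() should report a nonzero "Non-trainable params" count; the pooler and the classifier head are left trainable.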
