How to implement a network in Keras that uses BERT as a paragraph encoder for long-text classification



I am working on a long-text classification task where each document contains more than 10,000 words. My plan is to use BERT as a paragraph encoder and then feed the sequence of paragraph embeddings into a BiLSTM. The network looks like this:

Input: (batch_size, max_paragraph_len, max_tokens_per_para, embedding_size)

BERT layer: (max_paragraph_len, paragraph_embedding_size)

LSTM layer: ???

Output layer: (batch_size, classification_size)

How can I implement this in Keras? I am loading the BERT model with keras-bert's load_trained_model_from_checkpoint:

from keras_bert import load_trained_model_from_checkpoint

bert_model = load_trained_model_from_checkpoint(
    config_path,
    model_path,
    training=False,
    use_adapter=True,
    trainable=['Encoder-{}-MultiHeadSelfAttention-Adapter'.format(i + 1) for i in range(layer_num)] +
              ['Encoder-{}-FeedForward-Adapter'.format(i + 1) for i in range(layer_num)] +
              ['Encoder-{}-MultiHeadSelfAttention-Norm'.format(i + 1) for i in range(layer_num)] +
              ['Encoder-{}-FeedForward-Norm'.format(i + 1) for i in range(layer_num)],
)
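What I have in mind is roughly the following minimal sketch. This is only my own attempt at the wiring, not working code: the zero segment IDs, the [CLS]-vector pooling, and all the sizes below are placeholders I made up.

from keras.layers import Input, Lambda, TimeDistributed, Bidirectional, LSTM, Dense
from keras.models import Model
from keras import backend as K

max_paragraph_len = 64     # paragraphs per document (placeholder)
max_tokens_per_para = 128  # tokens per paragraph (placeholder)
num_classes = 10           # classification_size (placeholder)

# Wrap bert_model into a single-input paragraph encoder: generate the
# segment IDs inside the model (all zeros for single-segment paragraphs)
# and keep only the [CLS] position as the paragraph embedding.
para_tokens = Input(shape=(max_tokens_per_para,), name='para_tokens')
para_segments = Lambda(lambda x: K.zeros_like(x))(para_tokens)
para_sequence = bert_model([para_tokens, para_segments])          # (batch, tokens, hidden)
para_embedding = Lambda(lambda seq: seq[:, 0, :])(para_sequence)  # (batch, hidden)
paragraph_encoder = Model(para_tokens, para_embedding)

# Apply the encoder to every paragraph, then run a BiLSTM over the
# resulting sequence of paragraph embeddings.
doc_tokens = Input(shape=(max_paragraph_len, max_tokens_per_para), name='doc_tokens')
para_embeddings = TimeDistributed(paragraph_encoder)(doc_tokens)  # (batch, paras, hidden)
doc_vector = Bidirectional(LSTM(128))(para_embeddings)            # (batch, 256)
output = Dense(num_classes, activation='softmax')(doc_vector)

doc_model = Model(doc_tokens, output)
doc_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

I am also worried about memory, since TimeDistributed runs the full BERT over every paragraph; that is why I only mark the adapter and norm weights as trainable above.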

I believe you can take a look at the article below. The author shows how to load a pre-trained BERT model, wrap it in a Keras layer, and use it in a custom deep neural network. First, install the TensorFlow 2.0 Keras implementation of google-research/bert:

pip install bert-for-tf2

Then run:

import bert
import os

def createBertLayer():
    global bert_layer

    bertDir = os.path.join(modelBertDir, "multi_cased_L-12_H-768_A-12")

    # Read the checkpoint's hyperparameters and build a Keras layer from them.
    bert_params = bert.params_from_pretrained_ckpt(bertDir)
    bert_layer = bert.BertModelLayer.from_params(bert_params, name="bert")

    # Freeze everything except the adapter and layer-normalization weights.
    bert_layer.apply_adapter_freeze()

def loadBertCheckpoint():
    modelsFolder = os.path.join(modelBertDir, "multi_cased_L-12_H-768_A-12")
    checkpointName = os.path.join(modelsFolder, "bert_model.ckpt")

    # Copy the pre-trained Google weights into the (already built) layer.
    bert.load_stock_weights(bert_layer, checkpointName)
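Here is how the two functions could then be used. This part is my own minimal sketch rather than the article's code: max_seq_len, num_classes, and the [CLS] pooling are placeholders, and note that load_stock_weights can only run after the layer has been built.

import tensorflow as tf
from tensorflow import keras

max_seq_len = 128  # placeholder
num_classes = 2    # placeholder

createBertLayer()

# Build a simple classifier on top of the BERT layer: take the [CLS]
# position of the sequence output and project it onto the classes.
input_ids = keras.layers.Input(shape=(max_seq_len,), dtype='int32', name='input_ids')
sequence_output = bert_layer(input_ids)  # (batch, seq_len, hidden)
cls_vector = keras.layers.Lambda(lambda seq: seq[:, 0, :])(sequence_output)
output = keras.layers.Dense(num_classes, activation='softmax')(cls_vector)

model = keras.Model(inputs=input_ids, outputs=output)
model.build(input_shape=(None, max_seq_len))

# The stock weights can only be loaded once the layer's variables exist.
loadBertCheckpoint()

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()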

Latest update