Why can't the BERT model find an option matching the positional arguments I passed in?



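While working through an NLP exercise, I tried to use the BERT architecture to get a well-trained model. I defined a function that uses BERT as a layer to build and compile the model. When I call the function to actually build the model, however, I get an error saying that the BERT layer could not find an option matching the positional arguments I passed in.

My positional arguments have shape [None, 160], but the BERT layer seems to expect [None, None]. How do I fix this?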

To reproduce my problem:

These are the libraries I imported:

import tensorflow as tf
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Model
import tensorflow_hub as hub

Next, I defined a function for the model, as follows:

# Build and compile the model
def build_model(bert_layer, max_len = 512):
    input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    input_mask = Input(shape=(max_len,), dtype=tf.int32, name="input_mask")
    segment_ids = Input(shape=(max_len,), dtype=tf.int32, name="segment_ids")

    pooled_output, sequence_output = bert_layer([input_word_ids, input_mask, segment_ids])
    clf_output = sequence_output[:, 0, :]
    out = Dense(1, activation='sigmoid')(clf_output)

    model = Model(inputs=[input_word_ids, input_mask, segment_ids], outputs=out)
    model.compile(Adam(lr=1e-5), loss='binary_crossentropy', metrics=['accuracy'])

    return model

Then I downloaded the BERT architecture and instantiated bert_layer, as follows:

module_url = "https://tfhub.dev/tensorflow/bert_en_uncased_L-24_H-1024_A-16/4"
bert_layer = hub.KerasLayer(module_url, trainable=True)

Finally, I tried to build the model with the build_model function and bert_layer, as follows:

model = build_model(bert_layer, max_len=160)
model.summary()

But this returned an error, which I take to mean that the dimensions of my inputs differ from the ones required. The error is shown below:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-42-516b88804394> in <module>
----> 1 model = build_model(bert_layer, max_len=160)
      2 model.summary()

<ipython-input-41-713013238e2f> in build_model(bert_layer, max_len)
      6     segment_ids = Input(shape=(max_len,), dtype=tf.int32, name="segment_ids")
      7 
----> 8     pooled_output, sequence_output = bert_layer([input_word_ids, input_mask, segment_ids])
      9     clf_output = sequence_output[:, 0, :]
     10     out = Dense(1, activation='sigmoid')(clf_output)

~\Anaconda3\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py in __call__(self, inputs, *args, **kwargs)
    840                     not base_layer_utils.is_in_eager_or_tf_function()):
    841                   with auto_control_deps.AutomaticControlDependencies() as acd:
--> 842                     outputs = call_fn(cast_inputs, *args, **kwargs)
    843                     # Wrap Tensors in `outputs` in `tf.identity` to avoid
    844                     # circular dependencies.

~\Anaconda3\lib\site-packages\tensorflow_core\python\autograph\impl\api.py in wrapper(*args, **kwargs)
    235       except Exception as e:  # pylint:disable=broad-except
    236         if hasattr(e, 'ag_error_metadata'):
--> 237           raise e.ag_error_metadata.to_exception(e)
    238         else:
    239           raise

ValueError: in converted code:
    relative to C:\Users\Wolemercy\Anaconda3\lib\site-packages:

    tensorflow_hub\keras_layer.py:237 call  *
        result = smart_cond.smart_cond(training,
    tensorflow_core\python\framework\smart_cond.py:59 smart_cond
        name=name)
    tensorflow_core\python\saved_model\load.py:436 _call_attribute
        return instance.__call__(*args, **kwargs)
    tensorflow_core\python\eager\def_function.py:457 __call__
        result = self._call(*args, **kwds)
    tensorflow_core\python\eager\def_function.py:494 _call
        results = self._stateful_fn(*args, **kwds)
    tensorflow_core\python\eager\function.py:1822 __call__
        graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
    tensorflow_core\python\eager\function.py:2150 _maybe_define_function
        graph_function = self._create_graph_function(args, kwargs)
    tensorflow_core\python\eager\function.py:2041 _create_graph_function
        capture_by_value=self._capture_by_value),
    tensorflow_core\python\framework\func_graph.py:915 func_graph_from_py_func
        func_outputs = python_func(*func_args, **func_kwargs)
    tensorflow_core\python\eager\def_function.py:358 wrapped_fn
        return weak_wrapped_fn().__wrapped__(*args, **kwds)
    tensorflow_core\python\saved_model\function_deserialization.py:262 restored_function_body
        "\n\n".join(signature_descriptions)))

ValueError: Could not find matching function to call loaded from the SavedModel. Got:
Positional arguments (3 total):
* [<tf.Tensor 'inputs:0' shape=(None, 160) dtype=int32>, <tf.Tensor 'inputs_1:0' shape=(None, 160) dtype=int32>, <tf.Tensor 'inputs_2:0' shape=(None, 160) dtype=int32>]
* True
* None
Keyword arguments: {}

Expected these arguments to match one of the following 4 option(s):

Option 1:
Positional arguments (3 total):
* {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask')}
* False
* None
Keyword arguments: {}

Option 2:
Positional arguments (3 total):
* {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_mask')}
* False
* None
Keyword arguments: {}

Option 3:
Positional arguments (3 total):
* {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_mask')}
* True
* None
Keyword arguments: {}

Option 4:
Positional arguments (3 total):
* {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask')}
* True
* None
Keyword arguments: {}

I expected the model to build and compile successfully. Instead, I got this error.

First, you need the BERT preprocessor:

bert_preprocessor = hub.load("https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")

This gives you input_word_ids, input_mask, and input_type_ids (the segment IDs). You only need to pass your text to bert_preprocessor.
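To make those three outputs concrete, here is a rough plain-Python imitation of what packing a single tokenized segment produces. It is only a sketch: the real preprocessor works on TensorFlow tensors and can handle several segments. The function name pack_single_segment and the example token ids are made up for illustration, and the special-token ids (0, 101, 102) are assumed to be [PAD], [CLS], and [SEP] in the uncased English vocabulary.

```python
# Assumed special-token ids for the uncased English BERT vocabulary.
PAD_ID, CLS_ID, SEP_ID = 0, 101, 102


def pack_single_segment(token_ids, seq_length):
    """Imitate packing one segment into fixed-length encoder inputs.

    Hypothetical helper for illustration; seq_length must be at least 2.
    """
    # Reserve two positions for [CLS] and [SEP], truncating the text if needed.
    body = list(token_ids)[: seq_length - 2]
    word_ids = [CLS_ID] + body + [SEP_ID]
    n_real = len(word_ids)
    n_pad = seq_length - n_real
    return {
        # Token ids, padded on the right up to seq_length.
        "input_word_ids": word_ids + [PAD_ID] * n_pad,
        # 1 marks a real token, 0 marks padding.
        "input_mask": [1] * n_real + [0] * n_pad,
        # A single segment means every position belongs to segment 0.
        "input_type_ids": [0] * seq_length,
    }


# Two arbitrary example token ids, packed to length 8.
packed = pack_single_segment([7592, 2088], seq_length=8)
for name, values in packed.items():
    print(name, values)
```

Every value has exactly seq_length entries, which is why the encoder can accept a whole batch as one rectangular tensor per key.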

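Then add your BERT model as a KerasLayer (pass trainable=True so that its weights are updated during fine-tuning):

bert_model = hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_L-24_H-1024_A-16/4", trainable=True)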
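Note that every option listed in the error message expects a dict keyed by input_word_ids, input_mask, and input_type_ids, whereas the question passes a list. The shapes themselves are not the problem: [None, None] accepts any length, including 160. The toy check below illustrates that distinction in plain Python. It is a simplified model written for this answer, not TensorFlow's actual matching code, and the helper names are hypothetical.

```python
def shape_compatible(actual, spec):
    """A None in the spec accepts any size; otherwise sizes must be equal."""
    return len(actual) == len(spec) and all(
        s is None or a == s for a, s in zip(actual, spec)
    )


def matches(arg, spec):
    """Toy check: the argument must be a dict with the same keys as the spec
    and a compatible shape under every key."""
    return (
        isinstance(arg, dict)
        and arg.keys() == spec.keys()
        and all(shape_compatible(arg[key], spec[key]) for key in spec)
    )


# What the encoder signature asks for, reduced to shapes only.
EXPECTED = {
    name: (None, None)
    for name in ("input_word_ids", "input_mask", "input_type_ids")
}

as_list = [(None, 160)] * 3                         # what the question passes
as_dict = {name: (None, 160) for name in EXPECTED}  # what the encoder accepts

print(shape_compatible((None, 160), (None, None)))  # True: the shape is fine
print(matches(as_list, EXPECTED))                   # False: a list, not a dict
print(matches(as_dict, EXPECTED))                   # True
```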

As for fine-tuning your model:

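def bert_funtional_API(seq_length):
    # One string input per text segment; a single segment is used here.
    text_input = [tf.keras.layers.Input(shape=(), dtype=tf.string)]
    # Split each raw string into wordpiece token ids.
    tokenize = hub.KerasLayer(bert_preprocessor.tokenize)
    tokenized_inputs = [tokenize(segment) for segment in text_input]
    # Pack the tokens into fixed-length input_word_ids, input_mask and input_type_ids.
    bert_pack_inputs = hub.KerasLayer(bert_preprocessor.bert_pack_inputs,
                                      arguments=dict(seq_length=seq_length))
    encoder_inputs = bert_pack_inputs(tokenized_inputs)
    # The encoder takes the packed dict and returns a dict of outputs.
    bert_outputs = bert_model(encoder_inputs)
    pooled_output = bert_outputs['pooled_output']      # [batch, hidden]
    sequence_output = bert_outputs['sequence_output']  # [batch, seq_length, hidden]
    # Classify from the pooled output so there is one prediction per example.
    output = Dense(1, activation='sigmoid')(pooled_output)
    model = Model(inputs=text_input, outputs=output)
    model.compile(Adam(learning_rate=1e-5), loss='binary_crossentropy', metrics=['accuracy'])
    return model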
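If you would rather keep feeding pre-tokenized ids as in the question, a possible variant of the original build_model is sketched below. It assumes, based on the options listed in the error message, that the encoder wants a dict with the keys input_word_ids, input_mask, and input_type_ids, and that it returns a dict containing sequence_output. Treat it as a starting point rather than a verified fix.

```python
import tensorflow as tf


def build_model(bert_layer, max_len=512):
    """Binary classifier over pre-tokenized inputs, fed to the encoder as a dict."""
    # One int32 input per key named in the encoder's expected signature.
    encoder_inputs = {
        name: tf.keras.layers.Input(shape=(max_len,), dtype=tf.int32, name=name)
        for name in ("input_word_ids", "input_mask", "input_type_ids")
    }
    # Assumption: the layer returns a dict of outputs, not a tuple.
    outputs = bert_layer(encoder_inputs)
    # Use the [CLS] position of the sequence output, as in the question.
    clf_output = outputs["sequence_output"][:, 0, :]
    out = tf.keras.layers.Dense(1, activation="sigmoid")(clf_output)

    model = tf.keras.Model(inputs=encoder_inputs, outputs=out)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

With the bert_layer from the question, the call stays model = build_model(bert_layer, max_len=160). The training data then has to be passed as a dict with the same three keys.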

