调试TensorFlow服务于BERT模型



我能够按照以下示例使用BERT嵌入部署NLP模型(在CPU和tensorflow模型服务器上使用TF 1.14.0(:https://mc.ai/how-to-ship-machine-learning-models-into-production-with-tensorflow-serving-and-kubernetes/

模型描述非常清晰:

!saved_model_cli show --dir {'tf_bert_model/1'} --all
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
inputs['Input-Segment:0'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 64)
name: Input-Segment:0
inputs['Input-Token:0'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 64)
name: Input-Token:0
The given SavedModel SignatureDef contains the following output(s):
outputs['dense/Softmax:0'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 2)
name: dense/Softmax:0
Method name is: tensorflow/serving/predict

服务模型的数据输入格式是字典列表:

data
'{"instances": [{"Input-Token:0": [101, 101, 1962, 7770, 1069, 102, 102, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], "Input-Segment:0": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]}]}'
r = requests.post("http://127.0.0.1:8501/v1/models/tf_bert_model:predict",
json=data)

我现在正尝试使用TF2.1、HuggingFace转换器库和GPU部署BERT模型,但部署的模型返回400错误或200错误,我不知道如何调试它。我怀疑这可能是数据输入格式问题。

我的型号描述更混乱:

2020-03-20 14:47:03.465762: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2020-03-20 14:47:03.465883: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib64-nvidia
2020-03-20 14:47:03.465900: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
signature_def['__saved_model_init_op']:
The given SavedModel SignatureDef contains the following input(s):
The given SavedModel SignatureDef contains the following output(s):
outputs['__saved_model_init_op'] tensor_info:
dtype: DT_INVALID
shape: unknown_rank
name: NoOp
Method name is: 
signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
inputs['attention_mask'] tensor_info:
dtype: DT_INT32
shape: (-1, 128)
name: serving_default_attention_mask:0
inputs['input_ids'] tensor_info:
dtype: DT_INT32
shape: (-1, 128)
name: serving_default_input_ids:0
inputs['labels'] tensor_info:
dtype: DT_INT32
shape: (-1, 1)
name: serving_default_labels:0
inputs['token_type_ids'] tensor_info:
dtype: DT_INT32
shape: (-1, 128)
name: serving_default_token_type_ids:0
The given SavedModel SignatureDef contains the following output(s):
outputs['output_1'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 2)
name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/resource_variable_ops.py:1786: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Defined Functions:
Function Name: '__call__'
Option #1
Callable with:
Argument #1
DType: dict
Value: {'input_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs/input_ids'), 'token_type_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs/token_type_ids'), 'attention_mask': TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs/attention_mask'), 'labels': TensorSpec(shape=(None, 1), dtype=tf.int32, name='inputs/labels')}
Named Argument #1
DType: str
Value: ['t', 'r', 'a', 'i', 'n', 'i', 'n', 'g']
Option #2
Callable with:
Argument #1
DType: dict
Value: {'input_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='input_ids'), 'token_type_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='token_type_ids'), 'attention_mask': TensorSpec(shape=(None, 128), dtype=tf.int32, name='attention_mask'), 'labels': TensorSpec(shape=(None, 1), dtype=tf.int32, name='labels')}
Named Argument #1
DType: str
Value: ['t', 'r', 'a', 'i', 'n', 'i', 'n', 'g']
Option #3
Callable with:
Argument #1
DType: dict
Value: {'input_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='input_ids'), 'token_type_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='token_type_ids'), 'attention_mask': TensorSpec(shape=(None, 128), dtype=tf.int32, name='attention_mask'), 'labels': TensorSpec(shape=(None, 1), dtype=tf.int32, name='labels')}
Named Argument #1
DType: str
Value: ['t', 'r', 'a', 'i', 'n', 'i', 'n', 'g']
Option #4
Callable with:
Argument #1
DType: dict
Value: {'labels': TensorSpec(shape=(None, 1), dtype=tf.int32, name='inputs/labels'), 'input_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs/input_ids'), 'token_type_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs/token_type_ids'), 'attention_mask': TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs/attention_mask')}
Named Argument #1
DType: str
Value: ['t', 'r', 'a', 'i', 'n', 'i', 'n', 'g']
Function Name: '_default_save_signature'
Option #1
Callable with:
Argument #1
DType: dict
Value: {'input_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='input_ids'), 'token_type_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='token_type_ids'), 'attention_mask': TensorSpec(shape=(None, 128), dtype=tf.int32, name='attention_mask'), 'labels': TensorSpec(shape=(None, 1), dtype=tf.int32, name='labels')}
Function Name: 'call_and_return_all_conditional_losses'
Option #1
Callable with:
Argument #1
DType: dict
Value: {'input_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='input_ids'), 'token_type_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='token_type_ids'), 'attention_mask': TensorSpec(shape=(None, 128), dtype=tf.int32, name='attention_mask'), 'labels': TensorSpec(shape=(None, 1), dtype=tf.int32, name='labels')}
Named Argument #1
DType: str
Value: ['t', 'r', 'a', 'i', 'n', 'i', 'n', 'g']
Option #2
Callable with:
Argument #1
DType: dict
Value: {'token_type_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='token_type_ids'), 'attention_mask': TensorSpec(shape=(None, 128), dtype=tf.int32, name='attention_mask'), 'labels': TensorSpec(shape=(None, 1), dtype=tf.int32, name='labels'), 'input_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='input_ids')}
Named Argument #1
DType: str
Value: ['t', 'r', 'a', 'i', 'n', 'i', 'n', 'g']
Option #3
Callable with:
Argument #1
DType: dict
Value: {'labels': TensorSpec(shape=(None, 1), dtype=tf.int32, name='inputs/labels'), 'input_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs/input_ids'), 'token_type_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs/token_type_ids'), 'attention_mask': TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs/attention_mask')}
Named Argument #1
DType: str
Value: ['t', 'r', 'a', 'i', 'n', 'i', 'n', 'g']
Option #4
Callable with:
Argument #1
DType: dict
Value: {'labels': TensorSpec(shape=(None, 1), dtype=tf.int32, name='inputs/labels'), 'input_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs/input_ids'), 'token_type_ids': TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs/token_type_ids'), 'attention_mask': TensorSpec(shape=(None, 128), dtype=tf.int32, name='inputs/attention_mask')}
Named Argument #1
DType: str
Value: ['t', 'r', 'a', 'i', 'n', 'i', 'n', 'g']

我也将我的数据输入格式化为字典列表:

data = {"instances": test_deploy_inputs2}
data
{'instances': [{'attention_mask': [1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
1,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0],
'input_ids': [101,
1999,
5688,
1010,
12328,
5845,
2007,
5423,
3593,
28991,
19362,
4588,
4244,
4820,
12553,
12987,
10737,
2008,
23150,
14719,
1011,
20802,
3662,
2896,
3798,
1997,
17953,
14536,
2509,
1998,
6335,
1011,
1015,
29720,
1998,
2020,
11914,
5123,
2013,
6388,
2135,
10572,
27441,
7315,
1012,
102,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0],
'labels': 0,
'token_type_ids': [0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0,
0]}]}

当测试部署的模型时,我得到了一个200的错误:

r = requests.post("http://127.0.0.1:8501/v1/models/fashion_model:predict",
json=data)
r
<Response [200]>

你知道我如何调试这个吗?感谢

我的错!响应[200]并不意味着它不起作用,你可以用看到结果

predictions = json.loads(json_response.text)['predictions']
predictions

最新更新