TensorFlow Serving: sending a dict of multiple inputs to a TFServing model via the REST API



I am serving a BERT model with TFServing and want to extract the hidden layers via the REST API. When I use the model in Google Colab, I can run inference with:

inputs = {
    "input_ids": input_ids,
    "attention_mask": input_mask,
    "token_type_ids": input_type_ids
}
test_output = model(inputs)

I then save the model like this:

tf.saved_model.save(model, model_save_path)

Inspecting the saved model with saved_model_cli gives the following output:

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
signature_def['__saved_model_init_op']:
The given SavedModel SignatureDef contains the following input(s):
The given SavedModel SignatureDef contains the following output(s):
outputs['__saved_model_init_op'] tensor_info:
dtype: DT_INVALID
shape: unknown_rank
name: NoOp
Method name is:
signature_def['serving_default']:
The given SavedModel SignatureDef contains the following input(s):
inputs['input_ids'] tensor_info:
dtype: DT_INT32
shape: (-1, 5)
name: serving_default_input_ids:0
The given SavedModel SignatureDef contains the following output(s):
outputs['hidden_states_1'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 5, 768)
name: StatefulPartitionedCall:0
outputs['hidden_states_10'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 5, 768)
name: StatefulPartitionedCall:1
outputs['hidden_states_11'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 5, 768)
name: StatefulPartitionedCall:2
outputs['hidden_states_12'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 5, 768)
name: StatefulPartitionedCall:3
outputs['hidden_states_13'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 5, 768)
name: StatefulPartitionedCall:4
outputs['hidden_states_2'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 5, 768)
name: StatefulPartitionedCall:5
outputs['hidden_states_3'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 5, 768)
name: StatefulPartitionedCall:6
outputs['hidden_states_4'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 5, 768)
name: StatefulPartitionedCall:7
outputs['hidden_states_5'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 5, 768)
name: StatefulPartitionedCall:8
outputs['hidden_states_6'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 5, 768)
name: StatefulPartitionedCall:9
outputs['hidden_states_7'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 5, 768)
name: StatefulPartitionedCall:10
outputs['hidden_states_8'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 5, 768)
name: StatefulPartitionedCall:11
outputs['hidden_states_9'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 5, 768)
name: StatefulPartitionedCall:12
outputs['last_hidden_state'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 5, 768)
name: StatefulPartitionedCall:13
outputs['pooler_output'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 768)
name: StatefulPartitionedCall:14
Method name is: tensorflow/serving/predict
Defined Functions:
Function Name: '__call__'
Option #1
Callable with:
Argument #1
DType: dict
Value: {'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids/input_ids')}
Argument #2
DType: NoneType
Value: None
Argument #3
DType: NoneType
Value: None
Argument #4
DType: NoneType
Value: None
Argument #5
DType: NoneType
Value: None
Argument #6
DType: NoneType
Value: None
Argument #7
DType: NoneType
Value: None
Argument #8
DType: NoneType
Value: None
Argument #9
DType: NoneType
Value: None
Argument #10
DType: bool
Value: True
Option #2
Callable with:
Argument #1
DType: dict
Value: {'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids/input_ids')}
Argument #2
DType: NoneType
Value: None
Argument #3
DType: NoneType
Value: None
Argument #4
DType: NoneType
Value: None
Argument #5
DType: NoneType
Value: None
Argument #6
DType: NoneType
Value: None
Argument #7
DType: NoneType
Value: None
Argument #8
DType: NoneType
Value: None
Argument #9
DType: NoneType
Value: None
Argument #10
DType: bool
Value: False
Function Name: '_default_save_signature'
Option #1
Callable with:
Argument #1
DType: dict
Value: {'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids')}
Function Name: 'call_and_return_all_conditional_losses'
Option #1
Callable with:
Argument #1
DType: dict
Value: {'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids/input_ids')}
Argument #2
DType: NoneType
Value: None
Argument #3
DType: NoneType
Value: None
Argument #4
DType: NoneType
Value: None
Argument #5
DType: NoneType
Value: None
Argument #6
DType: NoneType
Value: None
Argument #7
DType: NoneType
Value: None
Argument #8
DType: NoneType
Value: None
Argument #9
DType: NoneType
Value: None
Argument #10
DType: bool
Value: True
Option #2
Callable with:
Argument #1
DType: dict
Value: {'input_ids': TensorSpec(shape=(None, 5), dtype=tf.int32, name='input_ids/input_ids')}
Argument #2
DType: NoneType
Value: None
Argument #3
DType: NoneType
Value: None
Argument #4
DType: NoneType
Value: None
Argument #5
DType: NoneType
Value: None
Argument #6
DType: NoneType
Value: None
Argument #7
DType: NoneType
Value: None
Argument #8
DType: NoneType
Value: None
Argument #9
DType: NoneType
Value: None
Argument #10
DType: bool
Value: False
Function Name: 'serving'
Option #1
Callable with:
Argument #1
DType: dict
Value: {'input_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids'), 'attention_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='attention_mask'), 'token_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='token_type_ids')}

For the API call, I structure the inputs in the request to match what the model expects during normal inference (following the TFServing docs at https://www.tensorflow.org/tfx/serving/api_rest):

inference_url = "http://localhost:8501/v1/models/<my_model_name>:predict"
data = {
    "instances": [{
        "input_ids": input_ids.numpy().tolist(),
        "attention_mask": attention_mask.numpy().tolist(),
        "token_type_ids": token_type_id.numpy().tolist()
    }]
}
headers = {"content-type": "application/json"}
response = requests.post(inference_url, headers=headers, data=json.dumps(data))
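For reference, the TF Serving REST API also accepts a column ("inputs") format in addition to the row ("instances") format above; either way, every key must match a named input of the serving signature. A minimal sketch of the column format, with made-up token ids and the network call commented out:

```python
import json

# import requests  # uncomment to actually send the request

# Row format ("instances"): one dict per example.
# Column format ("inputs"): one key per named input, values batched along axis 0.
# The token ids below are invented for illustration only.
payload = {
    "signature_name": "serving_default",
    "inputs": {
        "input_ids": [[101, 7592, 2088, 102, 0]],
        "attention_mask": [[1, 1, 1, 1, 0]],
        "token_type_ids": [[0, 0, 0, 0, 0]],
    },
}
body = json.dumps(payload)
# response = requests.post(
#     "http://localhost:8501/v1/models/<my_model_name>:predict",
#     data=body, headers={"content-type": "application/json"})
```

Note that switching formats alone does not fix the error below: if the signature only declares input_ids, both formats will reject attention_mask.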

The problem I am facing is that when I call the API endpoint:

/v1/models/<my_model_name>:predict 

the model does not seem to expect the parameters "attention_mask" and "token_type_ids".

Even though the "Function Name: 'serving'" part of the model output suggests it should expect all three of "input_ids", "attention_mask" and "token_type_ids", I still get the following error from the REST API:

{
"error": "Failed to process element: 0 key: attention_mask of 'instances' list. Error: Invalid argument: JSON object: does not have named input: attention_mask"
}

It seems to me this is related to the SignatureDef. It looks like the saved model really only expects "input_ids", even though the actual model I saved in Google Colab does expect a dict containing all three of "input_ids", "attention_mask" and "token_type_ids".
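As a sanity check, TF Serving's model metadata endpoint reports exactly which named inputs the deployed signature accepts. In the sketch below the GET request is commented out and replaced by an illustrative response, trimmed to the relevant fields and matching what saved_model_cli reported above:

```python
import json

# import requests  # uncomment to query the running server
metadata_url = "http://localhost:8501/v1/models/<my_model_name>/metadata"
# resp = requests.get(metadata_url).json()

# Illustrative response: only input_ids made it into the signature.
resp = {
    "metadata": {"signature_def": {"signature_def": {
        "serving_default": {
            "inputs": {"input_ids": {"dtype": "DT_INT32"}},
            "method_name": "tensorflow/serving/predict",
        }
    }}}
}

sig = resp["metadata"]["signature_def"]["signature_def"]["serving_default"]
print(sorted(sig["inputs"].keys()))  # which inputs the REST endpoint accepts
```

If attention_mask and token_type_ids are missing here, no request format will make the server accept them; the signature itself has to be fixed at save time.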

Did I save the model incorrectly? Can anyone tell me what I am doing wrong?

Thanks a lot!

Please check the link and format the input data accordingly: Debugging TensorFlow serving on a BERT model

Try the following snippet for the API call:

import json
import requests

inference_url = "http://localhost:8501/v1/models/<my_model_name>:predict"
data = json.dumps({
    "signature_name": "serving_default",
    "instances": [{
        "input_ids": input_ids.numpy().tolist(),
        "attention_mask": attention_mask.numpy().tolist(),
        "token_type_ids": token_type_id.numpy().tolist()
    }]
})
headers = {"content-type": "application/json"}
response = requests.post(inference_url, data=data, headers=headers)
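That said, the error message points at the SavedModel itself: serving_default only declares input_ids, so the server rejects the other keys regardless of request format. Passing an explicit signature to tf.saved_model.save should fix this. Here is a minimal, self-contained sketch of the mechanism using a toy tf.Module in place of BERT (the computation is a dummy stand-in; only the signature handling is the point):

```python
import tempfile

import tensorflow as tf


class ToyModel(tf.Module):
    # A tf.function whose input_signature is a dict of all three tensors,
    # mirroring the 'serving' function shown in the saved_model_cli output.
    @tf.function(input_signature=[{
        "input_ids": tf.TensorSpec((None, None), tf.int32, name="input_ids"),
        "attention_mask": tf.TensorSpec((None, None), tf.int32, name="attention_mask"),
        "token_type_ids": tf.TensorSpec((None, None), tf.int32, name="token_type_ids"),
    }])
    def serving(self, inputs):
        # Dummy computation standing in for BERT's forward pass.
        masked = inputs["input_ids"] * inputs["attention_mask"]
        return {"sum": tf.reduce_sum(masked)}


model = ToyModel()
export_dir = tempfile.mkdtemp()
# Passing the function explicitly makes all three inputs part of
# serving_default, instead of only input_ids.
tf.saved_model.save(model, export_dir,
                    signatures={"serving_default": model.serving})

out = model.serving({
    "input_ids": tf.constant([[1, 2, 3]]),
    "attention_mask": tf.constant([[1, 1, 0]]),
    "token_type_ids": tf.constant([[0, 0, 0]]),
})
print(int(out["sum"]))  # 1*1 + 2*1 + 3*0 = 3
```

For the HuggingFace model in the question, the same idea would be something like tf.saved_model.save(model, model_save_path, signatures={"serving_default": model.serving}), since the "Defined Functions" section above already lists a 'serving' function that takes all three inputs (the exact attribute name may depend on the transformers version). After re-saving and re-deploying, saved_model_cli should list all three inputs under serving_default.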
