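Prediction failed: Error processing input: Expected string, got dict

I have followed the TensorFlow getting-started tutorial (https://www.tensorflow.org/get_started/get_started_for_beginners) and made a few small changes to the code to adapt it to my application. The feature columns for my case are as follows: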

transaction_column = tf.feature_column.categorical_column_with_vocabulary_list(key='Transaction', vocabulary_list=["buy", "rent"])
localization_column = tf.feature_column.categorical_column_with_vocabulary_list(key='Localization', vocabulary_list=["barcelona", "girona"])
dimensions_feature_column = tf.feature_column.numeric_column("Dimensions")
buy_price_feature_column = tf.feature_column.numeric_column("BuyPrice")
rent_price_feature_column = tf.feature_column.numeric_column("RentPrice")
my_feature_columns = [
    tf.feature_column.indicator_column(transaction_column),
    tf.feature_column.indicator_column(localization_column),
    tf.feature_column.bucketized_column(source_column=dimensions_feature_column,
                                        boundaries=[50, 75, 100]),
    tf.feature_column.numeric_column(key='Rooms'),
    tf.feature_column.numeric_column(key='Toilets'),
    tf.feature_column.bucketized_column(source_column=buy_price_feature_column,
                                        boundaries=[1, 180000, 200000, 225000, 250000, 275000, 300000]),
    tf.feature_column.bucketized_column(source_column=rent_price_feature_column,
                                        boundaries=[1, 700, 1000, 1300])
]
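
As a rough illustration of what the bucket boundaries above mean, here is a pure-Python sketch of the bucketing rule as the TensorFlow documentation describes it: a list of N boundaries yields N+1 half-open buckets. The helper `bucket_index` is hypothetical and only mimics the rule; it is not part of TensorFlow.

```python
import bisect


def bucket_index(value, boundaries):
    """Return the index of the bucket that `value` falls into.

    Buckets are half-open: (-inf, b0), [b0, b1), ..., [b_last, +inf).
    """
    return bisect.bisect_right(boundaries, value)


# Boundaries copied from the feature columns above.
dimensions_boundaries = [50, 75, 100]
rent_price_boundaries = [1, 700, 1000, 1300]

print(bucket_index(90, dimensions_boundaries))    # bucket [75, 100)
print(bucket_index(1100, rent_price_boundaries))  # bucket [1000, 1300)
print(bucket_index(0, rent_price_boundaries))     # bucket below 1
```

With these boundaries, a price of 0 (used here for "not applicable") lands in its own bucket below 1, which seems to be the intent of starting the lists at 1.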

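After that, I saved the model so that I could use it for predictions in Cloud ML Engine. To export the model, I added the following code (after evaluating the model):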

feature_spec = tf.feature_column.make_parse_example_spec(my_feature_columns)
export_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)
servable_model_dir = "modeloutput"
servable_model_path = classifier.export_savedmodel(servable_model_dir, export_input_fn)

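After running the code, I got the proper model files in the "modeloutput" directory, and I created the model in the cloud (as explained in https://cloud.google.com/ml-engine/docs/tensorflow/getting-started-training-prediction#deploy_a_model_to_support_prediction, "Deploy a model to support prediction").

Once the model version was created, I tried to launch an online prediction with it by running the following command in Cloud Shell: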

gcloud ml-engine predict --model $MODEL_NAME --version v1 --json-instances ../prediction.json

where $MODEL_NAME is the name of my model and prediction.json is a JSON file with the following content:

{"inputs":[
{
"Transaction":"rent",
"Localization":"girona",
"Dimensions":90,
"Rooms":4,
"Toilets":2,
"BuyPrice":0,
"RentPrice":1100
}
]
}

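However, the prediction fails, and I get the following error message:

"error": "Prediction failed: Error processing input: Expected string, got {u'BuyPrice': 0, u'Transaction': u'rent', u'Rooms': 4, u'Localization': u'girona', u'Toilets': 2, u'RentPrice': 1100, u'Dimensions': 90} of type 'dict' instead."

The error is clear: a string is expected instead of a dictionary. If I check my SavedModel SignatureDef, I get the following information: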

The given SavedModel SignatureDef contains the following input(s):
inputs['inputs'] tensor_info:
dtype: DT_STRING
shape: (-1)
name: input_example_tensor:0
The given SavedModel SignatureDef contains the following output(s):
outputs['classes'] tensor_info:
dtype: DT_STRING
shape: (-1, 12)
name: dnn/head/Tile:0
outputs['scores'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 12)
name: dnn/head/predictions/probabilities:0
Method name is: tensorflow/serving/classify

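It is clear that the dtype expected for the input is a string (DT_STRING), but I don't know how to format my input data so that the prediction succeeds. I have tried writing the input JSON in many different ways, but I always get an error. If I look at how prediction is performed in the tutorial (https://www.tensorflow.org/get_started/get_started_for_beginners), it seems clear that the predict input is passed as a dictionary (predict_x in the tutorial code).

So, where am I going wrong? How can I make a prediction with this input data?

Thanks for your time.

EDIT BASED ON THE ANSWER ------

Following the second suggestion from @Lak, I updated the code that exports the model, so it now looks like this: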

export_input_fn = serving_input_fn
servable_model_dir = "savedmodeloutput"
servable_model_path = classifier.export_savedmodel(servable_model_dir,
                                                   export_input_fn)
...
def serving_input_fn():
    feature_placeholders = {
        'Transaction': tf.placeholder(tf.string, [None]),
        'Localization': tf.placeholder(tf.string, [None]),
        'Dimensions': tf.placeholder(tf.float32, [None]),
        'Rooms': tf.placeholder(tf.int32, [None]),
        'Toilets': tf.placeholder(tf.int32, [None]),
        'BuyPrice': tf.placeholder(tf.float32, [None]),
        'RentPrice': tf.placeholder(tf.float32, [None])
    }
    features = {
        key: tf.expand_dims(tensor, -1)
        for key, tensor in feature_placeholders.items()
    }
    return tf.estimator.export.ServingInputReceiver(features, feature_placeholders)

After that, I created a new model and fed it the following JSON to get a prediction:

{
"Transaction":"rent",
"Localization":"girona",
"Dimensions":90.0,
"Rooms":4,
"Toilets":2,
"BuyPrice":0.0,
"RentPrice":1100.0
}

Note that I removed "inputs" from the JSON structure, because I was getting the error "Unexpected tensor name: inputs" when making the prediction. However, now I get a new, uglier error:

"error": "Prediction failed: Error during model execution: AbortionError(code=StatusCode.INVALID_ARGUMENT, details=\"NodeDef mentions attr 'T' not in Op<name=Where; signature=input:bool -> index:int64>; NodeDef: dnn/input_from_feature_columns/input_layer/Transaction_indicator/to_sparse_input/indices = Where[T=DT_BOOL, _output_shapes=[[?,2]], _device=\"/job:localhost/replica:0/task:0/device:CPU:0\"]. (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).\n\t [[Node: dnn/input_from_feature_columns/input_layer/Transaction_indicator/to_sparse_input/indices = Where[T=DT_BOOL, _output_shapes=[[?,2]], _device=\"/job:localhost/replica:0/task:0/device:CPU:0\"]]]\")"

I checked the SignatureDef again and got the following information:

The given SavedModel SignatureDef contains the following input(s):
inputs['Toilets'] tensor_info:
dtype: DT_INT32
shape: (-1)
name: Placeholder_4:0
inputs['Rooms'] tensor_info:
dtype: DT_INT32
shape: (-1)
name: Placeholder_3:0
inputs['Localization'] tensor_info:
dtype: DT_STRING
shape: (-1)
name: Placeholder_1:0
inputs['RentPrice'] tensor_info:
dtype: DT_FLOAT
shape: (-1)
name: Placeholder_6:0
inputs['BuyPrice'] tensor_info:
dtype: DT_FLOAT
shape: (-1)
name: Placeholder_5:0
inputs['Dimensions'] tensor_info:
dtype: DT_FLOAT
shape: (-1)
name: Placeholder_2:0
inputs['Transaction'] tensor_info:
dtype: DT_STRING
shape: (-1)
name: Placeholder:0
The given SavedModel SignatureDef contains the following output(s):
outputs['class_ids'] tensor_info:
dtype: DT_INT64
shape: (-1, 1)
name: dnn/head/predictions/ExpandDims:0
outputs['classes'] tensor_info:
dtype: DT_STRING
shape: (-1, 1)
name: dnn/head/predictions/str_classes:0
outputs['logits'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 12)
name: dnn/logits/BiasAdd:0
outputs['probabilities'] tensor_info:
dtype: DT_FLOAT
shape: (-1, 12)
name: dnn/head/predictions/probabilities:0
Method name is: tensorflow/serving/predict
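
A quick client-side sanity check can rule out a malformed instance before blaming the export. The sketch below compares a JSON instance against the input names and dtypes listed in the SignatureDef above. `EXPECTED_TYPES` and `check_instance` are hypothetical names, not part of gcloud or TensorFlow, and the check is deliberately strict (an integer supplied for a DT_FLOAT input is flagged, even though the service may accept it).

```python
# Python types expected per input, transcribed from the SignatureDef above
# (DT_STRING -> str, DT_INT32 -> int, DT_FLOAT -> float).
EXPECTED_TYPES = {
    "Transaction": str,
    "Localization": str,
    "Dimensions": float,
    "Rooms": int,
    "Toilets": int,
    "BuyPrice": float,
    "RentPrice": float,
}


def check_instance(instance):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for key in EXPECTED_TYPES:
        if key not in instance:
            problems.append("missing input: " + key)
    for key, value in instance.items():
        if key not in EXPECTED_TYPES:
            problems.append("unexpected input: " + key)
        elif type(value) is not EXPECTED_TYPES[key]:
            problems.append("wrong type for " + key + ": " + type(value).__name__)
    return problems


instance = {
    "Transaction": "rent", "Localization": "girona", "Dimensions": 90.0,
    "Rooms": 4, "Toilets": 2, "BuyPrice": 0.0, "RentPrice": 1100.0,
}
print(check_instance(instance))
```

An instance that passes this check and still fails on the server points away from the JSON and toward the exported graph or the serving environment.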

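Did I get one of the steps wrong? Thanks.

LAST UPDATE

I ran a local prediction, and it executed successfully, returning the expected prediction results. The command used: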

gcloud ml-engine local predict --model-dir $MODEL_DIR --json-instances=../prediction.json

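where MODEL_DIR is the directory containing the files generated during model training. So the problem seems to be in the export of the model: somehow the model that is exported and later used for online prediction is not right. I have read that the TensorFlow version could be the root of the problem, but I don't understand why. Isn't all of my code executed with the same TF version? Any ideas on this point?

Thanks!

The problem is in your serving input function. You are using build_parsing_serving_input_receiver_fn, which is the function to use when you want to send tf.Example strings: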

https://www.tensorflow.org/api_docs/python/tf/estimator/export/build_parsing_serving_input_receiver_fn

There are two ways to fix this:

Send a tf.Example:

import tensorflow as tf

example = tf.train.Example(features=tf.train.Features(feature={
    # Keys must match the keys used by the model's feature columns.
    'transaction': tf.train.Feature(bytes_list=tf.train.BytesList(value=[b'rent'])),
    'rentPrice': tf.train.Feature(float_list=tf.train.FloatList(value=[1000.0])),
}))
string_to_send = example.SerializeToString()
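
A serialized tf.Example is binary, so it cannot be pasted into a JSON file directly. A minimal sketch of one way to wrap it, assuming the Cloud ML Engine convention of sending binary data as a JSON object with a single "b64" key, and assuming the input alias is "inputs" as in the first SignatureDef in the question. The helper `b64_instance` is hypothetical, and the bytes below are only a stand-in for the output of `SerializeToString()`.

```python
import base64
import json


def b64_instance(serialized_example, input_name="inputs"):
    """Wrap serialized bytes as {"<input_name>": {"b64": "<base64 text>"}}."""
    encoded = base64.b64encode(serialized_example).decode("ascii")
    return {input_name: {"b64": encoded}}


# Stand-in bytes; in practice this would be example.SerializeToString().
serialized = b"placeholder-for-serialized-tf-example"

# One JSON object per line is what goes into the --json-instances file.
print(json.dumps(b64_instance(serialized)))
```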

Change the serving input function so that you can send plain JSON:

def serving_input_fn():
    feature_placeholders = {
        'transaction': tf.placeholder(tf.string, [None]),
        ...
        'rentPrice': tf.placeholder(tf.float32, [None]),
    }
    features = {
        key: tf.expand_dims(tensor, -1)
        for key, tensor in feature_placeholders.items()
    }
    return tf.estimator.export.ServingInputReceiver(features, feature_placeholders)

export_input_fn = serving_input_fn

Problem solved :)

After a few experiments, I finally found that I had to create the model version with the latest runtime version (1.8):
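
With placeholder-based inputs, each prediction instance is a flat JSON object whose keys match the placeholder names. The gcloud documentation describes the --json-instances file as newline-delimited JSON, so a small sketch for producing it could look like this. `write_json_instances` is a hypothetical helper, and the keys are taken from the question's feature columns.

```python
import json
import os
import tempfile


def write_json_instances(path, instances):
    """Write one compact JSON object per line (newline-delimited JSON)."""
    with open(path, "w") as handle:
        for instance in instances:
            handle.write(json.dumps(instance) + "\n")


instances = [
    {"Transaction": "rent", "Localization": "girona", "Dimensions": 90.0,
     "Rooms": 4, "Toilets": 2, "BuyPrice": 0.0, "RentPrice": 1100.0},
]

# Written to a temporary directory here; any path can be passed to gcloud.
path = os.path.join(tempfile.mkdtemp(), "prediction.json")
write_json_instances(path, instances)
print(path)
```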

gcloud ml-engine versions create v2 --model $MODEL_NAME --origin $MODEL_BINARIES --runtime-version 1.8
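
The "NodeDef mentions attr 'T' not in Op" error is consistent with a graph exported by a newer TensorFlow than the one serving it, which would explain why pinning --runtime-version helped while local prediction worked. A minimal sketch of a pre-deployment check follows. It assumes the training version string is read from tf.__version__; the value "1.8.0" is an assumption, since the post only states the runtime version. `parse_version` and `runtime_too_old` are hypothetical helpers.

```python
def parse_version(text):
    """Turn '1.8.0' or '1.8' into a comparable (major, minor) tuple."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def runtime_too_old(training_tf_version, runtime_version):
    """True if the serving runtime is older than the TF used for export."""
    return parse_version(runtime_version) < parse_version(training_tf_version)


training_tf_version = "1.8.0"  # assumed; in practice, tf.__version__

for runtime in ("1.0", "1.8"):
    print(runtime, runtime_too_old(training_tf_version, runtime))
```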
