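I am using a TFX pipeline to train and evaluate an autoencoder. My data consists of 5 arrays of shape (15, 1), which I concatenate and pass to the model.
To keep track of the training data, I compute the mean of each of these arrays in my ExampleGen component. As a result, my input features contain both feature1 and feature1_mean (and likewise for the other features). After the Transform component, however, I drop the *_mean features from the data.
Now, after training the model, I want to pass it to the Evaluator, and I get this error: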
unable to prepare inputs for evaluation: input_specs={
'feature1': TensorSpec(shape=(None, 15, 1), dtype=tf.float32, name=None),
'feature2': TensorSpec(shape=(None, 15, 1), dtype=tf.float32, name=None),
'feature3': TensorSpec(shape=(None, 15, 1), dtype=tf.float32, name=None),
'feature4': TensorSpec(shape=(None, 15, 1), dtype=tf.float32, name=None)},
features={
'feature1_mean': array([3.4559317, 3.528199 , 3.3727243, 3.0274842, 3.2321723, 3.339905 , 3.3501785, 2.987716 , 3.236495 , 3.5900073, 3.1439974, 3.1659212, ...], dtype=float32),
'feature2_mean': array([1.5840595 , 1.6105878 , 1.5401138 , 1.2408142 , 1.2962327 ,], dtype=float32),
'feature3_mean': array([1.5840595 , 1.6105878 , 1.5401138 , 1.2408142 , 1.2962327 ,....]}
[while running 'ExtractEvaluateAndWriteResults/ExtractAndEvaluate/EvaluateMetricsAndPlots/ComputeMetricsAndPlots()/CombineMetricsPerSlice/WindowIntoDiscarding']
Here is the configuration I use for eval_config:
eval_config = tfma.EvalConfig(
    model_specs=[
        tfma.ModelSpec(
            signature_name='serving_default',
            label_key='feature1_mean',
            preprocessing_function_names=['transform_features'],
        )
    ],
    metrics_specs=[
        tfma.MetricsSpec(
            metrics=[
                tfma.MetricConfig(class_name='ExampleCount'),
            ]
        )
    ],
    slicing_specs=[
        tfma.SlicingSpec()
    ])
I only pass feature1_mean here as a dummy label key. I don't actually have a label, because this is an unsupervised model.
The signatures I save are:
def _get_tf_examples_serving_signature(model, tf_transform_output):
    """Returns a serving signature that accepts `tensorflow.Example`."""
    # We need to track the layers in the model in order to save it.
    # TODO(b/162357359): Revise once the bug is resolved.
    model.tft_layer_inference = tf_transform_output.transform_features_layer()

    @tf.function(input_signature=[
        tf.TensorSpec(shape=[None], dtype=tf.string, name='examples')
    ])
    def serve_tf_examples_fn(serialized_tf_example):
        """Returns the output to be used in the serving signature."""
        raw_feature_spec = tf_transform_output.raw_feature_spec()
        raw_features = tf.io.parse_example(serialized_tf_example, raw_feature_spec)
        # Remove label feature since these will not be present at serving time.
        raw_features.pop('feature1_mean')
        raw_features.pop('feature2_mean')
        raw_features.pop('feature3_mean')
        raw_features.pop('feature4_mean')
        transformed_features = model.tft_layer_inference(raw_features)
        logging.info('serve_transformed_features = %s', transformed_features)
        result = model(transformed_features)
        # TODO(b/154085620): Convert the predicted labels from the model using a
        # reverse-lookup (opposite of transform.py).
        return {'outputs': result}

    return serve_tf_examples_fn


def _get_transform_features_signature(model, tf_transform_output):
    """Returns a serving signature that applies tf.Transform to features."""
    # We need to track the layers in the model in order to save it.
    # TODO(b/162357359): Revise once the bug is resolved.
    model.tft_layer_eval = tf_transform_output.transform_features_layer()

    @tf.function(input_signature=[
        tf.TensorSpec(shape=[None], dtype=tf.string, name='examples')
    ])
    def transform_features_fn(serialized_tf_example):
        """Returns the transformed_features to be fed as input to evaluator."""
        raw_feature_spec = tf_transform_output.raw_feature_spec()
        raw_features = tf.io.parse_example(serialized_tf_example, raw_feature_spec)
        transformed_features = model.tft_layer_eval(raw_features)
        logging.info('eval_transformed_features = %s', transformed_features)
        return transformed_features

    return transform_features_fn
I would really appreciate any help with this problem.
Thanks.
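Regarding the TFMA library and the Evaluator component of TFX, I found that, in general, the output key must be one-dimensional and there must always be a label key. To make this work for an autoencoder, don't change _input_fn; instead, have the Transform component return the input twice under two different keys. For example, if the input key for images is img, return img_input and img_output from the Transform component. That way you don't need to manipulate the Trainer component's input_fn, and in the Evaluator you can simply use the img_output key as your label. However, as mentioned, img_output must be one-dimensional. If your model uses Conv2D layers to encode and decode images, I recommend feeding it 1-D data and adding a Reshape layer at the start of the model so the data is ready for the subsequent Conv2D layers.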
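The "return the input twice" idea could look roughly like the sketch below. This is only an illustration under assumptions: the key img comes from the example above, and the function body is deliberately free of TensorFlow calls so it can be run on plain Python data. In a real Transform module, any scaling and the flattening to 1-D (for instance with tf.reshape) would happen before the return.

```python
def preprocessing_fn(inputs):
    """Sketch of a tf.Transform-style preprocessing_fn for an autoencoder.

    `inputs` maps raw feature names to tensors. The key 'img' is the
    hypothetical input key used in the explanation above.
    """
    img = inputs['img']
    # Assumption: any scaling or flattening to 1-D would be applied to
    # `img` here, before it is returned under both keys.
    return {
        'img_input': img,   # what the model consumes
        'img_output': img,  # same data, exposed as the "label" for the Evaluator
    }


# Plain-Python check of the key layout (no TFX needed for this part).
batch = {'img': [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]}
outputs = preprocessing_fn(batch)
print(sorted(outputs))  # ['img_input', 'img_output']
```

Because both keys point at the same data, the reconstruction target always matches the model input, and nothing in the Trainer's input function has to change.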
Example of a model that takes 1-D input and reshapes it internally:
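import tensorflow as tf
from tensorflow.keras import layers

filter_num, latent_dim = 16, 8  # example values (assumed)
encoder_inputs = tf.keras.Input(shape=(60,), name='input_xf')
x = layers.Reshape((15, 4))(encoder_inputs)
x = layers.Conv1D(filter_num*2, 3, activation="relu",
                  strides=2, padding="valid")(x)
x = layers.Flatten()(x)  # flatten so z has shape (latent_dim,)
z = layers.Dense(latent_dim)(x)
encoder = tf.keras.Model(encoder_inputs, [z], name="encoder")

# The decoder must start from latent_inputs, not from the encoder's x.
latent_inputs = tf.keras.Input(shape=(latent_dim,))
x = layers.Dense(15 * 4)(latent_inputs)  # expand back to 60 values (assumed layer)
x = layers.Reshape((15, 4))(x)
x = layers.Conv1DTranspose(4, 3, padding="same")(x)
decoder_outputs = layers.Reshape((60,))(x)
decoder = tf.keras.Model(latent_inputs, decoder_outputs, name="decoder")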