ValueError: Exception encountered when calling layer "tf.__operators__.getitem_20" (type SlicingOpLambda)



While following the TensorFlow tutorial and trying to recreate the code on my own with a multi-label input feature, I ran into this error. I recreated the example code as shown below:

DataFrame creation:

import pandas as pd
import tensorflow as tf

sample_df = pd.DataFrame({
    "feature_1": [['aa', 'bb', 'cc'], ['cc', 'dd', 'ee'], ['cc', 'aa', 'ee']],
    "feature_2": [['aa', 'bb', 'cc'], ['cc', 'dd', 'ee'], ['cc', 'aa', 'ee']],
})
Output:
feature_1   feature_2
0   [aa, bb, cc]    [aa, bb, cc]
1   [cc, dd, ee]    [cc, dd, ee]
2   [cc, aa, ee]    [cc, aa, ee]
Input layers:

inputs = {}
inputs['feature_1'] = tf.keras.Input(shape=(), name='feature_1', dtype=tf.string)
inputs['feature_2'] = tf.keras.Input(shape=(), name='feature_2', dtype=tf.string)
Output:
{'feature_1': <KerasTensor: shape=(None,) dtype=string (created by layer 'feature_1')>,
'feature_2': <KerasTensor: shape=(None,) dtype=string (created by layer 'feature_2')>}

Preprocessing layers:

preprocessed = []
for name, column in sample_df.items():
    vocab = ['aa', 'bb', 'cc', 'dd', 'ee']
    lookup = tf.keras.layers.StringLookup(vocabulary=vocab, output_mode='multi_hot')
    print(f'name: {name}')
    print(f'vocab: {vocab}\n')
    x = inputs[name][:, tf.newaxis]
    x = lookup(x)
    preprocessed.append(x)
Output:
name: feature_1
vocab: ['aa', 'bb', 'cc', 'dd', 'ee']
name: feature_2
vocab: ['aa', 'bb', 'cc', 'dd', 'ee']
[<KerasTensor: shape=(None, 6) dtype=float32 (created by layer 'string_lookup_27')>,
<KerasTensor: shape=(None, 6) dtype=float32 (created by layer 'string_lookup_28')>]
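As an aside, `StringLookup` with `output_mode='multi_hot'` can also consume a ragged tensor of label lists directly, which matches the multi-label data here. A minimal sketch under that assumption, reusing the same `vocab` as above:

```python
import tensorflow as tf

vocab = ['aa', 'bb', 'cc', 'dd', 'ee']
lookup = tf.keras.layers.StringLookup(vocabulary=vocab, output_mode='multi_hot')

# Variable-length label lists are naturally represented as a RaggedTensor.
ragged = tf.ragged.constant([['aa', 'bb', 'cc'], ['cc', 'dd', 'ee']])

# Each row becomes a (1 + len(vocab))-wide indicator vector; index 0 is the OOV bucket,
# matching the (None, 6) shape in the output above.
encoded = lookup(ragged)
print(encoded.shape)  # (2, 6)
```

The width 6 is the 5 vocabulary entries plus the default out-of-vocabulary bucket.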
Model creation:

preprocessed_result = tf.concat(preprocessed, axis=-1)
preprocessor = tf.keras.Model(inputs, preprocessed_result)
tf.keras.utils.plot_model(preprocessor, rankdir="LR", show_shapes=True)
Output:
<KerasTensor: shape=(None, 12) dtype=float32 (created by layer 'tf.concat_4')>

Error:

preprocessor(dict(sample_df.iloc[:1]))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
.../sample.ipynb Cell 63' in <cell line: 1>()
----> 1 preprocessor(dict(sample_df.iloc[:1]))
File ~/.local/lib/python3.10/site-packages/keras/utils/traceback_utils.py:67, in filter_traceback.<locals>.error_handler(*args, **kwargs)
65 except Exception as e:  # pylint: disable=broad-except
66   filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67   raise e.with_traceback(filtered_tb) from None
68 finally:
69   del filtered_tb
File ~/.local/lib/python3.10/site-packages/tensorflow/python/framework/constant_op.py:102, in convert_to_eager_tensor(value, ctx, dtype)
100     dtype = dtypes.as_dtype(dtype).as_datatype_enum
101 ctx.ensure_initialized()
--> 102 return ops.EagerTensor(value, ctx.device_name, dtype)
ValueError: Exception encountered when calling layer "tf.__operators__.getitem_20" (type SlicingOpLambda).
Failed to convert a NumPy array to a Tensor (Unsupported object type list).
Call arguments received:
• tensor=0    [aa, bb, cc]
Name: feature_2, dtype: object
• slice_spec=({'start': 'None', 'stop': 'None', 'step': 'None'}, 'None')
• var=None
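For what it's worth, the failure appears to happen at the conversion step rather than in the lookup itself: a pandas column whose cells are Python lists is an `object`-dtype array, which `tf.constant` cannot turn into a dense string tensor. A small sketch illustrating that assumption:

```python
import pandas as pd
import tensorflow as tf

series = pd.Series([['aa', 'bb', 'cc'], ['cc', 'dd', 'ee']])
print(series.to_numpy().dtype)  # object: each cell is a Python list

# Dense conversion fails, matching the "Unsupported object type list" error above.
try:
    tf.constant(series)
except ValueError as e:
    print('dense conversion failed:', e)

# A ragged tensor, by contrast, accepts the nested lists directly.
ragged = tf.ragged.constant(series.tolist())
print(ragged.nrows().numpy())  # 2
```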

Any help with the error or further insight into it would be greatly appreciated. Thank you in advance.

I put together a workaround for anyone interested or facing a similar issue. That is to say, this is only a workaround, not a proper solution.

Workaround: since my multi-hot encodings are binary in nature, I simply broke them out into individual features.

Example code:

sample_df = pd.DataFrame({"feature_1": [['aa', 'bb', 'cc'], ['cc', 'dd', 'ee'], ['cc', 'aa', 'ee']]})
feature_1_labels = set()
for i in range(sample_df.shape[0]):
    feature_1_labels.update(sample_df.iloc[i]['feature_1'])
for label in sorted(feature_1_labels):
    sample_df[label] = 0
for i in range(sample_df.shape[0]):
    for label in sample_df.iloc[i]['feature_1']:
        sample_df.iloc[i, sample_df.columns.get_loc(label)] = 1
sample_df
Output:
feature_1   aa  bb  cc  dd  ee
0   [aa, bb, cc]    1   1   1   0   0
1   [cc, dd, ee]    0   0   1   1   1
2   [cc, aa, ee]    1   0   1   0   1
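For what it's worth, pandas can build the same indicator columns in one step via `Series.str.join` plus `Series.str.get_dummies`, which may be less error-prone than the manual loops. A sketch on the same sample data:

```python
import pandas as pd

sample_df = pd.DataFrame({"feature_1": [['aa', 'bb', 'cc'],
                                        ['cc', 'dd', 'ee'],
                                        ['cc', 'aa', 'ee']]})

# Join each list into a '|'-separated string, then expand it into 0/1 columns.
dummies = sample_df['feature_1'].str.join('|').str.get_dummies()
sample_df = pd.concat([sample_df, dummies], axis=1)
print(sample_df)
```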

Note: doing this will significantly increase the number of input features, which is something to keep in mind.

If I am wrong, feel free to point me to a better solution :)
