How to effectively use a tf.data.Dataset composed of OrderedDicts



With TensorFlow 2.3.1, the following snippet fails.

import tensorflow as tf

url = "https://storage.googleapis.com/download.tensorflow.org/data/creditcard.zip"
tf.keras.utils.get_file(
    origin=url,
    fname='creditcard.zip',
    cache_dir="/tmp/datasets/",
    extract=True)

ds = tf.data.experimental.make_csv_dataset(
    "/tmp/datasets/*.csv",
    batch_size=2048,
    label_name="Class",
    select_columns=["V1", "V2", "Class"],
    num_rows_for_inference=None,
    shuffle_buffer_size=600,
    ignore_errors=True)

model = tf.keras.Sequential(
    [
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid", name="labeling"),
    ],
)
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-2),
    loss="binary_crossentropy",
)
model.fit(
    ds,
    steps_per_epoch=5,
    epochs=3,
)

The stack trace is:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-19-c79f80f9d0fd> in <module>
----> 1 model.fit(
2     ds,
3     steps_per_epoch=5,
4     epochs=3,
5 )
[...]
ValueError: Layer sequential expects 1 inputs, but it received 2 input tensors. Inputs received: [<tf.Tensor 'ExpandDims:0' shape=(2048, 1) dtype=float32>, <tf.Tensor 'ExpandDims_1:0' shape=(2048, 1) dtype=float32>]
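
What is happening is that make_csv_dataset yields (features, label) pairs in which features is an OrderedDict holding one batched tensor per selected column, so a Sequential model built for a single input receives two tensors. A quick way to see this structure (assuming the ds built above) is to inspect the dataset's element_spec:

# Each element is a (features, label) pair; features is an OrderedDict
# with one batched tensor per selected column ("V1" and "V2" here),
# which is why the Sequential model reports receiving 2 input tensors.
features_spec, label_spec = ds.element_spec
print(features_spec)
print(label_spec)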

The workaround I am using so far is

def workaround(features, labels):
    return (tf.stack(list(features.values()), axis=1), labels)

model.fit(
    ds.map(workaround),
    steps_per_epoch=5,
    epochs=3,
)
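
For reference, after the map each element carries a single stacked feature tensor instead of the OrderedDict, which is what a single-input Sequential model expects; a quick check (again assuming the ds above):

# After stacking, features become one (batch, num_selected_columns) tensor.
mapped_ds = ds.map(workaround)
print(mapped_ds.element_spec)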

My questions to you, TF gurus:

  • Am I doing this right, or is there a better solution?
  • Performance-wise, is this solution viable for datasets that do not fit in memory?

I am not sure whether your data fits in memory or not.

If it does not, you can change your code like this:

import tensorflow as tf

url = "https://storage.googleapis.com/download.tensorflow.org/data/creditcard.zip"
# The CSV is assumed to be already downloaded and extracted
# (see the tf.keras.utils.get_file call in the question).
ds = tf.data.experimental.make_csv_dataset(
    "/tmp/datasets/*.csv",
    batch_size=2048,
    label_name="Class",
    select_columns=["V1", "V2", "Class"],
    num_rows_for_inference=None,
    ignore_errors=True,
    num_epochs=1,
    shuffle_buffer_size=2048 * 1000,
    prefetch_buffer_size=tf.data.experimental.AUTOTUNE,
)

# One named Input per feature column, so Keras can match the keys of the
# OrderedDict yielded by make_csv_dataset to the model inputs.
input_list = []
for column in ["V1", "V2"]:
    _input = tf.keras.Input(shape=(1,), name=column)
    input_list.append(_input)

concat = tf.keras.layers.Concatenate(name="concat")(input_list)
dense = tf.keras.layers.Dense(256, activation="relu", name="dense", dtype='float64')(concat)
output_dense = tf.keras.layers.Dense(1, activation="sigmoid", name="labeling", dtype='float64')(dense)

model = tf.keras.Model(inputs=input_list, outputs=output_dense)
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-2),
    loss="binary_crossentropy",
)
model.fit(
    ds,
    steps_per_epoch=5,
    epochs=10,
)
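
Regarding the second question: the map-based workaround in the question also streams batches from disk (tf.data never materialises the whole CSV in memory), so it should be viable for datasets that do not fit in memory. A minimal sketch, assuming the ds built above and adding parallel mapping and prefetching (the AUTOTUNE settings are illustrative, not tuned):

import tensorflow as tf

def stack_features(features, labels):
    # Collapse the OrderedDict of per-column tensors into a single
    # (batch, num_features) tensor so a single-input model can consume it.
    return tf.stack(list(features.values()), axis=1), labels

# Single-input model, as in the question.
seq_model = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid", name="labeling"),
])
seq_model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-2),
    loss="binary_crossentropy",
)

train_ds = (
    ds.map(stack_features, num_parallel_calls=tf.data.experimental.AUTOTUNE)
      .prefetch(tf.data.experimental.AUTOTUNE)
)
seq_model.fit(train_ds, steps_per_epoch=5, epochs=3)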
