I am trying to train a model with TensorFlow. I am reading a huge CSV file with tf.data.experimental.make_csv_dataset.
Here is my code:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.layers.experimental import preprocessing
LABEL_COLUMN = 'venda_qtde'
# Read the CSV into a tf.data.Dataset
def get_dataset(file_path, **kwargs):
    dataset = tf.data.experimental.make_csv_dataset(
        file_path,
        batch_size=4096,
        na_value="?",
        label_name=LABEL_COLUMN,
        num_epochs=1,
        ignore_errors=False,
        shuffle=False,
        **kwargs)
    return dataset
# Build the model instance
def build_model():
    model = keras.Sequential([
        layers.Dense(520, activation='relu'),
        layers.Dense(520, activation='relu'),
        layers.Dense(520, activation='relu'),
        layers.Dense(1)
    ])
    model.compile(loss='mean_squared_error',
                  optimizer='adam',
                  metrics=['mae'])
    return model
# Run the functions
ds_treino = get_dataset('data/processed/curva_a/curva_a_train.csv')
nn_model = build_model()
nn_model.fit(ds_treino, epochs=10)
But when fit is called, I get this error:
ValueError: Layer sequential_5 expects 1 inputs, but it received 520 input tensors. Inputs received: ...
My dataset has 519 features and 1 label, and about 17M rows. Can anyone tell me what I am doing wrong?
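make_csv_dataset returns a tf.data.Dataset whose features are a dictionary with one entry per column, so the Sequential model receives one tensor per column instead of the single input tensor it expects.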
train_dataset
<PrefetchDataset shapes: (OrderedDict([... features ... ])
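To see that dictionary structure directly, here is a minimal sketch that writes a tiny CSV to a temporary file and reads it back. The file contents and the column names f1 and f2 are hypothetical; only the label name venda_qtde comes from the question.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical toy CSV: two feature columns plus the label column from the question.
csv_text = "f1,f2,venda_qtde\n1.0,2.0,3\n4.0,5.0,6\n"
csv_path = os.path.join(tempfile.mkdtemp(), "toy.csv")
with open(csv_path, "w") as f:
    f.write(csv_text)

toy_ds = tf.data.experimental.make_csv_dataset(
    csv_path,
    batch_size=2,
    label_name="venda_qtde",
    num_epochs=1,
    shuffle=False)

# Each element is a (features, labels) pair; features is a dict keyed by column name.
features, labels = next(iter(toy_ds))
print(list(features.keys()))
print(labels.shape)
```

Each value in the dictionary is a separate tensor of shape (batch_size,), which is why the model reports receiving many inputs.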
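You need to pack the feature columns into a single tensor and pair it with the labels. You can use:
def features_and_labels(features, labels):
    # Stack the per-column tensors into one (batch, num_features) tensor
    features = tf.stack(list(features.values()), axis=1)
    return features, labels
train_dataset = train_dataset.map(features_and_labels)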
train_dataset
<MapDataset shapes: ((None, 10), (None,)), types: (tf.float32, tf.int32)>
After that, you should be able to pass it to the fit() function.
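One caveat: tf.stack needs every column to have the same dtype, and make_csv_dataset infers dtypes per column, so a CSV that mixes integer and float columns may fail at the stacking step. The sketch below is one possible workaround, assuming all feature columns are numeric; the in-memory columns f1 and f2 and the helper name pack_features are hypothetical stand-ins for the real data.

```python
import tensorflow as tf

# Hypothetical in-memory stand-in for the CSV dataset: a dict of columns plus labels.
columns = {
    "f1": [1.0, 2.0, 3.0, 4.0],  # float column
    "f2": [10, 20, 30, 40],      # integer column, mixed dtype on purpose
}
labels = [1, 2, 3, 4]
dict_ds = tf.data.Dataset.from_tensor_slices((columns, labels)).batch(2)

def pack_features(features, labels):
    # Cast every column to float32 so tf.stack sees a single dtype,
    # then stack along axis 1 to get shape (batch, num_features).
    cols = [tf.cast(col, tf.float32) for col in features.values()]
    return tf.stack(cols, axis=1), labels

packed_ds = dict_ds.map(pack_features)
x_batch, y_batch = next(iter(packed_ds))
print(x_batch.shape, y_batch.shape)

# A deliberately tiny model, just to confirm that fit() accepts the packed dataset.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(loss="mean_squared_error", optimizer="adam")
history = model.fit(packed_ds, epochs=1, verbose=0)
```

Columns that hold strings would need to be encoded or dropped before this step, since they cannot be cast to float32.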