我正在尝试遵循本教程,但与我的数据:https://www.tensorflow.org/tutorials/structured_data/feature_columns
我所有的数据都是数值。
当我运行这部分代码时:
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
history = model.fit(train_ds, validation_data=test_ds, epochs=100, use_multiprocessing=True)
我得到了所有参数的警告类型:
WARNING:tensorflow:Layers in a Sequential model should only have a single input tensor, but we receive a <class 'dict'> input: {'age': <tf.Tensor 'ExpandDims_8:0' shape=(None, 1) dtype=int64>,
对于每个变量,我都得到了两次警告!
然后我得到这个错误:
UnimplementedError: Cast string to float is not supported
[[node sequential_7/dense_features_7/calprotectin/Cast (defined at <ipython-input-103-5689ba5df442>:5) ]] [Op:__inference_train_function_4860]
是什么问题,我怎么解决它?
<标题>Edit1 h1> 试着用样本数据模拟我的代码和错误,我想出了这个代码。代码不会生成错误,但会生成警告。所以问题在于我正在阅读的数据。产生这种错误的输入数据可能出了什么问题?
(这是一个jupyter代码,我怎么能把它贴在这里?):
%reset
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from tensorflow import feature_column
from sklearn.model_selection import train_test_split
RANDOM_SEED = 42
data=pd.DataFrame()
data['sex']=[1,2,2,1,2,2,1,1,2,1]
data['age']=[10,11,13,45,67,34,23,62,82,78]
data['bmi']=[22.5,28.8,19,23.3,26,18.4,27.5,29,30.3,25.9]
data['smoker']=[1,2,2,3,3,2,2,1,1,1]
data['lab1']=[144,124,126,146,130,124,171,147,131,138]
data['lab2']=[71,82,75,65,56,89,55,74,78,69]
data['result']=[1,2,2,4,3,2,1,3,2,4]
feature_columns = []
for header in ['sex','age', 'bmi','smoker', 'lab1', 'lab2']:
feature_columns.append(tf.feature_column.numeric_column(header))
def create_dataset(dataframe, batch_size=32):
dataframe = dataframe.copy()
labels = dataframe.pop('result')
return tf.data.Dataset.from_tensor_slices((dict(dataframe), labels))
.shuffle(buffer_size=len(dataframe))
.batch(batch_size)
train, test = train_test_split(data, test_size=0.2, random_state=RANDOM_SEED)
train_ds = create_dataset(train)
test_ds = create_dataset(test)
model = tf.keras.models.Sequential([
tf.keras.layers.DenseFeatures(feature_columns=feature_columns),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dense(128, activation='relu'),
tf.keras.layers.Dropout(.1),
tf.keras.layers.Dense(1)
])
model.compile(optimizer='adam',
loss='binary_crossentropy',
metrics=['accuracy'])
history = model.fit(train_ds, validation_data=test_ds, epochs=100, use_multiprocessing=True)
当我运行上面的代码时,我得到了这样的警告:
Epoch 1/100
WARNING:tensorflow:Layers in a Sequential model should only have a single input tensor, but we receive a <class 'dict'> input: {'sex': <tf.Tensor 'ExpandDims_4:0' shape=(None, 1) dtype=int64>, 'age': <tf.Tensor 'ExpandDims:0' shape=(None, 1) dtype=int64>, 'bmi': <tf.Tensor 'ExpandDims_1:0' shape=(None, 1) dtype=float64>, 'smoker': <tf.Tensor 'ExpandDims_5:0' shape=(None, 1) dtype=int64>, 'lab1': <tf.Tensor 'ExpandDims_2:0' shape=(None, 1) dtype=int64>, 'lab2': <tf.Tensor 'ExpandDims_3:0' shape=(None, 1) dtype=int64>}
Consider rewriting this model with the Functional API.
WARNING:tensorflow:Layers in a Sequential model should only have a single input tensor, but we receive a <class 'dict'> input: {'sex': <tf.Tensor 'ExpandDims_4:0' shape=(None, 1) dtype=int64>, 'age': <tf.Tensor 'ExpandDims:0' shape=(None, 1) dtype=int64>, 'bmi': <tf.Tensor 'ExpandDims_1:0' shape=(None, 1) dtype=float64>, 'smoker': <tf.Tensor 'ExpandDims_5:0' shape=(None, 1) dtype=int64>, 'lab1': <tf.Tensor 'ExpandDims_2:0' shape=(None, 1) dtype=int64>, 'lab2': <tf.Tensor 'ExpandDims_3:0' shape=(None, 1) dtype=int64>}
Consider rewriting this model with the Functional API.
1/1 [==============================] - ETA: 0s - loss: -22.8739 - accuracy: 0.2500WARNING:tensorflow:Layers in a Sequential model should only have a single input tensor, but we receive a <class 'dict'> input: {'sex': <tf.Tensor 'ExpandDims_4:0' shape=(None, 1) dtype=int64>, 'age': <tf.Tensor 'ExpandDims:0' shape=(None, 1) dtype=int64>, 'bmi': <tf.Tensor 'ExpandDims_1:0' shape=(None, 1) dtype=float64>, 'smoker': <tf.Tensor 'ExpandDims_5:0' shape=(None, 1) dtype=int64>, 'lab1': <tf.Tensor 'ExpandDims_2:0' shape=(None, 1) dtype=int64>, 'lab2': <tf.Tensor 'ExpandDims_3:0' shape=(None, 1) dtype=int64>}
Consider rewriting this model with the Functional API.
当模型拟合完成时,精度为零。我知道数据是无效的,位的精度为零也是不被期望的。
标题>要修复此警告,有两种方法:
1]调用fit方法时,在输入中应用feature层。即:而不是:
model3.fit(x=train_dict, y=train_labels, validation_data=(valid_dict,valid_labels), epochs=epochs_, verbose=verbose_)
使用
model3.fit(x=feature_layer_3(train_dict), y=train_labels, validation_data=(feature_layer_3(valid_dict),valid_labels), epochs=epochs_, verbose=verbose_)
你可以查看这个详细的例子(第三模型)https://www.kaggle.com/abidou/features-bucketing
2]使用前面链接中第6个模型中的Functional API
在训练模型时没有改进的原因是您对多标签使用了BinaryCrossentropy
损失,请在以下两种情况下处理此错误
-
对于二进制分类:
设
data['result']=[1,0,0,1,0,0,1,0,0,1]
为例,使用loss=tf.keras.losses.BinaryCrossentropy(from_logits=True)
-
对于多类分类:
使用
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
,并修改模型的输出层,使其输出形状匹配标签的数量,例如tf.keras.layers.Dense(5)
当你有5个类
从tf.keras.models.Sequential()
的WARNING
是简单地告诉你它期望从层内为了它的正常工作,如果你不使用tf.keras.models.Sequential()
然后WARNING
将消失,例如定义模型使用:
inputs = {}
for header in ['sex','age', 'bmi','smoker', 'lab1', 'lab2']:
inputs[header] = tf.keras.Input(shape=(1,), name=header)
x = tf.keras.layers.DenseFeatures(feature_columns=feature_columns)(inputs)
x = tf.keras.layers.Dense(128, activation='relu')(x)
x = tf.keras.layers.Dense(128, activation='relu')(x)
x = tf.keras.layers.Dropout(.1)(x)
x = tf.keras.layers.Dense(1)(x)
model = tf.keras.models.Model(inputs=inputs, outputs=x)
您获得Cast string to float
错误的原因可能是由于您试图将所有列转换为numeric column
,就像您在发布的示例代码中所做的那样(即也许最好将sex
列转换为categorical columns
)