tf.data.experimental.make_csv_dataset: ValueError



I'm trying to build a dataset to use with Keras on the Titanic example from Kaggle. Here's what I've done so far:

import pandas as pd
import tensorflow as tf

train_data = pd.read_csv("/kaggle/input/titanic/train.csv")
all_columns = ['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp', 'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'] # all the column names present in the CSV
feature_columns = ['Pclass', 'Sex', 'Age', 'SibSp', 'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'] # columns I want to use as features for training

train_data = tf.data.experimental.make_csv_dataset(
"/kaggle/input/titanic/train.csv",
batch_size=12,
column_names=all_columns,
select_columns=feature_columns,
label_name='Survived', # name of the 'label' column
na_value="?",
num_epochs=1,
ignore_errors=False)

But when I run it, I get this error:

495   if label_name is not None and label_name not in column_names:
496     raise ValueError("`label_name` provided must be one of the columns.")
497 
498   def filename_to_dataset(filename):

ValueError: `label_name` provided must be one of the columns.

However, as you can see, the label_name value is 'Survived', and it is present in all_columns (and therefore in column_names).

Any ideas?

Best,

Aymeric

label_name must also be included in select_columns. When select_columns is given, only those columns are read, and the label is looked up among them, not in column_names.

Try:

train_data = tf.data.experimental.make_csv_dataset(
"/kaggle/input/titanic/train.csv",
batch_size=12,
column_names=all_columns,
select_columns=feature_columns + ['Survived'],
label_name='Survived', # name of the 'label' column
na_value="?",
num_epochs=1,
ignore_errors=False)
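The failing check is a simple membership test: the label must be among the columns that are actually read. A minimal sketch in plain Python (check_label is a hypothetical stand-in for the validation inside make_csv_dataset):

```python
def check_label(label_name, selected_columns):
    """Raise, like make_csv_dataset does, when the label is not
    among the columns that will actually be read."""
    if label_name is not None and label_name not in selected_columns:
        raise ValueError("`label_name` provided must be one of the columns.")

feature_columns = ['Pclass', 'Sex', 'Age']

# Fails: 'Survived' is only in all_columns, not in select_columns.
try:
    check_label('Survived', feature_columns)
except ValueError as e:
    print(e)

# Works: the label is appended to the selected columns.
check_label('Survived', feature_columns + ['Survived'])
```

Alternatively, if you want every feature column anyway, you can omit select_columns entirely; the label is then found in column_names directly.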
