Why do we include the target class in both the arrays in train_test_split?

Updated: 2023-09-16

X_train, test_df, y_train, y_test = train_test_split(result, y_true, stratify = y_true, test_size = 0.2)

In the train_test_split call above, result is a DataFrame and y_true is a NumPy array built from the DataFrame's target class column.

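My question is: if we already pass "y_true" separately, why do we also pass the whole "result" DataFrame as one of the inputs to train_test_split? In other words, shouldn't we first exclude the target class column from the "result" DataFrame?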

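Scikit-learn supports pandas, but pandas is not required. With NumPy arrays it does not always make sense to keep features and labels in the same array, which is why train_test_split is designed the way it is: it simply splits every array it is given along the same rows and does not know which columns are features and which are labels. It is therefore up to you to make sure that your result DataFrame and its splits have the format you need. If y_true is part of the result DataFrame, you can (and should) exclude it, either before or after the function call.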
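As a minimal sketch of the two options, assuming a small made-up DataFrame whose label column is named "target" (the data, the column names, and random_state=0 are illustrative assumptions, not taken from the question):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical toy data; the column name "target" is an assumption.
result = pd.DataFrame({
    "f1": np.arange(10, dtype=float),
    "f2": np.arange(10, 20, dtype=float),
    "target": [0, 1] * 5,
})
y_true = result["target"].values

# Option 1: drop the target column before splitting.
X = result.drop(columns=["target"])
X_train, X_test, y_train, y_test = train_test_split(
    X, y_true, stratify=y_true, test_size=0.2, random_state=0
)

# Option 2: split the full DataFrame, then drop the target from each part.
train_df, test_df, y_train2, y_test2 = train_test_split(
    result, y_true, stratify=y_true, test_size=0.2, random_state=0
)
X_train2 = train_df.drop(columns=["target"])
X_test2 = test_df.drop(columns=["target"])
```

With the same random_state and the same stratify array, both options should select the same rows, so the choice is mostly a matter of convenience. What matters is that the target column is gone before the features reach a model; otherwise the model is trained on the label itself.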
