Unexpected IndexError when creating a DataFrame

I am trying to run the following code:

heart_df = pd.read_csv(r"location")
X = heart_df.iloc[:, :-1].values
y = heart_df.iloc[:, 11].values
new_df = X[["Sex", "ChestPainType", "RestingECG", "ExerciseAngina", "ST_Slope"]].values() #this is line 17
cat_cols = new_df.copy()

and I get an IndexError like this:

  File "***location***", line 17, in <module>
  new_df = X[["Sex", "ChestPainType", "RestingECG", "ExerciseAngina", "ST_Slope"]].values()
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

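As far as I know, an IndexError occurs when a float is used as an index, but I don't understand why it is raised here.

By creating new_df and cat_cols, I want to separate out the categorical columns so that I can apply OneHotEncoding to them at a later stage.

The dataset is here: https://www.kaggle.com/fedesoriano/heart-failure-prediction.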

The error comes from this line:

X = heart_df.iloc[:, :-1].values

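The .values part converts the DataFrame into a NumPy array. A NumPy array has no column labels, so it cannot be indexed with a list of column names such as ["Sex", "ChestPainType", ...]; it only accepts integers, slices, and integer or boolean arrays, which is exactly what the IndexError on line 17 says. Note also that .values is an attribute, not a method, so the trailing () on line 17 has to go as well.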
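To make the difference concrete, here is a minimal sketch that reproduces the problem on a tiny stand-in frame. The name toy_df and its two rows are made up for illustration and are not taken from the Kaggle dataset; only the pattern (.values followed by label-based indexing) mirrors the question.

```python
import pandas as pd

# Hypothetical two-row stand-in for heart_df (values are invented).
toy_df = pd.DataFrame({
    "Age": [40, 49],
    "Sex": ["M", "F"],
    "HeartDisease": [0, 1],
})

as_array = toy_df.iloc[:, :-1].values  # NumPy array: column labels are gone
as_frame = toy_df.iloc[:, :-1]         # still a DataFrame: labels are kept

# Label-based indexing fails on the array...
try:
    as_array[["Sex"]]
    error_name = None
except IndexError as exc:
    error_name = type(exc).__name__

# ...but works on the DataFrame.
selected = as_frame[["Sex"]]

print(error_name)
print(list(selected.columns))
```

The exact wording of the IndexError message may differ between NumPy versions, so the sketch only checks the exception type.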

So we can write the same thing while keeping X as a DataFrame:

X = heart_df.iloc[:, :-1]
new_df = X[["Sex", "ChestPainType", "RestingECG", "ExerciseAngina", "ST_Slope"]]
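Since the stated goal is to one-hot encode the categorical columns later, the following sketch shows one possible way to continue once X is kept as a DataFrame. It uses pandas.get_dummies as a stand-in for whatever encoder is eventually chosen (the question mentions OneHotEncoding but does not show that step). The rows in toy_df are invented, and the names cat_names, encoded and X_encoded are hypothetical.

```python
import pandas as pd

cat_names = ["Sex", "ChestPainType", "RestingECG", "ExerciseAngina", "ST_Slope"]

# Invented rows that only mimic the shape of the dataset; not real data.
toy_df = pd.DataFrame({
    "Age": [40, 49, 37],
    "Sex": ["M", "F", "M"],
    "ChestPainType": ["ATA", "NAP", "ATA"],
    "RestingECG": ["Normal", "Normal", "ST"],
    "ExerciseAngina": ["N", "N", "Y"],
    "ST_Slope": ["Up", "Flat", "Up"],
    "HeartDisease": [0, 1, 0],
})

X = toy_df.iloc[:, :-1]   # features, kept as a DataFrame
y = toy_df.iloc[:, -1]    # target column

# Select the categorical columns by name; this works because X has labels.
cat_cols = X[cat_names].copy()

# One indicator column per category value (assumption: get_dummies is
# used here instead of scikit-learn's OneHotEncoder).
encoded = pd.get_dummies(cat_cols)

# Put the numeric columns and the indicator columns back together.
X_encoded = pd.concat([X.drop(columns=cat_names), encoded], axis=1)

print(X_encoded.shape)
```

Converting to a NumPy array with .values (without parentheses) can then be deferred until after the encoding step, when column names are no longer needed.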
