I am trying to run the following code:
heart_df = pd.read_csv(r"location")
X = heart_df.iloc[:, :-1].values
y = heart_df.iloc[:, 11].values
new_df = X[["Sex", "ChestPainType", "RestingECG", "ExerciseAngina", "ST_Slope"]].values() #this is line 17
cat_cols = new_df.copy()
and I get an IndexError like this:
File "***location***", line 17, in <module>
new_df = X[["Sex", "ChestPainType", "RestingECG", "ExerciseAngina", "ST_Slope"]].values()
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
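As far as I know, an IndexError is raised when a float is used as an index, but I don't understand why it happens here.
By creating new_df and cat_cols, I want to separate out the categorical columns so that I can apply OneHotEncoding to them at a later stage.
The dataset is here: https://www.kaggle.com/fedesoriano/heart-failure-prediction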
The error comes from:
X = heart_df.iloc[:, :-1].values
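The .values part converts the DataFrame into a NumPy array. A NumPy array has no column labels, so it cannot be indexed with a list of column names; NumPy accepts only integers, slices, and integer or boolean arrays, which is exactly what the error message says. Note also that .values is an attribute, not a method, so the trailing .values() on line 17 would fail even on a DataFrame.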
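A minimal sketch of the difference, assuming a made-up three-row frame (toy_df and its values are hypothetical stand-ins, not data from the Kaggle file):

```python
import numpy as np
import pandas as pd

# Hypothetical miniature stand-in for the heart dataset (values are made up).
toy_df = pd.DataFrame({
    "Age": [40, 49, 37],
    "Sex": ["M", "F", "M"],
    "ST_Slope": ["Up", "Flat", "Up"],
    "HeartDisease": [0, 1, 0],
})

X_array = toy_df.iloc[:, :-1].values  # NumPy array: column labels are gone
X_frame = toy_df.iloc[:, :-1]         # DataFrame: column labels are kept

# Indexing the array with column names fails; record which error is raised.
try:
    X_array[["Sex", "ST_Slope"]]
    error_name = None
except IndexError as exc:
    error_name = type(exc).__name__

# The same selection works on the DataFrame.
selected = X_frame[["Sex", "ST_Slope"]]

print(type(X_array).__name__, error_name)
print(list(selected.columns))
```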
So we can write the same thing without .values, keeping X as a DataFrame:
X = heart_df.iloc[:, :-1]
new_df = X[["Sex", "ChestPainType", "RestingECG", "ExerciseAngina", "ST_Slope"]]
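Once the categorical columns are selected from a DataFrame, the later encoding step could look roughly like the sketch below. It uses pandas' pd.get_dummies as a stand-in for one-hot encoding, which may not be the encoder the question has in mind, and toy_df, its values, and the shortened cat_cols list are assumptions for illustration only:

```python
import pandas as pd

# Hypothetical miniature stand-in for the heart dataset (values are made up).
toy_df = pd.DataFrame({
    "Age": [40, 49, 37],
    "Sex": ["M", "F", "M"],
    "ST_Slope": ["Up", "Flat", "Up"],
    "HeartDisease": [0, 1, 0],
})

# Keep X as a DataFrame so columns can still be selected by name.
X = toy_df.iloc[:, :-1]

# Shortened list of categorical columns for this toy example.
cat_cols = ["Sex", "ST_Slope"]
new_df = X[cat_cols]

# One-hot encode the categorical columns (pd.get_dummies used as a stand-in).
encoded = pd.get_dummies(new_df, columns=cat_cols)

# Recombine the untouched numeric columns with the encoded ones.
X_encoded = pd.concat([X.drop(columns=cat_cols), encoded], axis=1)

print(list(X_encoded.columns))
```

Calling .values on X_encoded at this point, after all label-based selection is done, gives the NumPy array that most estimators expect.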