TypeError: unsupported operand type(s) for /: 'str' and 'int' --- for col in X_train.columns:



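I keep hitting this error in my code. It is raised in the marked lines below (the first for col in X_train.columns: loop), and I can't figure out a fix. The script is about 250 lines long, so I obviously can't post all of it here, which may make it harder for anyone to help. Thanks in advance. I'm still a beginner at coding, by the way.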


loans = pd.read_csv(r'C:UserstrainDownloadsaccepted-200000-new.csv',low_memory=True)
.....(more preprocessing code)

loans_train = loans.loc[loans['issue_d'] <  loans['issue_d'].quantile(0.9)]
loans_test =  loans.loc[loans['issue_d'] >= loans['issue_d'].quantile(0.9)]
loans_test.shape[0] / loans.shape[0]
loans_train.drop('issue_d', axis=1, inplace=True)
loans_test.drop('issue_d', axis=1, inplace=True)
y_train = loans_train['charged_off']
y_test = loans_test['charged_off']
X_train = loans_train.drop('charged_off', axis=1)
X_test = loans_test.drop('charged_off', axis=1)
del loans_train, loans_test
linear_dep = pd.DataFrame()
# The TypeError is raised in this loop (the lines flagged in the question).
for col in X_train.columns:
    linear_dep.loc[col, 'pearson_corr'] = X_train[col].corr(y_train)
linear_dep['abs_pearson_corr'] = abs(linear_dep['pearson_corr'])
from sklearn.feature_selection import f_classif
for col in X_train.columns:
    mask = X_train[col].notnull()
    (linear_dep.loc[col, 'F'], linear_dep.loc[col, 'p_value']) = f_classif(pd.DataFrame(X_train.loc[mask, col]), y_train.loc[mask])
linear_dep.sort_values('abs_pearson_corr', ascending=False, inplace=True)
linear_dep.drop('abs_pearson_corr', axis=1, inplace=True)
linear_dep.reset_index(inplace=True)
linear_dep.rename(columns={'index':'variable'}, inplace=True)

Why not check X_train.dtypes? It shows the exact data type of every column.
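
A minimal sketch of that check, assuming a small made-up frame in place of the real X_train (the column names and values are illustrative only and are not taken from the question's CSV). It prints the dtypes and lists the non-numeric columns, which are the likely source of the 'str' operand in the error:

```python
import pandas as pd

# Hypothetical stand-in for X_train; the real columns come from the CSV.
X_train = pd.DataFrame({
    "loan_amnt": [1000, 2500, 4000],
    "int_rate": [10.5, 13.2, 7.9],
    "grade": ["A", "C", "B"],  # stored as strings
})

# One dtype per column; string columns do not show up as int or float.
print(X_train.dtypes)

# Names of the columns that are not numeric and would break .corr().
non_numeric_cols = X_train.select_dtypes(exclude="number").columns.tolist()
print(non_numeric_cols)
```

Any column that shows up in non_numeric_cols needs to be converted or dropped before the correlation loop runs.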

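To be safe, you could just do X_train[col].astype(int).corr(y_train). But since you run it over every column, make sure to wrap it in a try, because you may have pure-string columns, which should be left out anyway!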
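
A sketch of that suggestion, again on made-up data (the column names, the values, and the corrs and skipped helpers are assumptions for illustration). It deviates from the one-liner in two ways: it uses astype(float) instead of astype(int) so that decimal strings and NaN values survive the conversion, and it collects the results in a dict before building linear_dep:

```python
import pandas as pd

# Hypothetical stand-ins for X_train and y_train.
X_train = pd.DataFrame({
    "loan_amnt": [1000, 2500, 4000, 1500],
    "term": ["36", "60", "36", "60"],  # numeric values stored as strings
    "grade": ["A", "C", "B", "D"],     # pure strings, cannot be converted
})
y_train = pd.Series([0, 1, 0, 1], name="charged_off")

corrs = {}    # column name -> Pearson correlation with y_train
skipped = []  # columns that could not be converted to numbers

for col in X_train.columns:
    try:
        numeric_col = X_train[col].astype(float)
    except (ValueError, TypeError):
        # Pure-string column: leave it out of the correlation table.
        skipped.append(col)
        continue
    corrs[col] = numeric_col.corr(y_train)

linear_dep = pd.Series(corrs, name="pearson_corr").to_frame()
linear_dep["abs_pearson_corr"] = linear_dep["pearson_corr"].abs()

print(linear_dep)
print("skipped:", skipped)
```

Keeping the skipped list makes it easy to see which columns still need encoding (for example one-hot encoding) if they should take part in the analysis.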
