I am trying to run the following code:
perm = PermutationImportance(clf).fit(X_test, y_test)
eli5.show_weights(perm)
to find out which features are most important in the model, but the output is
<IPython.core.display.HTML object>
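Is there a solution or workaround for this problem?
Thanks for your suggestions!
(Spyder maintainer here) There is currently (February 2019) no way to display web content in our consoles, sorry.
Note: We are thinking about how to make this possible, but it most likely won't be available until 2023.
A kludge that just displays the HTML: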
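import webbrowser

# htmlobj is the object returned by eli5.show_weights(perm)
with open(r'C:\Temp\disppage.htm', 'wb') as f:  # Use some reasonable temp name
    f.write(htmlobj.data.encode("UTF-8"))  # IPython's HTML object keeps its markup in .data
# Open the HTML file on my own (Windows) computer
url = r'C:\Temp\disppage.htm'
webbrowser.open(url, new=2)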
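The same kludge can be wrapped in a small helper so the file location does not have to be hard-coded. This is only a sketch: `save_html` and its `open_in_browser` flag are hypothetical names introduced here, not part of eli5 or Spyder, and the HTML string passed in at the end is a placeholder for the markup that `eli5.show_weights(...)` would return in `.data`.

```python
import tempfile
import webbrowser
from pathlib import Path


def save_html(html_text, path=None, open_in_browser=False):
    """Write html_text to a UTF-8 file and optionally open it in the browser."""
    if path is None:
        # Reserve a uniquely named file in the system temp directory.
        with tempfile.NamedTemporaryFile(suffix=".html", delete=False) as tmp:
            path = tmp.name
    path = Path(path)
    path.write_text(html_text, encoding="utf-8")
    if open_in_browser:
        # as_uri() builds a file:// URL that works on Windows, macOS and Linux.
        webbrowser.open(path.resolve().as_uri(), new=2)
    return path


# Hypothetical usage with an eli5 result:
# save_html(eli5.show_weights(perm).data, open_in_browser=True)
report = save_html("<p>feature weights go here</p>")
print(report)
```

Returning the path makes it easy to log where the report was written, and keeping the browser call optional lets the helper run on machines without a display.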
Thanks for the idea, J Hudok. Here is my working example:
from sklearn.datasets import load_iris
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
import eli5
from eli5.sklearn import PermutationImportance
from sklearn.model_selection import train_test_split
import webbrowser
# Load iris data & convert to dataframe
iris_data = load_iris()
data = pd.DataFrame({
    'sepal length': iris_data.data[:, 0],
    'sepal width': iris_data.data[:, 1],
    'petal length': iris_data.data[:, 2],
    'petal width': iris_data.data[:, 3],
    'species': iris_data.target
})
X = data[['sepal length', 'sepal width', 'petal length', 'petal width']]
y = data['species']
# Split train & test dataset
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
# Initialize classifier
clf = RandomForestClassifier(n_estimators=56, max_depth=8, random_state=1, verbose=1)
clf.fit(X_train, y_train)
# Compute permutation feature importance
perm_importance = PermutationImportance(clf, random_state=0).fit(X_test, y_test)
# Store feature weights in an object
html_obj = eli5.show_weights(perm_importance, feature_names = X_test.columns.tolist())
# Write html object to a file (adjust file path; Windows path is used here)
with open(r'C:\Tmp\Desktopiris-importance.htm', 'wb') as f:
    f.write(html_obj.data.encode("UTF-8"))
# Open the stored HTML file on the default browser
url = r'C:\Tmp\Desktopiris-importance.htm'
webbrowser.open(url, new=2)
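If opening a browser is not an option, the importances can also be printed as plain text, which any console can show. The sketch below does not use eli5 at all: it swaps in scikit-learn's own `sklearn.inspection.permutation_importance`, so the numbers will not match the eli5 output exactly. The `n_repeats=5` setting and the fixed `random_state` values are assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=0
)

clf = RandomForestClassifier(n_estimators=56, max_depth=8, random_state=1)
clf.fit(X_train, y_train)

# Shuffle each feature column in turn and measure the drop in test accuracy.
result = permutation_importance(clf, X_test, y_test, n_repeats=5, random_state=0)

# Sort features from most to least important by mean score drop.
ranking = sorted(
    zip(iris.feature_names, result.importances_mean, result.importances_std),
    key=lambda row: row[1],
    reverse=True,
)
for name, mean, std in ranking:
    print(f"{name:<20} {mean:.4f} +/- {std:.4f}")
```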
I found a solution for Spyder:
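import numpy as np
import eli5

# clf is a Pipeline with 'preprocessor', 'feature_selection' and 'classification' steps
clf.fit(X_train, y_train)
# Names of the one-hot encoded columns (on scikit-learn >= 1.2, use get_feature_names_out instead)
onehot_columns = list(clf.named_steps['preprocessor'].named_transformers_['cat'].named_steps['onehot'].get_feature_names(input_features=categorical_features))
# All feature names: numeric columns followed by the one-hot columns
numeric_features_list = list(numeric_features)
numeric_features_list.extend(onehot_columns)
numeric_features_list = np.array(numeric_features_list)
# Keep only the features retained by the feature-selection step
selected_features_bool = list(clf.named_steps['feature_selection'].get_support(indices=False))
numeric_features_list = list(numeric_features_list[selected_features_bool])
# Return the weights as a pandas DataFrame instead of HTML
eli5.format_as_dataframe(eli5.explain_weights(clf.named_steps['classification'], top=50, feature_names=numeric_features_list))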
So it gives me the output as a DataFrame:
0 region_BAKI 0.064145
1 call_out_offnet_dist_w1 0.025365
2 trf_Bolge 0.022637
3 call_in_offnet_dist_w1 0.018974
4 device_os_name_Proprietary 0.018608
...
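The step that is easiest to get wrong in the snippet above is lining up feature names with the boolean mask from the selection step. A minimal sketch of just that step follows. The feature names and the mask are made up for illustration and stand in for `numeric_features`, `onehot_columns` and the result of `get_support(indices=False)`. It also assumes the preprocessor emits numeric columns first and one-hot columns second, as the snippet does.

```python
import numpy as np

# Hypothetical names, standing in for numeric_features and onehot_columns.
numeric_features = ["age", "income"]
onehot_columns = ["city_A", "city_B", "city_C"]

# Assumed column order: numeric columns first, then the one-hot columns.
all_names = np.array(numeric_features + onehot_columns)

# Hypothetical mask, standing in for get_support(indices=False).
selected = np.array([True, False, True, False, True])

# Boolean indexing keeps only the names whose mask entry is True.
kept_names = all_names[selected].tolist()
print(kept_names)  # ['age', 'city_A', 'city_C']
```

The mask must have exactly one entry per column produced by the preprocessor. If the lengths differ, NumPy raises an `IndexError`, which is a useful early signal that the name list and the transformed data are out of sync.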