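In general, we can use pickle to save a single classifier model. Is there a way to save multiple classifier models in one pickle file? If so, how do we save the models and retrieve them later?
For example (a minimal working example):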
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from numpy.random import rand, randint
models = []
models.append(('LogisticReg', LogisticRegression(random_state=123)))
models.append(('DecisionTree', DecisionTreeClassifier(random_state=123)))
# evaluate each model in turn
results_all = []
names = []
dict_method_score = {}
scoring = 'f1'
X = rand(8, 4)
Y = randint(2, size=8)
print("Method: Average (Standard Deviation)\n")
for name, model in models:
    # shuffle=True is needed because recent scikit-learn versions reject random_state without it
    kfold = model_selection.KFold(n_splits=2, shuffle=True, random_state=999)
    cv_results = model_selection.cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
    results_all.append(cv_results)
    names.append(name)
    dict_method_score[name] = (cv_results.mean(), cv_results.std())
    print("{:s}: {:.3f} ({:.3f})".format(name, cv_results.mean(), cv_results.std()))
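Goal: keep the same setup, change some hyperparameters (for example, n_splits in the cross-validation), and retrieve the models later.
You can save multiple objects to the same pickle file: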
import pickle

with open("models.pckl", "wb") as f:
    # Each dump call appends one pickled (name, model) tuple to the file
    for model in models:
        pickle.dump(model, f)
You can then load the models back into memory one at a time:
models = []
with open("models.pckl", "rb") as f:
    while True:
        try:
            models.append(pickle.load(f))
        except EOFError:
            break
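As an alternative to repeated dump calls, the whole collection can be written with a single pickle.dump, which also makes it easy to store the settings (such as n_splits) next to the models. The following is only a minimal sketch: the helper names save_bundle and load_bundle, the file name models_bundle.pckl, and the stand-in objects are assumptions made for illustration and are not part of the question or of any library.

```python
import os
import pickle
import tempfile


def save_bundle(path, models, settings):
    """Write all models plus the settings used into one pickle file (hypothetical helper)."""
    bundle = {"models": dict(models), "settings": dict(settings)}
    with open(path, "wb") as f:
        pickle.dump(bundle, f)


def load_bundle(path):
    """Read the bundle back with a single load call. Only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in objects for illustration; in practice these would be the
# (name, estimator) tuples from the `models` list in the question.
demo_models = [("LogisticReg", {"C": 1.0}), ("DecisionTree", {"max_depth": 3})]
demo_settings = {"n_splits": 2, "scoring": "f1"}

path = os.path.join(tempfile.mkdtemp(), "models_bundle.pckl")
save_bundle(path, demo_models, demo_settings)

restored = load_bundle(path)
print(sorted(restored["models"]))
print(restored["settings"]["n_splits"])
```

With this layout a model is looked up by name, for example restored["models"]["DecisionTree"], and no EOFError loop is needed when reading. Note that cross_val_score fits clones of each estimator, so the objects in models remain unfitted unless fit is called on them before saving.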