I'm trying to save my scikit-learn logistic regression model as PMML, but I get a RuntimeError.
My code:
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline
from sklearn.linear_model import LogisticRegression
pipe_pmml = PMMLPipeline(steps=[
    ('mapper', mapper),
    ('estimator', LogisticRegression(C=0.01,
                                     penalty='l1',
                                     solver='liblinear',
                                     random_state=1))
])
pipe_pmml.fit(X_small, y)
sklearn2pmml(pipe_pmml, pmml_filename, with_repr=True)
The error:
Standard output is empty
Standard error:
Exception in thread "main" net.razorvine.pickle.InvalidOpcodeException: invalid pickle opcode: 0
at net.razorvine.pickle.Unpickler.dispatch(Unpickler.java:366)
at org.jpmml.python.CustomUnpickler.dispatch(CustomUnpickler.java:31)
at org.jpmml.python.PickleUtil$1.dispatch(PickleUtil.java:64)
at net.razorvine.pickle.Unpickler.load(Unpickler.java:109)
at org.jpmml.python.PickleUtil.unpickle(PickleUtil.java:85)
at com.sklearn2pmml.Main.run(Main.java:78)
at com.sklearn2pmml.Main.main(Main.java:6
Here mapper is a DataFrameMapper from sklearn_pandas.
Does anyone know what is causing this?
- sklearn==0.0
- scikit-learn==1.1.2
- sklearn-pandas==2.2.0
- sklearn2pmml==0.86.3
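Solution: downgrade joblib to 1.1.0.
See: https://github.com/jpmml/jpmml-python/issues/19
Joblib 1.2.0 generates pickle-like files that contain extra padding for array memory alignment: joblib/joblib#563
This extra padding causes readers of the standard pickle data format to fail, which is why the Java side reports "invalid pickle opcode: 0".
SkLearn2PMML version 0.87.0 (and newer) should be able to handle both standard (Python pickle, Joblib 1.1.0) and non-standard (Joblib 1.2.0) pickle files.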
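To check whether an environment has the problematic combination, a rough sketch like the following may help. It reads the installed versions with the standard library's importlib.metadata and applies the rule described above. The helper names (parse_version, is_affected, installed_version) are made up for this illustration, and the thresholds are an assumption: the answer only names Joblib 1.2.0 and SkLearn2PMML 0.87.0, so treating every Joblib release from 1.2.0 onward as affected is an extrapolation.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '1.2.0' into (1, 2, 0); suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad to three components so that '1.2' compares equal to '1.2.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_affected(joblib_version, sklearn2pmml_version):
    """Assumption: Joblib >= 1.2.0 writes padded files and
    SkLearn2PMML >= 0.87.0 is able to read them."""
    writes_padding = parse_version(joblib_version) >= (1, 2, 0)
    reads_padding = parse_version(sklearn2pmml_version) >= (0, 87, 0)
    return writes_padding and not reads_padding


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    joblib_v = installed_version("joblib")
    s2p_v = installed_version("sklearn2pmml")
    if joblib_v is None or s2p_v is None:
        print("joblib and/or sklearn2pmml is not installed")
    elif is_affected(joblib_v, s2p_v):
        print(f"joblib {joblib_v} with sklearn2pmml {s2p_v}: likely affected")
    else:
        print(f"joblib {joblib_v} with sklearn2pmml {s2p_v}: should be fine")
```

If the check reports "likely affected", either of the two routes from the answer applies: pin joblib to 1.1.0, or upgrade sklearn2pmml to 0.87.0 or newer.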