pyweka warning on a model



My question is why I am getting this warning:

java.beans.IntrospectionException: Method not found: isNumToSelect
at java.desktop/java.beans.PropertyDescriptor.<init>(PropertyDescriptor.java:110)
at java.desktop/java.beans.PropertyDescriptor.<init>(PropertyDescriptor.java:74)
at weka.core.PropertyPath.find(PropertyPath.java:386)
at weka.core.SetupGenerator.setup(SetupGenerator.java:499)
at weka.classifiers.meta.multisearch.DefaultEvaluationTask.doRun(DefaultEvaluationTask.java:83)
at weka.classifiers.meta.multisearch.AbstractEvaluationTask.call(AbstractEvaluationTask.java:113)
at weka.classifiers.meta.multisearch.AbstractEvaluationTask.call(AbstractEvaluationTask.java:34)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)

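I don't know how to resolve it. The model runs correctly, but the warning is printed many times during the long wait before the result comes back.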


EDIT:

This is my code:

base_model_3 = Classifier(classname="weka.classifiers.trees.ADTree", 
options=["-B", "10", "-E", "-3", "-S", "1"])

CostS_cls_model_3 = SingleClassifierEnhancer(classname="weka.classifiers.meta.CostSensitiveClassifier", 
options =["-cost-matrix", "[0.0 2.0; 1.0 0.0]", "-S", "1"])
CostS_cls_model_3.classifier = base_model_3

ROS = Filter(classname="weka.filters.supervised.instance.Resample", options = ["-B","1","-Z","165"])
fc_model_3_ROS = FilteredClassifier(options=["-S","1"])
fc_model_3_ROS.filter = ROS
fc_model_3_ROS.classifier = CostS_cls_model_3

bagging_cls_model_3 = SingleClassifierEnhancer(classname="weka.classifiers.meta.Bagging",
options=["-P", "100", "-S", "1", "-num-slots", "1", "-I", "100"])
bagging_cls_model_3.classifier = fc_model_3_ROS

AttS_cls_model_3 = AttributeSelectedClassifier()
AttS_cls_model_3.search = from_commandline('weka.attributeSelection.Ranker -T -1.7976931348623157E308 -N 61', classname=get_classname(ASSearch))
AttS_cls_model_3.evaluator = from_commandline('weka.attributeSelection.InfoGainAttributeEval', classname=get_classname(ASEvaluation))
AttS_cls_model_3.classifier = bagging_cls_model_3

multisearch_cls_model_3 = MultiSearch(options = ["-S", "1","-class-label","1"])
multisearch_cls_model_3.evaluation = "FM"
multisearch_cls_model_3.search = ["-sample-size", "100", "-initial-folds", "2", "-subsequent-folds", "10",
"-initial-test-set", ".", "-subsequent-test-set", ".", "-num-slots", "1"]                        
mparam_model_3 = MathParameter()
mparam_model_3.prop = "numToSelect"
mparam_model_3.minimum = 5.0
mparam_model_3.maximum = 134.0
mparam_model_3.step = 1.0
mparam_model_3.base = 10.0
mparam_model_3.expression = "I"
multisearch_cls_model_3.parameters = [mparam_model_3]
multisearch_cls_model_3.classifier = AttS_cls_model_3

MissingValues = Filter(classname="weka.filters.unsupervised.attribute.ReplaceMissingValues")
fc_model_3_MV = FilteredClassifier(options=["-S","1"])
fc_model_3_MV.filter = MissingValues
fc_model_3_MV.classifier = multisearch_cls_model_3

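Maybe I can't use "numToSelect"? Is there a list of properties that MultiSearch supports?

I have another question: with sklearn-weka-plugin, is there any way to use RandomizedSearchCV or GridSearch (from sklearn) to find a good combination of parameters for a Bagging model that uses ADTrees as the base estimator?

Something like this: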

Base_CostS= WekaEstimator(classifier = base_model_1, classname="weka.classifiers.meta.CostSensitiveClassifier", 
options =["-cost-matrix", "[0.0 1.0; 1.0 0.0]", "-S", "1", "-W", "weka.classifiers.trees.ADTree"],
nominal_input_vars=[2,3,4], # which attributes need to be treated as nominal
nominal_output_var=True)    # class is nominal as well
bagging_model = BaggingClassifier(base_estimator = Base_CostS, n_estimators = 100, n_jobs = None, random_state = 1)
param_distributions_BG = {
'n_estimators': [10, 50, 75, 100],
'max_samples'   : [0.2, 0.5, 1.0],
'bootstrap'   : [True, False],
'base__iterations' : [10,15,20],
'base__Expand_Nodes' : ["-3", "-2", "-1", "1"]

}
# Cross-validated search
# ==============================================================================
grid_r = RandomizedSearchCV(
estimator  = bagging_model,
param_distributions = param_distributions_BG,
n_iter     = 50,
scoring = {'Precision':'precision_macro',
'Recall':'recall_macro',
'F1_Score':'f1_macro'},
cv         = RepeatedKFold(n_splits = 5, n_repeats = 5), 
verbose    = 0,
random_state = 1,
return_train_score = True,
refit = refit_aux
)

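I don't know whether it can be done this way or whether I have to do something different. I would also like to look at "feature_importances_", but I think the bagging model does not provide this kind of feature importance. Would SHAP be the way to analyze it?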

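MultiSearch uses the property path that you define (the .prop attribute of the parameter object) to locate, within the nested Java objects, the object that the parameter should be applied to. It neither needs nor has a predefined list of properties that can be optimized. Depending on how you nest your classifiers, filters and attribute selection, you have to adjust this path.

In your setup, you have the following nesting: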

MultiSearch
|
+- AttributeSelectedClassifier
   |
   +- Ranker
   |
   +- InfoGainAttributeEval

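Any property path is applied to the classifier that you specify for MultiSearch. If you use numToSelect, MultiSearch looks for this Java property in your AttributeSelectedClassifier. It cannot find it there, because it is a property of the Ranker object; that is why the exception reports a missing isNumToSelect accessor. The Ranker object itself is reachable through the search property of the AttributeSelectedClassifier. In other words, you need to use search.numToSelect as your property path.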
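To make the lookup concrete, here is a minimal pure-Python sketch of how a dotted property path is resolved against nested objects. It is only an illustration: the three classes are hypothetical stand-ins named after the Weka classes in the tree above, and set_by_path and try_path are made-up helpers, not Weka's PropertyPath implementation.

```python
class Ranker:
    """Stand-in for the Ranker search; it owns numToSelect."""
    def __init__(self):
        self.numToSelect = -1


class InfoGainAttributeEval:
    """Stand-in for the evaluator; it has no numToSelect property."""


class AttributeSelectedClassifier:
    """Stand-in for the classifier handed to MultiSearch."""
    def __init__(self):
        self.search = Ranker()
        self.evaluator = InfoGainAttributeEval()


def set_by_path(root, path, value):
    """Walk a dotted path starting at root and set the final property."""
    *parents, leaf = path.split(".")
    target = root
    for name in parents:
        target = getattr(target, name)  # AttributeError if a step is missing
    if not hasattr(target, leaf):
        raise AttributeError(
            f"Property not found: {leaf} on {type(target).__name__}"
        )
    setattr(target, leaf, value)


def try_path(path, value):
    """Apply the path to a fresh classifier and describe what happened."""
    clf = AttributeSelectedClassifier()
    try:
        set_by_path(clf, path, value)
    except AttributeError as exc:
        return f"failed: {exc}"
    return f"ok: search.numToSelect = {clf.search.numToSelect}"


if __name__ == "__main__":
    print(try_path("numToSelect", 5))         # path stops at the classifier
    print(try_path("search.numToSelect", 5))  # path reaches the Ranker
```

Applied to the code in the question, this corresponds to changing the line that sets the parameter's property to mparam_model_3.prop = "search.numToSelect".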
