sktime: "Mismatch in number of cases" error even though X and y have the same number of cases



I am learning classification in sktime. First I split my data:

from sklearn.model_selection import train_test_split
X = AUDCHF_h1_model[['Open','High','Low','Close','Volume','VWMA',
                     'Minute','Hour','Day','Week','Month','Year']].values
y = AUDCHF_h1_model[['is_beg_leg']].values
X_train,X_test,y_train,y_test = train_test_split(
    X, y, test_size=0.2)
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)

(53250, 12) (53250, 1) (13313, 12) (13313, 1)

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sktime.classification.compose import ColumnEnsembleClassifier
from sktime.classification.dictionary_based import BOSSEnsemble
from sktime.classification.interval_based import TimeSeriesForestClassifier
#from sktime.classification.shapelet_based import MrSEQLClassifier
from sktime.datasets import load_basic_motions
from sktime.transformations.panel.compose import ColumnConcatenator
steps = [
    ("concatenate", ColumnConcatenator()),
    ("classify", TimeSeriesForestClassifier(n_estimators=100)),
]
clf = Pipeline(steps)
clf.fit(X_train, y_train)
clf.score(X_test, y_test)

I get:

ValueError: Mismatch in number of cases. Number in X = 639000 nos in y = 53250

even though X_train.shape is (53250, 12) and y_train.shape is (53250, 1).

Does anyone know why?

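From the information you have provided I cannot say for certain, but I suspect the problem is the ColumnConcatenator in your pipeline. It stacks all the columns of X to create a new univariate time series with 53250 * 12 = 639000 rows. That concatenated series is then passed to TimeSeriesForestClassifier, and it no longer has the same shape as the original input, so it does not line up with the 53250 values in y. Depending on your use case, you can either remove the "concatenate" step or supply target values for the newly created univariate series.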
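To make the arithmetic concrete, here is a minimal pure-Python sketch of what I suspect is happening. It is not sktime's implementation: `stack_columns`, `X_small` and `y_small` are hypothetical names and toy values made up for illustration, and the sketch assumes the concatenation step simply places the columns end to end.

```python
def stack_columns(table):
    """Stack the columns of a row-major table end to end into one long list.

    Illustrative stand-in for the suspected behaviour, not sktime code.
    """
    n_cols = len(table[0])
    return [row[c] for c in range(n_cols) for row in table]


# Toy stand-in for X_train: 5 rows, 3 feature columns (hypothetical values).
X_small = [[r * 10 + c for c in range(3)] for r in range(5)]
y_small = [0, 1, 0, 1, 0]

stacked = stack_columns(X_small)

# 5 rows * 3 columns gives 15 entries, but there are still only 5 labels.
print(len(stacked), len(y_small))

# The same arithmetic with the shapes from the question.
n_rows, n_cols = 53250, 12
stacked_len = n_rows * n_cols
print(stacked_len, n_rows)
```

With the toy data the stacked series has 15 entries against 5 labels, which is the same kind of mismatch as 639000 against 53250 in the error message.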
