Scikit-learn: indices are out-of-bounds



I'm new to scikit-learn, and I get an "indices are out-of-bounds" error when I try to fit a learner on a sample of the training set.

This is where the error occurs:

def train_predict(learner, sample_size, X_train, y_train, X_test, y_test): 

    results = {}
    start = time() # Get start time
    learner.fit(X_train[sample_size],y_train[sample_size])
    end = time() # Get end time
    results['train_time'] = end-start
    start = time() # Get start time
    predictions_test = learner.predict(X_test)
    predictions_train = learner.predict(X_train.head(300))
    end = time() # Get end time
    results['pred_time'] = end-start
    results['acc_train'] = accuracy_score(y_train.head(300),predictions_train)
    results['acc_test'] = accuracy_score(y_test,predictions_test)
    results['f_train'] = f_score(y_train.head(300),predictions_train)
    results['f_test'] = f_score(y_test,predictions_test)
    print "{} trained on {} samples.".format(learner.__class__.__name__, sample_size)
    return results

This is the main code:

    clf_A = GaussianNB()
    clf_B = tree.DecisionTreeClassifier()
    clf_C = SVC()
    samples_1 = random.sample(X_train.index,len(X_train)/100)
    samples_10 = random.sample(X_train.index,len(X_train)/10)
    samples_100 = X_train.index
    results = {}
    for clf in [clf_A, clf_B, clf_C]:
        clf_name = clf.__class__.__name__
        results[clf_name] = {}
        for i, samples in enumerate([samples_1, samples_10, samples_100]):
            results[clf_name][i] = train_predict(clf, samples, X_train, y_train, X_test, y_test)

    vs.evaluate(results, accuracy, fscore)

The error is on this line:

---> 21     learner.fit(X_train[sample_size],y_train[sample_size])

It says:

IndexError: indices are out-of-bounds

Your error depends entirely on what X_train and y_train look like.

A common case that may fit your situation: if these are pandas DataFrame objects, the fix may be as simple as adding .as_matrix():

learner.fit(X_train.as_matrix()[sample_size],y_train.as_matrix()[sample_size])
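
To illustrate why the container type matters, here is a minimal sketch with made-up data, assuming X_train is a pandas DataFrame and y_train a Series whose index labels are not 0..n-1 (common after a train/test split). On a DataFrame, X_train[some_list] looks up columns rather than rows, so row selection needs .loc (by label) or positional indexing on the underlying array. The sketch uses .to_numpy(), which replaced as_matrix() in newer pandas releases; the index values and column names are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for a training split: 5 rows whose index labels
# come from a larger original dataset, so they are not 0..4.
X_train = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0, 4.0, 5.0], "b": [10.0, 20.0, 30.0, 40.0, 50.0]},
    index=[12, 7, 30, 3, 21],
)
y_train = pd.Series([0, 1, 0, 1, 1], index=X_train.index)

labels = [7, 21]  # a sample drawn from X_train.index, as in the question

# Label-based selection: .loc looks rows up by index label.
X_by_label = X_train.loc[labels]
y_by_label = y_train.loc[labels]

# Position-based selection: translate labels to row positions first,
# then index the underlying NumPy arrays.
positions = X_train.index.get_indexer(labels)
X_by_pos = X_train.to_numpy()[positions]
y_by_pos = y_train.to_numpy()[positions]

# Using the labels directly as positions fails here, because 7 and 21
# are not valid row positions in a 5-row array.
try:
    X_train.to_numpy()[labels]
    raised = False
except IndexError:
    raised = True
```

If the sampled values are index labels, as in the question's main code, the .loc form is probably the more direct fit.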

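Another thing you can check is that X_train[sample_size] and y_train[sample_size] return the same number of rows. Note that this is not the same as the following evaluating to True, because X_train[sample_size] can have more columns than y_train[sample_size]: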

len(X_train[sample_size]) == len(y_train[sample_size])

Adding details to the question about how X_train and y_train are built, or about their types and shapes, would get you a more specific answer.
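
A quick way to gather that information is to print the type and shape of both objects. This is only a sketch on hypothetical data; describe is a helper invented here, not part of pandas or scikit-learn.

```python
import pandas as pd

# Hypothetical data standing in for the question's X_train / y_train.
X_train = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
y_train = pd.Series([0, 1, 0])


def describe(name, obj):
    """Return a one-line summary of an object's type and shape."""
    return "{}: type={}, shape={}".format(name, type(obj).__name__, obj.shape)


print(describe("X_train", X_train))
print(describe("y_train", y_train))

# Row counts must agree for fit(); column counts need not.
same_rows = X_train.shape[0] == y_train.shape[0]
same_shape = X_train.shape == y_train.shape
```

Here the row counts match while the full shapes differ, which is the normal situation for a feature matrix and a one-dimensional target.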

Try the following:

learner.fit(X_train[0:sample_size],y_train[0:sample_size])
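
A sketch of this slicing approach, assuming sample_size is an integer row count (the question's main code passes a list of index labels instead, so the caller would need to change too) and that the data are pandas objects. The data below are made up, and .iloc is used to make the positional slice explicit.

```python
import pandas as pd

# Hypothetical training data; index labels deliberately not 0..n-1.
X_train = pd.DataFrame({"a": list(range(10))}, index=list(range(100, 110)))
y_train = pd.Series([0, 1] * 5, index=X_train.index)

sample_size = 4  # an integer row count, not a list of labels

# .iloc slices by position, so this works regardless of the index labels.
X_sub = X_train.iloc[:sample_size]
y_sub = y_train.iloc[:sample_size]
```

Note that this always takes the first sample_size rows, so it is not a random sample unless the training data were shuffled beforehand.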
