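I am currently training my regression network with cross-validation. I don't have class labels as such, only specific inputs that should map to specific outputs, and the network should learn that mapping. There seems to be a problem with the way I define the folds.
This is how I do the cross-validation: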
############################### Training setup ##################################
#Define 10 folds:
seed = 7
np.random.seed(seed)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
print "Splits"
cvscores_loss = []
for train, test in kfold.split(train_set_data_vstacked_normalized,train_set_output_vstacked):
    print "Model definition!"
    model = Sequential()
    #act = PReLU(init='normal', weights=None)
    model.add(Dense(output_dim=400,input_dim=400, init="normal",activation=K.tanh))
    #act1 = PReLU(init='normal', weights=None)
    model.add(Dense(output_dim=400,input_dim=400, init="normal",activation=K.tanh))
    #act2 = PReLU(init='normal', weights=None)
    model.add(Dense(output_dim=400, input_dim=400, init="normal",activation=K.tanh))
    act4=ELU(10000)
    model.add(Dense(output_dim=13, input_dim=300, init="normal",activation=act4))
    print "Compiling"
    model.compile(loss='mean_squared_error', optimizer='RMSprop', metrics=["accuracy"])
    print "Compile done! "
    print '\n'
    print "Train start"
    model.fit(train_set_data_vstacked_normalized[train],train_set_output_vstacked[train], nb_epoch=10, verbose=1)
    loss, accuracy = model.evaluate(x=train_set_data_vstacked_normalized[test],y=train_set_output_vstacked[test],verbose=1)
    print
    print('loss: ', loss)
    print('accuracy: ', accuracy)
    print()
    print model.summary()
    print "New Model:"
    cvscores_loss.append(loss)
print("%.2f%% (+/- %.2f%%)" % (numpy.mean(cvscores_loss), numpy.std(cvscores_loss)))
The problem with this code is that I never enter the for loop. After "Splits" is printed, I get a warning message:
Splits
/home/k/.local/lib/python2.7/site-packages/sklearn/model_selection/_split.py:579: Warning: The least populated class in y has only 1 members, which is too few. The minimum number of groups for any class cannot be less than n_splits=10.
This makes me question kfold: how does it know the input and output dimensions of my neural network?
Should I define them somewhere? If so, how?
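The message tells you what the problem is. One of your target classes has only 1 member. When the data is split into 10 folds, every class needs at least 10 members so that 1 can be placed in each fold.
You need to check the counts of your target classes to find the offending class and remove it.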
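A quick way to do that check is to count how often each label occurs and list the ones with fewer members than n_splits. This is only a sketch: the helper name rare_classes and the example labels are made up for illustration, and only the standard library is used.

```python
from collections import Counter


def rare_classes(labels, n_splits):
    """Return {label: count} for every label with fewer than n_splits members."""
    counts = Counter(labels)
    return {label: n for label, n in counts.items() if n < n_splits}


# Hypothetical labels for illustration; substitute your own target vector.
example_labels = ["a"] * 12 + ["b"] * 10 + ["c"] * 1

too_small = rare_classes(example_labels, n_splits=10)
print(too_small)  # {'c': 1}
```

If each target is a vector rather than a single value, convert every row to a tuple first (for example `[tuple(row) for row in y]`) so it can be counted.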
I think you are overcomplicating this. If you need cross-validation on a Keras model, you can use the Keras scikit-learn API. To do that, you need to:
Import a few things:
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import cross_val_score
Create a function that defines the model:
def model_creation():
    model = Sequential()
    model.add(...)
    ...
    model.compile(...)
    return model
And use the wrapper:
model = KerasClassifier(build_fn=model_creation, nb_epoch=100, batch_size=100, verbose=0)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
results = cross_val_score(model, X, y, cv=kfold)
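One thing to keep in mind: the question describes a regression task with continuous targets, and StratifiedKFold treats every distinct target value as its own class, so stratifying will probably trigger the same warning. A plain KFold does not look at the targets when splitting. Below is a minimal sketch of that variant. The data is synthetic (only the 400-input / 13-output shape follows the question), and scikit-learn's Ridge is a stand-in for the Keras model so that the snippet runs without Keras. It is not the setup from either post.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

# Synthetic data, assumed shapes: 400 input features, 13 continuous outputs.
rng = np.random.RandomState(7)
X = rng.rand(200, 400)
y = rng.rand(200, 13)

# KFold splits on sample indices only, so continuous targets are fine.
kfold = KFold(n_splits=10, shuffle=True, random_state=7)

# Ridge is a stand-in estimator; any scikit-learn compatible regressor works here.
scores = cross_val_score(Ridge(alpha=1.0), X, y, cv=kfold,
                         scoring="neg_mean_squared_error")

# The scorer returns negated MSE (higher is better), so flip the sign.
mse_per_fold = -scores
print("MSE: %.4f (+/- %.4f)" % (mse_per_fold.mean(), mse_per_fold.std()))
```

With a Keras model, the same pattern should apply by passing a regressor-style wrapper instead of Ridge. Accuracy is not a meaningful metric for regression, so a loss such as mean squared error is the number to compare across folds.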