I'm building a general-purpose audio tagging system with Keras.
I have the following input data: x_train holds 10 features per sample (data_leng, max, min, etc.), and y_train is a one-hot vector over 41 possible labels (guitar, bass, etc.).
x_train shape = (7104, 10)
y_train shape = (41,)
print(x_train[0])
[ 3.75732000e+05 -2.23437546e-05 -1.17187500e-02 1.30615234e-02
2.65964586e-03 2.65973969e-03 9.80024859e-02 1.13624850e+00
1.00003528e+00 -1.11458333e+00]
print(y_train[0])
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
My model is:
import numpy as np
from keras.models import Sequential
from keras.optimizers import SGD
from keras.layers import Dense, Dropout, Activation
model = Sequential()
model.add(Dense(units=128, activation='relu', input_dim=10))
model.add(Dropout(0.5))
model.add(Dense(units=64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=32, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(41, activation='softmax'))
opt = SGD(lr=0.0001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
model.fit(np.array(x_train), np.array(y_train), epochs=5, batch_size=8)
These are my results:
Epoch 1/5
7104/7104 [==============================] - 1s 179us/step - loss: 15.7392 - acc: 0.0235
Epoch 2/5
7104/7104 [==============================] - 1s 132us/step - loss: 15.7369 - acc: 0.0236
Epoch 3/5
7104/7104 [==============================] - 1s 133us/step - loss: 15.7415 - acc: 0.0234
Epoch 4/5
7104/7104 [==============================] - 1s 132us/step - loss: 15.7262 - acc: 0.0242
Epoch 5/5
7104/7104 [==============================] - 1s 132us/step - loss: 15.6484 - acc: 0.0291
As you can see, my results show very high loss and very low accuracy, but the main problem is that when I try to predict, the output is identical for every input. How can I fix this?
pre = model.predict(np.array(x_train), batch_size=8, verbose=0)
for i in pre:
print(i)
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
...
In the Dense layers you only need to specify input_dim for the first layer;
Keras infers the dimensions of the other layers automatically.
So try:
model = Sequential()
model.add(Dense(units=128, activation='relu', input_dim=10))
model.add(Dropout(0.5))
model.add(Dense(units=64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=32, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(41, activation='softmax'))
Also, maybe your regularization is too strong; for this kind of data, try less dropout, or none at all.
The last thing you can do is raise the learning rate: start with something like 1e-3 and see whether anything changes.
Hope this helps.
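A minimal sketch of those two suggestions combined (lighter dropout, 10x higher learning rate). Note this is written against a recent Keras API, so an `Input` layer replaces `input_dim=10` and the SGD argument is spelled `learning_rate` instead of `lr`; the exact dropout rate of 0.2 is just an example value to start from.

```python
import numpy as np
from keras.models import Sequential
from keras.optimizers import SGD
from keras.layers import Input, Dense, Dropout

# Same network as in the question, but with lighter dropout and a higher LR.
model = Sequential()
model.add(Input(shape=(10,)))           # 10 input features per sample
model.add(Dense(units=128, activation='relu'))
model.add(Dropout(0.2))                 # reduced from 0.5; try removing it entirely
model.add(Dense(units=64, activation='relu'))
model.add(Dense(units=32, activation='relu'))
model.add(Dense(41, activation='softmax'))

# learning rate raised from 1e-4 to 1e-3
opt = SGD(learning_rate=1e-3, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=opt,
              metrics=['accuracy'])
```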
You can also try other optimizers, and try changing the activation of the last layer. I ran into the same problem: I was using a softmax activation in the last Dense layer, changed it to sigmoid, and it worked well.
Another good strategy is to modify the model's architecture: add more layers, change the dropout values, etc.
Hope this helps. Good luck!
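A sketch of what that answer suggests, with two assumptions on my part: Adam as the alternative optimizer, and binary_crossentropy as the loss, since sigmoid outputs are independent per label and that is the loss usually paired with a sigmoid output layer (the answer itself only mentions swapping the activation). It also uses the recent Keras API (`Input` layer instead of `input_dim`).

```python
import numpy as np
from keras.models import Sequential
from keras.optimizers import Adam
from keras.layers import Input, Dense, Dropout

# Same architecture as the question, with a sigmoid output instead of
# softmax and Adam instead of SGD.
model = Sequential()
model.add(Input(shape=(10,)))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dense(41, activation='sigmoid'))   # sigmoid instead of softmax
model.compile(loss='binary_crossentropy', optimizer=Adam(),
              metrics=['accuracy'])
```

With sigmoid, each of the 41 outputs is an independent probability in [0, 1]; the rows no longer sum to 1 as they do with softmax.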