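I'm currently working on a neural network that should predict both the next activity and the outcome of a trace (a sequence of events taken from an event log). First, I encode each unique activity as an integer, so each trace becomes an array of integers. Then I extend each trace with one additional event whose activity is the outcome of the trace (for example, if trace t has the label outcome1, that label becomes the activity of the newly appended last event). Finally, each trace is split into windows of a fixed size, which are the inputs of the neural network. For example, take the trace [a b b c ... x x y z o] (where o is the outcome), encoded as [1 2 2 3 ... 24 24 25 26 27] (where 27 is the encoding of the outcome), with a window size of 4. The resulting windows are [0 0 0 a] [0 0 a b] [0 a b b] [a b b c] ... [x x y z], encoded as [0 0 0 1] [0 0 1 2] [0 1 2 2] [1 2 2 3] ... [24 24 25 26]. As you can see, the outcome is not included in the windows, because it has to be predicted.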
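The target data are:
- for next-activity prediction, an array containing the next activity of each window (following the example above, it is encoded as [2 2 3 ... 27]);
- for outcome prediction, an array containing the outcome of each window (in our case an array of o, i.e. 27; since multiple traces are considered, there will be windows with different outcomes).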
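The windowing and target construction described above can be sketched as follows. This is a minimal sketch, not the code actually used in the question: the function name make_windows is hypothetical, and it assumes the trace is already integer-encoded, with the outcome code as its last element and 0 reserved for padding.

```python
def make_windows(trace, size):
    """Split an encoded trace (outcome code last) into left-padded windows.

    Hypothetical helper: returns the windows, the next activity for each
    window, and the trace outcome repeated once per window.
    """
    *activities, outcome = trace
    windows, next_acts, outcomes = [], [], []
    for end in range(1, len(activities) + 1):
        # Last `size` activities seen so far, left-padded with zeros.
        prefix = activities[max(0, end - size):end]
        windows.append([0] * (size - len(prefix)) + prefix)
        # Element right after the prefix; for the final window this is
        # the outcome event appended to the trace.
        next_acts.append(trace[end])
        outcomes.append(outcome)
    return windows, next_acts, outcomes


# Shortened version of the example trace: [a b b c o] -> [1 2 2 3 27]
windows, next_acts, outcomes = make_windows([1, 2, 2, 3, 27], size=4)
print(windows)
print(next_acts)
print(outcomes)
```

Note that the outcome code never appears inside a window; it only shows up as the last next-activity target and as the outcome target.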
Here are some examples of this data. ACTIVITY MAP (the first two activities are the outcomes):
{'regular': 1, 'deviant': 2, 'Round Grinding - Machine 3': 3, 'Round Grinding - Machine 2': 4, 'Grinding Rework - Machine 27': 5, 'Lapping - Machine 1': 6, 'other': 7, 'Turning & Milling Q.C.': 8, 'Laser Marking - Machine 7': 9, 'Round Grinding - Q.C.': 10, 'Turning & Milling - Machine 4': 11, 'Final Inspection Q.C.': 12, 'Packing': 13, 'Turning & Milling - Machine 8': 14, 'Flat Grinding - Machine 11': 15, 'Round Grinding - Manual': 16, 'Wire Cut - Machine 13': 17, 'Turning & Milling - Machine 9': 18, 'Milling - Machine 16': 19, 'Turning - Machine 8': 20, 'Turning Q.C.': 21, 'Turning & Milling - Machine 5': 22, 'Turning & Milling - Machine 10': 23, 'Turning & Milling - Machine 6': 24, 'Round Grinding - Machine 12': 25, 'Turning - Machine 9': 26, 'Milling - Machine 14': 27, 'Turn & Mill. & Screw Assem - Machine 10': 28}
INPUT DATA (variable x_training): encoded windows of traces (only 3 traces are shown here)
[array([0., 0., 0., 3.]), array([0., 0., 3., 4.]), array([0., 3., 4., 5.]), array([3., 4., 5., 5.]), array([4., 5., 5., 6.]), array([5., 5., 6., 6.]), array([5., 6., 6., 5.]), array([6., 6., 5., 7.]), array([6., 5., 7., 7.]), array([5., 7., 7., 7.]), array([7., 7., 7., 7.]), array([7., 7., 7., 7.]), array([7., 7., 7., 5.]), array([7., 7., 5., 5.]), array([7., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 5.]), array([5., 5., 5., 8.]), array([5., 5., 8., 6.]), array([5., 8., 6., 9.]), array([8., 6., 9., 3.]), array([ 6., 9., 3., 10.]), array([ 0., 0., 0., 11.]), array([ 0., 0., 11., 11.]), array([ 0., 11., 11., 11.]), array([11., 11., 11., 11.]), array([11., 11., 11., 11.]), array([11., 11., 11., 11.]), array([11., 11., 11., 8.]), array([11., 11., 8., 11.]), array([11., 8., 11., 11.]), array([ 8., 11., 11., 8.]), array([11., 11., 8., 11.]), array([11., 8., 11., 11.]), array([ 8., 11., 11., 11.]), array([11., 11., 11., 8.]), array([11., 11., 8., 11.]), array([0., 0., 0., 6.]), array([0., 0., 6., 3.]), array([0., 6., 3., 3.]), array([6., 3., 3., 3.]), array([3., 3., 3., 3.]), array([3., 3., 3., 3.]), array([3., 3., 3., 3.]), array([3., 3., 3., 3.]), array([3., 3., 3., 3.]), array([3., 3., 3., 3.]), array([3., 3., 3., 6.]), array([3., 3., 6., 3.]), array([3., 6., 3., 3.]), array([6., 3., 3., 3.]), array([3., 3., 3., 3.]), array([3., 3., 3., 3.]), array([ 3., 3., 3., 12.]), array([ 3., 3., 12., 12.]), array([ 3., 12., 12., 12.]), array([12., 12., 12., 12.]), array([12., 12., 12., 13.]), array([12., 12., 13., 12.]), array([12., 13., 12., 12.]), array([13., 12., 12., 3.]), array([12., 12., 3., 12.]), array([12., 3., 12., 12.]), array([ 3., 12., 12., 12.]), 
array([12., 12., 12., 12.]), ... (and so on) ]
TARGET DATA (NEXT ACTIVITY) (variable y_training): read each integer here as its value + 1, because I used the label encoder's fit_transform.
[ 3 4 4 5 5 4 6 6 6 6 6 4 4 4 4 4 4 4 4 4 4 4 4 4
4 4 4 4 7 5 8 2 9 0 10 10 10 10 10 7 10 10 7 10 10 10 7 10
1 2 2 2 2 2 2 2 2 2 5 2 2 2 2 2 11 11 11 11 12 11 11 2
11 11 11 11 1 ... (and so on)]
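TARGET DATA (OUTCOMES) (variable z_training): for this dataset the outcomes are binary, but that is not always the case. Here too, read each outcome as its value + 1, because I used the label encoder's fit_transform.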
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 ... (and so on)]
Note also that I then make both y_training and z_training categorical.
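To make these two preprocessing steps concrete, here is a small NumPy-only sketch of what label encoding followed by one-hot ("categorical") encoding does to the targets. The sample codes are made up, np.unique(..., return_inverse=True) stands in for LabelEncoder.fit_transform, and indexing np.eye stands in for to_categorical; it only illustrates the resulting values and shapes.

```python
import numpy as np

# Made-up next-activity codes taken from an activity map that starts at 1.
raw_next_acts = np.array([3, 4, 4, 2, 1])

# Sorted unique values are relabelled 0..n-1, which is why each encoded
# integer reads as "original code minus 1" when every code 1..n occurs.
classes, labels = np.unique(raw_next_acts, return_inverse=True)

# One row per sample, one column per class, with a single 1 in each row.
one_hot = np.eye(len(classes), dtype="float32")[labels]

print(labels)
print(one_hot.shape)
```

The number of columns in the one-hot matrix is the number that the corresponding softmax output layer has to produce.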
Here is the neural network I built:
self.x_training = np.asarray(self.x_training)
outsize_act = len(np.unique(self.y_training))
outsize_out= len(np.unique(self.z_training))
self.y_training = to_categorical(self.y_training)
self.z_training = to_categorical(self.z_training)
unique_events = len(self.act_dictionary)
X_train, X_val, Y_train, Y_val, Z_train, Z_val = train_test_split(self.x_training, self.y_training, self.z_training, test_size=0.2,random_state=42, shuffle=True)
size_act = (unique_events + 1) // 2
input_act = Input(shape=(self.example_size,), dtype='int32', name='input_act')
x_act = Embedding(output_dim=size_act, input_dim=unique_events + 1, input_length=self.example_size)(input_act)
l1 = LSTM(16, return_sequences=True, kernel_initializer='glorot_uniform')(x_act)
b1 = BatchNormalization()(l1)
l2_1 = LSTM(16, return_sequences=False, kernel_initializer='glorot_uniform')(b1) # the layer specialized in activity prediction
b2_1 = BatchNormalization()(l2_1)
l2_2 = LSTM(16, return_sequences=False, kernel_initializer='glorot_uniform')(b1) #the layer specialized in outcome prediction
b2_2 = BatchNormalization()(l2_2)
output_l = Dense(outsize_act, activation='softmax', name='act_output')(b2_1)
output_o = Dense(outsize_out, activation='softmax', name='outcome_output')(b2_2)
model = Model(inputs=input_act, outputs=[output_l, output_o])
print(model.summary())
opt = Adam()
model.compile(loss={'act_output':'categorical_crossentropy', 'outcome_output':'categorical_crossentropy'}, optimizer=opt, metrics=['accuracy'])
early_stopping = EarlyStopping(monitor='val_loss', patience=42)
model_checkpoint = ModelCheckpoint('output_files/models/model_{epoch:02d}-{val_loss:.2f}.h5', monitor='val_loss', verbose=0, save_best_only=True,save_weights_only=False, mode='auto')
lr_reducer = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=10, verbose=0, mode='auto', min_delta=0.0001, cooldown=0, min_lr=0)
model.fit(X_train, {'act_output':Y_train, 'outcome_output':Z_train}, epochs=200, batch_size=128, verbose=2, callbacks=[early_stopping, model_checkpoint, lr_reducer], validation_data=(X_val, Y_val,Z_val))
model.save("model/generate_" + self.log_name + ".h5")
Here is the error I get:
Epoch 1/200
Traceback (most recent call last):
  File "C:\Users\...\manager.py", line 244, in build_neural_network_model
    model.fit(X_train, {'act_output':Y_train, 'outcome_output':Z_train}, epochs=200, batch_size=128, verbose=2, callbacks=[early_stopping, model_checkpoint, lr_reducer],
  File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\tensorflow\python\framework\func_graph.py", line 1147, in autograph_handler
    raise e.ag_error_metadata.to_exception(e)
ValueError: in user code:
  File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\engine\training.py", line 1525, in test_function *
    return step_function(self, iterator)
  File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\engine\training.py", line 1514, in step_function **
    outputs = model.distribute_strategy.run(run_step, args=(data,))
  File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\engine\training.py", line 1507, in run_step **
    outputs = model.test_step(data)
  File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\engine\training.py", line 1473, in test_step
    self.compute_loss(x, y, y_pred, sample_weight)
  File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\engine\training.py", line 918, in compute_loss
    return self.compiled_loss(
  File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\engine\compile_utils.py", line 201, in __call__
    loss_value = loss_obj(y_t, y_p, sample_weight=sw)
  File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\losses.py", line 142, in __call__
    return losses_utils.compute_weighted_loss(
  File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\utils\losses_utils.py", line 321, in compute_weighted_loss
    losses, _, sample_weight = squeeze_or_expand_dimensions( # pylint: disable=unbalanced-tuple-unpacking
  File "C:\Users\...\AppData\Roaming\Python\Python39\site-packages\keras\utils\losses_utils.py", line 211, in squeeze_or_expand_dimensions
    sample_weight = tf.squeeze(sample_weight, [-1])
ValueError: Can not squeeze dim[1], expected a dimension of 1, got 2 for '{{node categorical_crossentropy/weighted_loss/Squeeze}} = Squeeze[T=DT_FLOAT, squeeze_dims=[-1]](IteratorGetNext:2)' with input shapes: [?,2].
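So I'm asking for your help: I have already googled this and found many similar threads, but none of the solutions worked. I guess that is because I'm working on a two-output neural network. This is the first time I'm working with neural networks, so maybe there is some obvious mistake I'm not seeing. Thank you for your help.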
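@Stefano, glad to know you fixed the error, and thanks for sharing.
Adding Stefano's (the user's) comment to the answer section for the benefit of the community:
Solved. The problem was in the validation_data parameter of model.fit(), which should have been validation_data = (X_val, [Y_val, Z_val]).
Happy coding!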
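As for why the three-element tuple fails: Keras reads a tuple passed as validation_data as (x, y) or (x, y, sample_weight), so in (X_val, Y_val, Z_val) the outcome targets Z_val end up in the sample-weight slot, which matches the [?,2] shape in the squeeze error. The sketch below only mimics that unpacking convention with a hypothetical helper, interpret_validation_data; it is not Keras code, and the array shapes are made-up stand-ins.

```python
import numpy as np


def interpret_validation_data(validation_data):
    """Hypothetical helper mimicking the (x, y[, sample_weight]) convention."""
    if len(validation_data) == 2:
        x, y = validation_data
        return {"x": x, "y": y, "sample_weight": None}
    x, y, sample_weight = validation_data
    return {"x": x, "y": y, "sample_weight": sample_weight}


# Made-up stand-ins: 5 windows of size 4, 28 activity classes, 2 outcomes.
X_val = np.zeros((5, 4))
Y_val = np.zeros((5, 28))
Z_val = np.zeros((5, 2))

# Original call: Z_val is taken as per-sample weights with shape (5, 2).
wrong = interpret_validation_data((X_val, Y_val, Z_val))
print(wrong["sample_weight"].shape)

# Fixed call: both targets travel together in the y slot, no weights.
right = interpret_validation_data((X_val, [Y_val, Z_val]))
print(len(right["y"]), right["sample_weight"])
```

In the actual call this corresponds to validation_data=(X_val, [Y_val, Z_val]), as in the fix above. Since the training targets are passed as a dict keyed by output name, validation_data=(X_val, {'act_output': Y_val, 'outcome_output': Z_val}) should work as well.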