I am getting a shape/value error. The neural network for my language-processing model is as follows:
import numpy as np
import transformers
from tensorflow.keras import layers, models

## inputs (shape must be a tuple, hence the trailing comma)
idx = layers.Input((50,), dtype="int32", name="input_idx")
masks = layers.Input((50,), dtype="int32", name="input_masks")
segments = layers.Input((50,), dtype="int32", name="input_segments")
## pre-trained bert
nlp = transformers.TFBertModel.from_pretrained("bert-base-uncased")
bert_out, _ = nlp([idx, masks, segments])
## fine-tuning
x = layers.GlobalAveragePooling1D()(bert_out)
x = layers.Dense(64, activation="relu")(x)
date_out = layers.Dense(len(np.unique(date_train)),
                        activation='softmax')(x)
## compile
model = models.Model([idx, masks, segments], date_out)
## freeze the three input layers and the BERT layer
for layer in model.layers[:4]:
    layer.trainable = False
model.compile(loss='sparse_categorical_crossentropy',
              optimizer='adam', metrics=['accuracy'])
model.summary()
The input shape of the model is (None, 50).
model.summary() gives the following output:
Model: "model_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_idx (InputLayer)          [(None, 50)]         0
__________________________________________________________________________________________________
input_masks (InputLayer)        [(None, 50)]         0
__________________________________________________________________________________________________
input_segments (InputLayer)     [(None, 50)]         0
__________________________________________________________________________________________________
tf_bert_model_3 (TFBertModel)   ((None, 50, 768), (N 109482240   input_idx[0][0]
                                                                 input_masks[0][0]
                                                                 input_segments[0][0]
__________________________________________________________________________________________________
global_average_pooling1d_2 (Glo (None, 768)          0           tf_bert_model_3[0][0]
__________________________________________________________________________________________________
dense_3 (Dense)                 (None, 64)           49216       global_average_pooling1d_2[0][0]
__________________________________________________________________________________________________
dense_4 (Dense)                 (None, 13770)        895050      dense_3[0][0]
==================================================================================================
Total params: 110,426,506
Trainable params: 944,266
Non-trainable params: 109,482,240
__________________________________________________________________________________________________
The code for fitting the model is:
## encode y
dic_y_mapping = {n: label for n, label in
                 enumerate(np.unique(date_train))}
inverse_dic = {v: k for k, v in dic_y_mapping.items()}
date_train = np.array([inverse_dic[date] for date in date_train])
## train
training = model.fit(x=X_train, y=date_train, batch_size=64,
                     epochs=1, shuffle=True, verbose=1,
                     validation_split=0.3)
## test
predicted_prob = model.predict(X_test)
predicted = [dic_y_mapping[np.argmax(pred)] for pred in predicted_prob]
When I run this, I get the following error:
ValueError: Input 0 is incompatible with layer model_1: expected shape=(None, 50), found shape=(None, 52)
Can anyone help me solve this?
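The error means that the data being fed to the model has a sequence length of 52, while the model expects 50. Looking at your code, that data is what you pass to model.fit() as x, i.e. X_train.
So either change the sequence length of X_train to 50, or change the 50 in the model definition to 52.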
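As a rough illustration of the first option, here is a minimal sketch that truncates or right-pads arrays to a fixed width using NumPy only. It assumes X_train is a list of three 2-D integer arrays (idx, masks, segments), which the question does not actually show; the helper name fit_to_length and the demo array X_train_demo are hypothetical and exist only for this example.

```python
import numpy as np

def fit_to_length(features, max_len=50, pad_value=0):
    """Truncate or right-pad each 2-D array in `features` to width `max_len`.

    Hypothetical helper for illustration; not part of Keras or transformers.
    """
    fixed = []
    for arr in features:
        arr = np.asarray(arr)[:, :max_len]      # drop columns beyond max_len
        missing = max_len - arr.shape[1]        # columns still needed, if any
        if missing > 0:
            arr = np.pad(arr, ((0, 0), (0, missing)),
                         constant_values=pad_value)
        fixed.append(arr)
    return fixed

# Stand-in for the real X_train (assumed layout: [idx, masks, segments]),
# each array having 52 columns like the one in the error message.
X_train_demo = [np.ones((4, 52), dtype="int32") for _ in range(3)]
print([a.shape for a in X_train_demo])

X_train_fixed = fit_to_length(X_train_demo, max_len=50)
print([a.shape for a in X_train_fixed])
```

Note that cutting columns off after tokenization can remove the final [SEP] token of long sequences, so it is usually cleaner to set the maximum length to 50 where the text is tokenized. Whatever is done to X_train has to be done to X_test as well, otherwise model.predict() will raise the same error.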