Why am I getting this shape error in the loss function of my TensorFlow NN



I'm working on an NLP project that extracts emotions from text. I'm using this dataset: https://www.kaggle.com/datasets/praveengovi/emotions-dataset-for-nlp?select=train.txt

I keep getting this error:

logits and labels must have the same first dimension, got logits shape [16,6] and labels shape [96]

My batch size is 16, so the labels should already be the right size: I one-hot encoded the outputs and there are 6 possible classes (6*16=96). For some reason the network is reshaping the labels, and I can't figure out where that happens.
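If it helps, the same error reproduces outside my pipeline with dummy tensors (a minimal sketch, not my actual data):

import tensorflow as tf

logits = tf.random.uniform((16, 6))                    # stand-in model output: [batch, classes]
one_hot = tf.one_hot(tf.zeros(16, dtype=tf.int32), 6)  # one-hot labels: [16, 6]

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
loss_fn(one_hot, logits)  # raises: logits shape [16,6] vs labels shape [96]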

Here is my code:

import numpy as np
import pandas as pd
import tensorflow as tf
import os
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from sklearn import preprocessing
from keras.utils.np_utils import to_categorical
from tensorflow.keras import layers
from tensorflow.keras import losses

training_size = 14000
val_size = 1000
BATCH_SIZE = 16

with open('/content/drive/MyDrive/KaggleDatasets/train.txt') as f:
    contents = f.readlines()

# each line is "<sentence>;<emotion>"
split_txt = []
for i in range(len(contents)):
    split_txt.append(contents[i].split(';'))

sentences = []
emotions = []
for i in range(len(contents)):
    sentences.append(split_txt[i][0])
    emotions.append(split_txt[i][1])

labels = np.array(emotions)
labels = labels.astype('str')
unique_labels = np.unique(labels)
print(unique_labels)

# the emotion strings keep the trailing newline from readlines()
label_dict = {
    'anger\n': 0,
    'fear\n': 1,
    'joy\n': 2,
    'love\n': 3,
    'sadness\n': 4,
    'surprise\n': 5
}

# map labels from string to int
int_labels = []
for i in range(len(labels)):
    int_labels.append(label_dict[labels[i]])
catagorical_labels = np.array(to_categorical(int_labels, num_classes=len(unique_labels)))

sentences = np.array(sentences)
x_train = sentences[0:training_size]
x_val = sentences[training_size:training_size+val_size]
x_test = sentences[val_size:]
y_train = catagorical_labels[0:training_size]
y_val = catagorical_labels[training_size:training_size+val_size]
y_test = catagorical_labels[val_size:]

tokenizer = Tokenizer(num_words=500, oov_token="<00V>")
tokenizer.fit_on_texts(x_train)
word_index = tokenizer.word_index

training_sequences = tokenizer.texts_to_sequences(x_train)
training_padded = pad_sequences(training_sequences, padding='post')
val_sequences = tokenizer.texts_to_sequences(x_val)
val_padded = pad_sequences(val_sequences, padding='post')
test_sequences = tokenizer.texts_to_sequences(x_test)
test_padded = pad_sequences(test_sequences, padding='post')

train_ds = tf.data.Dataset.from_tensor_slices((training_padded, y_train))
val_ds = tf.data.Dataset.from_tensor_slices((val_padded, y_val))
test_ds = tf.data.Dataset.from_tensor_slices((test_padded, y_test))

AUTOTUNE = tf.data.AUTOTUNE
train_ds = train_ds.cache().prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
test_ds = test_ds.cache().prefetch(buffer_size=AUTOTUNE)

train_ds = train_ds.batch(batch_size=BATCH_SIZE)
val_ds = val_ds.batch(batch_size=BATCH_SIZE)
test_ds = test_ds.batch(batch_size=BATCH_SIZE)

vocab_size = len(word_index)
embed_dim = 32
max_length = training_padded.shape[1]

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embed_dim, input_length=max_length),
    tf.keras.layers.GlobalMaxPooling1D(),
    tf.keras.layers.Dense(20, activation='relu'),
    tf.keras.layers.Dense(6, activation='softmax')
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

callbacks = [
    tf.keras.callbacks.ReduceLROnPlateau(monitor='loss', patience=2, verbose=1),
    tf.keras.callbacks.EarlyStopping(monitor='loss', patience=5, verbose=1),
]

epochs = 3
history = model.fit(
    train_ds,
    epochs=epochs,
    validation_data=val_ds,
    callbacks=callbacks
)

The error occurs when the loss function is computed.

In this case, y_true and y_pred should have the same shape.

I think you need to reshape your labels into a [16, 6] tensor.

See the following documentation for details:

https://www.tensorflow.org/api_docs/python/tf/keras/losses/SparseCategoricalCrossentropy

You need to use the CategoricalCrossentropy loss instead of SparseCategoricalCrossentropy, because your labels are one-hot encoded (SparseCategoricalCrossentropy expects integer class indices instead).
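For example (a sketch of the corrected compile call for the model above; note the final Dense layer already applies a softmax, so from_logits should be False here):

model.compile(optimizer='adam',
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])

Alternatively, you could keep SparseCategoricalCrossentropy and feed it the integer int_labels instead of the one-hot catagorical_labels.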

Also, your padded validation sequences have a different length than the padded training sequences.

You can make them equal with the maxlen argument:

val_padded = pad_sequences(val_sequences, padding='post', maxlen=training_padded.shape[-1])
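
The test split presumably needs the same treatment, so that every input matches the model's input_length:

test_padded = pad_sequences(test_sequences, padding='post', maxlen=training_padded.shape[-1])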
