Class weights for multi-output classification



I have a question. I built a model like this:

from keras.models import Sequential, Model
from keras.layers import Dense,  Dropout, Flatten
from keras.layers import LSTM, Conv1D, Input, MaxPooling1D, GlobalMaxPooling1D
from keras.layers import Embedding
posts_input = Input(shape=(None,), dtype='int32', name='posts')
embedded_posts = Embedding(max_nb_words, embedding_vector_length, input_length=max_post_len)(posts_input)
x = Conv1D(128, 5, activation='relu')(embedded_posts)
x = Dropout(0.25)(x)
x = MaxPooling1D(5)(x)
x = Conv1D(256, 5, activation='relu')(x)
x = Conv1D(256, 5, activation='relu')(x)
x = Dropout(0.25)(x)
x = MaxPooling1D(5)(x)
x = Conv1D(256, 5, activation='relu')(x)
x = Conv1D(256, 5, activation='relu')(x)
x = Dropout(0.25)(x)
x = GlobalMaxPooling1D()(x)
x = Dense(128, activation='relu')(x)
Axe1_prediction = Dense(1, activation='sigmoid', name='axe1')(x)
Axe2_prediction = Dense(1, activation='sigmoid', name='axe2')(x)
Axe3_prediction = Dense(1, activation='sigmoid', name='axe3')(x)
Axe4_prediction = Dense(1, activation='sigmoid', name='axe4')(x)
model = Model(posts_input, [Axe1_prediction, Axe2_prediction, Axe3_prediction, Axe4_prediction])

As you can see, this model has 4 outputs.

Then I compile the model like this:

model.compile(optimizer='rmsprop',
              loss=['binary_crossentropy',
                    'binary_crossentropy',
                    'binary_crossentropy',
                    'binary_crossentropy'],
              metrics=['accuracy'])

To fit this model, I think I need to set class weights, so I computed these:

import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.utils import class_weight
le = LabelEncoder()
y1 = le.fit_transform(df2["Axe1"])
y2 = le.fit_transform(df2["Axe2"])
y3 = le.fit_transform(df2["Axe3"])
y4 = le.fit_transform(df2["Axe4"])
# Recent scikit-learn versions require classes and y as keyword arguments.
cw1 = class_weight.compute_class_weight('balanced', classes=np.unique(y1), y=y1)
cw2 = class_weight.compute_class_weight('balanced', classes=np.unique(y2), y=y2)
cw3 = class_weight.compute_class_weight('balanced', classes=np.unique(y3), y=y3)
cw4 = class_weight.compute_class_weight('balanced', classes=np.unique(y4), y=y4)
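
For reference, the 'balanced' heuristic in scikit-learn gives each class the weight n_samples / (n_classes * count of that class). Below is a minimal pure-Python sketch of that formula. The helper name balanced_class_weights and the labels are made up for illustration and are not part of the question's code.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Hypothetical helper: n_samples / (n_classes * class_count) per class."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt)
            for cls, cnt in sorted(counts.items())}

# Made-up labels: three negatives and one positive.
weights = balanced_class_weights([0, 0, 0, 1])
print(weights)  # the rare class 1 gets the larger weight
```

The result is a dictionary from class index to weight, which is the shape a class-weight dictionary usually takes.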

But in the end I don't know how to pass them to fit:

history = model.fit(X_train,
                    [y1_train, y2_train, y3_train, y4_train],
                    epochs=10,
                    validation_data=(X_val, [y1_val, y2_val, y3_val, y4_val]))

Can you tell me how to add the "class_weight=" parameter?

You have to use TensorFlow 2.1 or earlier. Class weight support for multi-output models was removed after TF 2.1.
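
On those older versions, as far as I recall, fit accepted class_weight as a dictionary keyed by output layer name, with a class-to-weight dictionary as each value. The sketch below only builds that nested dictionary. The weight values are made up, and the fit call is left as a comment because it is an assumption that needs the model and data from the question.

```python
# Made-up weight arrays standing in for cw1..cw4 from compute_class_weight,
# each ordered as [weight for class 0, weight for class 1].
cw_arrays = {
    "axe1": [0.65, 2.17],
    "axe2": [0.58, 3.62],
    "axe3": [0.92, 1.09],
    "axe4": [0.83, 1.26],
}

# One {class_index: weight} dictionary per named output layer.
class_weight = {
    name: {cls: float(w) for cls, w in enumerate(ws)}
    for name, ws in cw_arrays.items()
}

# Assumed usage on TF <= 2.1 (not verified here):
# history = model.fit(X_train,
#                     [y1_train, y2_train, y3_train, y4_train],
#                     epochs=10,
#                     class_weight=class_weight,
#                     validation_data=(X_val, [y1_val, y2_val, y3_val, y4_val]))
```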

If you still want to use TensorFlow > 2.1, you need to define a custom loss, something like the following:

from functools import partial
import tensorflow as tf
import keras.backend as K

def weighted_binary_crossentropy(target, output, weights_table):
    # Look up the weight of each sample's true class. The table keys are int64,
    # so the (usually float) targets are cast for the lookup only.
    weights_vect = weights_table.lookup(tf.cast(target, tf.int64))
    return K.binary_crossentropy(target, output) * weights_vect

# Transform a dictionary of weights into a lookup table usable inside the loss.
def to_lookup_table(dictionary):
    return tf.lookup.StaticHashTable(
        tf.lookup.KeyValueTensorInitializer(
            list(dictionary.keys()),    # [0, 1]
            list(dictionary.values()),  # corresponding weights
            key_dtype=tf.int64,
            value_dtype=tf.float32,
        ),
        default_value=-1)

# Dictionaries mapping each class index to its weight.
cw1 = ...
cw2 = ...
cw3 = ...
cw4 = ...
# Define loss functions with weights_table already bound.
binary_crossentropy_1 = partial(weighted_binary_crossentropy, weights_table=to_lookup_table(cw1))
...
binary_crossentropy_4 = partial(weighted_binary_crossentropy, weights_table=to_lookup_table(cw4))
model.compile(optimizer='rmsprop',
              loss=[binary_crossentropy_1, ..., binary_crossentropy_4],
              metrics=['accuracy'])

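Depending on how you define the weight dictionaries and the model outputs, you may have to change the type of target or its shape, or, if your version of the loss applies an argmax to target, remove it. You may also need to change the key and value types in the to_lookup_table function. If you need categorical cross-entropy instead, just replace K.binary_crossentropy with K.categorical_crossentropy.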
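
To make the arithmetic of the custom loss concrete, here is a small pure-Python sketch of what it computes per sample: the ordinary binary cross-entropy multiplied by the weight of the true class. The function name weighted_bce, the clipping constant, and the numbers are assumptions for illustration. This is not the TensorFlow implementation.

```python
import math

def weighted_bce(targets, probs, class_weights, eps=1e-7):
    """Per-sample binary cross-entropy scaled by the weight of the true class."""
    losses = []
    for y, p in zip(targets, probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        bce = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
        losses.append(class_weights[y] * bce)  # lookup by true class
    return losses

# Made-up weights: errors on class 1 cost four times as much as on class 0.
class_weights = {0: 0.5, 1: 2.0}
losses = weighted_bce([1, 0], [0.5, 0.5], class_weights)
print(losses)
```

Both samples have the same unweighted loss here, so the ratio of the two results equals the ratio of the class weights.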
