Weighted pixel-wise categorical cross-entropy for semantic segmentation



I recently started learning semantic segmentation, and I'm trying to train a U-Net for it. My inputs are 128x128x3 RGB images. My masks consist of 4 classes (0, 1, 2, 3) and are one-hot encoded with dimensions 128x128x4.

import tensorflow as tf
from keras import backend as K

def weighted_cce(y_true, y_pred):
    weights = []
    t_inf = tf.convert_to_tensor(1e9, dtype='float32')   # (unused)
    t_zero = tf.convert_to_tensor(0, dtype='int64')      # (unused)
    for i in range(0, 4):
        # count how many pixels in the batch belong to class i
        l = tf.argmax(y_true, axis=-1) == i
        n = tf.cast(tf.math.count_nonzero(l), 'float32') + K.epsilon()
        weights.append(n)
    weights = [batch_size / j for j in weights]  # batch_size is defined outside this function
    y_pred /= K.sum(y_pred, axis=-1, keepdims=True)
    # clip to prevent NaN's and Inf's
    y_pred = K.clip(y_pred, K.epsilon(), 1 - K.epsilon())
    # calc
    loss = y_true * K.log(y_pred) * weights
    loss = -K.sum(loss, -1)
    return loss

This is the loss function I'm using, but it classifies every pixel as class 2. What am I doing wrong?

You should compute the weights from your entire dataset (unless your batch size is reasonably large, so the per-batch weights are stable).

If a class is underrepresented in a small batch, its weight will be close to infinity (with your code, a class that has zero pixels in a batch gets a weight of about batch_size / K.epsilon()).
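
To see how unstable per-batch weights can be, here is a small illustrative sketch (the batch contents are invented; the arithmetic just mirrors your per-batch weighting, with 1e-7 standing in for K.epsilon()):

import numpy as np

batch = np.zeros((8, 128, 128, 4))    # 8 one-hot masks, 4 classes
batch[..., 0] = 1                     # almost every pixel is class 0
batch[0, :2, :2, 0] = 0
batch[0, :2, :2, 3] = 1               # class 3 appears in only 4 pixels

counts = batch.sum(axis=(0, 1, 2))    # pixels per class: [131068, 0, 0, 4]
totalPixels = np.prod(batch.shape[:3])
print(totalPixels / (counts + 1e-7))  # classes 1 and 2 get weights around 1e12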

If your target data is a numpy array:

import numpy as np

shp = y_train.shape
totalPixels = shp[0] * shp[1] * shp[2]     # images * height * width
weights = np.sum(y_train, axis=(0, 1, 2))  # pixel count per class, final shape (4,)
weights = totalPixels / weights

If your data comes from a Sequence generator:

totalPixels = 0
counts = np.zeros((4,))
for i in range(len(generator)):
    x, y = generator[i]
    shp = y.shape
    totalPixels += shp[0] * shp[1] * shp[2]
    counts = counts + np.sum(y, axis=(0, 1, 2))
weights = totalPixels / counts

If your data comes from a yield generator (you must know how many batches there are in an epoch):

for i in range(batches_per_epoch):
    x, y = next(generator)
    # the rest is equal to the Sequence example above
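
Spelled out, the yield version is the same counting loop with a different batch retrieval (assuming batches_per_epoch is known):

totalPixels = 0
counts = np.zeros((4,))
for i in range(batches_per_epoch):
    x, y = next(generator)
    shp = y.shape
    totalPixels += shp[0] * shp[1] * shp[2]
    counts = counts + np.sum(y, axis=(0, 1, 2))
weights = totalPixels / counts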

Attempt 1

I don't know whether newer versions of Keras can handle this, but you can try the simplest approach first: just call fit or fit_generator with the class_weight parameter:

model.fit(...., class_weight = {0: weights[0], 1: weights[1], 2: weights[2], 3: weights[3]})
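
Equivalently, since weights already has shape (4,), you can build that dictionary programmatically (plain Python, nothing Keras-specific):

class_weight = dict(enumerate(weights))  # {0: weights[0], 1: weights[1], 2: weights[2], 3: weights[3]}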

Attempt 2

Make a healthier loss function:

weights = weights.reshape((1, 1, 1, 4))
kWeights = K.constant(weights)

def weighted_cce(y_true, y_pred):
    yWeights = kWeights * y_pred          # shape (batch, 128, 128, 4)
    yWeights = K.sum(yWeights, axis=-1)   # shape (batch, 128, 128)

    loss = K.categorical_crossentropy(y_true, y_pred)  # shape (batch, 128, 128)
    wLoss = yWeights * loss

    return K.sum(wLoss, axis=(1, 2))
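
A minimal sketch of how this plugs into training (the optimizer, batch size, and epoch count here are placeholders, not recommendations):

model.compile(optimizer='adam', loss=weighted_cce)
model.fit(x_train, y_train, batch_size=16, epochs=50)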
