Custom Keras layer's output_shape is None (or cannot be determined automatically)



I am building a custom Keras layer which is essentially the softmax function with a trainable base parameter. While the layer works on its own, when it is placed in a sequential model, model.summary() reports its output shape as None, and model.fit() raises an exception that appears to be related:

ValueError: as_list() is not defined on an unknown TensorShape.

In other custom layers (including the Linear example from the Keras guide), the output shape can be determined after .build() has been called. Judging by the source code of model.summary() and of keras.layers.Layer, it is the @property Layer.output_shape that fails to determine the output shape automatically.
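For reference, here is a condensed paraphrase of that Linear example (adapted from memory from the Keras layer-subclassing guide, so treat it as a sketch); its output shape is inferred automatically:

import tensorflow as tf
from tensorflow import keras

class Linear(keras.layers.Layer):
    def __init__(self, units=32):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        # Weights are created lazily from the input shape,
        # and Keras can still infer the output shape on its own.
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer="random_normal", trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer="zeros", trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b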

I then tried overriding the property, saving the input_shape argument passed to my layer's .build() method and returning it manually (softmax does not change the shape of its input), but that does not work either: if I call super().output_shape before returning my value, model.summary() reports the shape as ?, whereas if I don't, the value shown may look correct, but in both cases I get exactly the same error during .fit().

Is there something special about call() on the code side that prevents Keras from understanding the shape of the output?
Alternatively, is there documentation that I am missing?

My layer:

import tensorflow as tf
from tensorflow import keras

class B_Softmax(keras.layers.Layer):
    def __init__(self, b_init_mean=10, b_init_var=0.001):
        super(B_Softmax, self).__init__()
        self.b_init = tf.random_normal_initializer(b_init_mean, b_init_var)
        self._out_shape = None

    def build(self, input_shape):
        self.b = tf.Variable(
            initial_value = self.b_init(shape=(1,), dtype='float32'),
            trainable=True
        )
        self._out_shape = input_shape

    def call(self, inputs):
        # This is an implementation of Softmax for batched inputs
        # where the factor b is added to the exponents
        nominators  = tf.math.exp(self.b * inputs)
        denominator = tf.reduce_sum(nominators, axis=1)
        denominator = tf.squeeze(denominator)
        denominator = tf.expand_dims(denominator, -1)
        s           = tf.divide(nominators, denominator)
        return s

    @property
    def output_shape(self):    # If I comment out this function, summary prints 'None'
        super().output_shape   # If I leave this line, summary prints '?'
        return self._out_shape # If the above line is commented out, summary prints '10' (correctly)
                               # but the same error is triggered in all three cases

The layer works on its own:

>>> A     = tf.constant([[1,2,3], [7,5,6]], dtype="float32")
>>> layer = B_Softmax(1.0)
>>> layer(A)
<tf.Tensor: shape=(2, 3), dtype=float32, numpy=
array([[0.08991686, 0.24461554, 0.6654676 ],
       [0.6654677 , 0.08991687, 0.24461551]], dtype=float32)>

But when I try to include it in a model, the summary does not look right:

from tensorflow.keras.layers import Dense

input_dim = 5
num_classes = 10  # inferred from the summary below: second Dense output is (None, 10)
model = keras.Sequential([
        Dense(32, activation='relu', input_shape=(input_dim,)),
        Dense(num_classes, activation="softmax"),
        B_Softmax(1.0)
])
model.summary()
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense_10 (Dense)            (None, 32)                192       
                                                                 
 dense_11 (Dense)            (None, 10)                330       
                                                                 
 b__softmax_18 (B_Softmax)   None  <-------------------1-------- "None", "?", or "10" (in a hacky way) may be printed
                                                                 
=================================================================
Total params: 523
Trainable params: 523
Non-trainable params: 0

Training fails:

batch_size = 128
epochs = 15
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_split=0.1)
ValueError: in user code:
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1051, in train_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1040, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1030, in run_step  **
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 894, in train_step
        return self.compute_metrics(x, y, y_pred, sample_weight)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 987, in compute_metrics
        self.compiled_metrics.update_state(y, y_pred, sample_weight)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/compile_utils.py", line 480, in update_state
        self.build(y_pred, y_true)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/compile_utils.py", line 398, in build
        y_pred)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/compile_utils.py", line 526, in _get_metric_objects
        return [self._get_metric_object(m, y_t, y_p) for m in metrics]
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/compile_utils.py", line 526, in <listcomp>
        return [self._get_metric_object(m, y_t, y_p) for m in metrics]
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/compile_utils.py", line 548, in _get_metric_object
        y_p_rank = len(y_p.shape.as_list())
    ValueError: as_list() is not defined on an unknown TensorShape.

This does not solve the problem directly, but rather sidesteps it: instead of using squeeze and expand_dims, the former of which seems to be what trips up TensorFlow's shape tracking, we use keepdims=True in the summation so that the axes of the softmax denominator stay correctly aligned.

def call(self, inputs):
    # This is an implementation of Softmax for batched inputs
    # where the factor b is added to the exponents
    nominators  = tf.math.exp(self.b * inputs)
    denominator = tf.reduce_sum(nominators, axis=1, keepdims=True)
    s           = tf.divide(nominators, denominator)
    return s
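
To see why the original squeeze loses the static shape, here is a minimal sketch (assuming TF 2.x, where TF ops can be applied to a symbolic Keras input): without an axis argument, tf.squeeze cannot know at trace time whether the unknown batch dimension equals 1, so the result has unknown rank.

import tensorflow as tf

x = tf.keras.Input(shape=(10,))            # symbolic shape: (None, 10)
d = tf.reduce_sum(tf.math.exp(x), axis=1)  # shape: (None,)
print(tf.squeeze(d).shape)                 # <unknown>: is the None dim 1 or not?
print(tf.reduce_sum(tf.math.exp(x), axis=1, keepdims=True).shape)  # (None, 1)

It is this unknown TensorShape that later crashes compile_utils when it calls as_list().
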
Arguably, it would be even better to use the built-in softmax:
def call(self, inputs):
    return tf.nn.softmax(self.b * inputs)

You can implement the compute_output_shape method in your layer subclass:

def compute_output_shape(self, input_shape):
    return [(None, out_shape)]

where out_shape contains the dimensionality of the output; you can also replace the whole tuple to get whatever output shape you want.
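Applied to the layer in the question, a minimal sketch (assuming, as the question notes, that softmax preserves the shape of its input) would simply return the input shape:

class B_Softmax(keras.layers.Layer):
    ...  # __init__, build and call exactly as in the question

    def compute_output_shape(self, input_shape):
        # Softmax does not change the shape of its input,
        # so the output shape is just the input shape.
        return input_shape

With this in place, Keras no longer has to guess the static shape, and model.summary() should report it without any output_shape override.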

I found no problem with the code itself; I think the input parameters are what matter. A few notes on what I did:

  1. I used a dataset with model.fit(), and without a validation split, because I created a single-record sample.
  2. For the model input, I changed input(5,) to input(1, 5) to match the shape the dataset was created with (which is why I added the marker 1 in the summary above).
  3. Categories, batch size, number of classes, loss function and optimizer are adjusted to the network's output dimensions through the num_classes parameter.

Sample: << the loss function does not determine the model, the input and output do; there is no need to show how everything was created, so unused parts were removed or commented out >>

import tensorflow as tf
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
: Class / Definition
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
class B_Softmax(tf.keras.layers.Layer):
    def __init__(self, b_init_mean=10, b_init_var=0.001):
        super(B_Softmax, self).__init__()
        self.b_init = tf.random_normal_initializer(b_init_mean, b_init_var)
        self._out_shape = None
        
    def build(self, input_shape):
        self.b = tf.Variable(
            initial_value = self.b_init(shape=(1,), dtype='float32'),
            trainable=True
        )
        self._out_shape = input_shape
    def call(self, inputs):
        # This is an implementation of Softmax for batched inputs
        # where the factor b is added to the exponents
        nominators  = tf.math.exp(self.b * inputs)
        denominator = tf.reduce_sum(nominators, axis=1)
        denominator = tf.squeeze(denominator)
        denominator = tf.expand_dims(denominator, -1)
        s           = tf.divide(nominators, denominator)
        return s
    # @property
    # def output_shape(self):    # If I comment out this function, summary prints 'None'
    #     super().output_shape   # If I leave this line, summary prints '?'
    #     return self._out_shape # If the above line is commented out, summary prints '10' (correctly)
    #                            # but the same error is triggered in all three cases
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
: Variables
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""                              
A = tf.constant([[1,2,3], [7,5,6]], dtype="float32")
batch_size = 128
epochs = 15
input_dim = 5
num_classes = 1
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
: Dataset
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
start = 3
limit = 16
delta = 3
sample = tf.range( start, limit, delta )
sample = tf.cast( sample, dtype=tf.float32 )
sample = tf.constant( sample, shape=( 1, 1, 1, 5 ) )
dataset = tf.data.Dataset.from_tensor_slices(( sample, tf.constant( [0], shape=( 1, 1, 1, 1 ), dtype=tf.int64)))
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
: Model Initialize
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""   
layer = B_Softmax(1.0)
print( layer(A) )
model = tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation='relu', input_shape=(1, input_dim)),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
        B_Softmax(1.0)
])
model.summary()
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
: Working
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""   
model.fit(dataset, batch_size=batch_size, epochs=epochs, validation_data=dataset)
Output:

tf.Tensor(
[[0.09007736 0.24477491 0.6651477 ]
 [0.66514784 0.09007736 0.24477486]], shape=(2, 3), dtype=float32)
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 dense (Dense)               (None, 1, 32)             192
 dense_1 (Dense)             (None, 1, 1)              33
 b__softmax_1 (B_Softmax)    None                      1
=================================================================
Total params: 226
Trainable params: 226
Non-trainable params: 0
_________________________________________________________________
Epoch 1/15
1/1 [==============================] - 5s 5s/step - loss: 0.0000e+00 - accuracy: 0.0000e+00 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 2/15
1/1 [==============================] - 0s 14ms/step - loss: 0.0000e+00 - accuracy: 0.0000e+00 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 3/15
1/1 [==============================] - 0s 15ms/step - loss: 0.0000e+00 - accuracy: 0.0000e+00 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 4/15
1/1 [==============================] - 0s 13ms/step - loss: 0.0000e+00 - accuracy: 0.0000e+00 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 5/15
1/1 [==============================] - 0s 14ms/step - loss: 0.0000e+00 - accuracy: 0.0000e+00 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 6/15
1/1 [==============================] - 0s 12ms/step - loss: 0.0000e+00 - accuracy: 0.0000e+00 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 7/15
1/1 [==============================] - 0s 13ms/step - loss: 0.0000e+00 - accuracy: 0.0000e+00 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 8/15
1/1 [==============================] - 0s 12ms/step - loss: 0.0000e+00 - accuracy: 0.0000e+00 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 9/15
1/1 [==============================] - 0s 12ms/step - loss: 0.0000e+00 - accuracy: 0.0000e+00 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 10/15
1/1 [==============================] - 0s 12ms/step - loss: 0.0000e+00 - accuracy: 0.0000e+00 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 11/15
1/1 [==============================] - 0s 12ms/step - loss: 0.0000e+00 - accuracy: 0.0000e+00 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 12/15
1/1 [==============================] - 0s 15ms/step - loss: 0.0000e+00 - accuracy: 0.0000e+00 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 13/15
1/1 [==============================] - 0s 14ms/step - loss: 0.0000e+00 - accuracy: 0.0000e+00 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 14/15
1/1 [==============================] - 0s 15ms/step - loss: 0.0000e+00 - accuracy: 0.0000e+00 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
Epoch 15/15
1/1 [==============================] - 0s 14ms/step - loss: 0.0000e+00 - accuracy: 0.0000e+00 - val_loss: 0.0000e+00 - val_accuracy: 0.0000e+00
C:\Python310>
