Dynamic RNNs in Keras: using a custom RNN cell to track additional outputs at each timestep



Is there a way to return multiple outputs for a given timestep when implementing a custom cell for an RNN in Keras? For example, outputs with shapes: (sequences=[batch, timesteps, hidden_units], other_outputs=[batch, timesteps, arbitrary_units], last_hidden_states=[batch, hidden_units])

My motivation stems from Algorithm 1, the "recurrent decoder", of "Self Attention in Variational Sequential Learning for Summarization" (Chien, ISCA 2019), which "cumulates the variational objective" and therefore has to track multiple outputs at a given recurrent timestep.

For a Keras RNN, if you pass the return_sequences=True and return_state=True arguments when instantiating the layer, the outputs of a forward pass through the RNN are ([batch, timesteps, hidden_units], [batch, hidden_units]), i.e., the hidden states at all timesteps and the last hidden state, respectively. I want to track additional outputs at each timestep with the RNN, but I am not sure how. I am thinking I could change the output_size attribute in the custom cell class, but I am not certain this is valid, since the TensorFlow RNN documentation seems to indicate that only a single output per timestep is possible (i.e., "a single integer or TensorShape"):

An output_size attribute. This can be a single integer or a TensorShape, which represents the shape of the output. For backwards-compatibility reasons, if this attribute is not available for the cell, the value will be inferred from the first element of the state_size.
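
For context, here is a minimal sketch of the standard behavior described above, using the built-in GRU layer (the dimension values are arbitrary placeholders):

import tensorflow as tf

# With return_sequences=True and return_state=True, a standard RNN layer
# returns the hidden states at all timesteps plus the last hidden state.
batch, timesteps, features, hidden_units = 4, 6, 8, 10
x = tf.random.normal(shape=(batch, timesteps, features))
gru = tf.keras.layers.GRU(units=hidden_units, return_sequences=True, return_state=True)
all_states, last_state = gru(x)
print(all_states.shape)  # (4, 6, 10) -> [batch, timesteps, hidden_units]
print(last_state.shape)  # (4, 10)    -> [batch, hidden_units]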

Here is what I have so far for my custom RNN cell implementation:

class CustomGRUCell(tf.keras.layers.Layer):

    def __init__(self, units, arbitrary_units, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        # Custom computation for a timestep t
        self.dense = tf.keras.layers.Dense(units=arbitrary_units)
        # The RNN cell
        self.gru = tf.keras.layers.GRUCell(units=self.units)
        # Required for custom cells...
        self.state_size = tf.TensorShape([self.units])
        # PERHAPS I CHANGE THIS????
        self.output_size = tf.TensorShape([self.units])

    def call(self, input_at_t, states_at_t):
        """Forward pass for a single timestep.

        :param input_at_t: (batch, features) tensor, the slice at
            timestep t of the (batch, timesteps, features) inputs
        :param states_at_t: <class 'tuple'> Why? Perhaps generically,
            this is because an LSTM, for example, carries two hidden
            states instead of just one like the GRU
        """
        # Standard GRU cell call
        output_at_t, states_at_t_plus_1 = self.gru(input_at_t, states_at_t)
        # Another output at a particular timestep t
        special_output_at_t = self.dense(input_at_t)
        # The outputs
        # 'output_at_t' will be automatically tracked by 'return_sequences'...
        # how do I track other computations at each timestep????
        return [output_at_t, special_output_at_t], states_at_t_plus_1

Then I want the cell to work like this:

# Custom cell and rnn
custom_cell = CustomGRUCell(units=10, arbitrary_units=5)
custom_rnn = tf.keras.layers.RNN(cell=custom_cell, return_sequences=True, return_state=True)

# Arbitrary data
batch = 4
timesteps = 6
features = 8
dummy_data = tf.random.normal(shape=(batch, timesteps, features))

# The outputs I want
seqs, special_seqs, last_hidden_state = custom_rnn(inputs=dummy_data)
print('batch, timesteps, units:', seqs.shape)
print('batch, timesteps, arbitrary_units:', special_seqs.shape)
print('batch, units:', last_hidden_state.shape)
>>> batch, timesteps, units: (4, 6, 10)
>>> batch, timesteps, arbitrary_units: (4, 6, 5)
>>> batch, units: (4, 10)

Figured it out. You can set the output size to a list of TensorShapes with arbitrary dimensions, and the RNN will then track each of those outputs. The class below also includes the use of constants in the RNN call, since the previously mentioned paper passes the encoder latent space (z_enc) to the recurrent decoder:

class CustomMultiTimeStepGRUCell(tf.keras.layers.Layer):
    """Illustrates multiple sequence-like (batch, timesteps, size) outputs."""

    def __init__(self, units, arbitrary_units, **kwargs):
        """Defines state for the custom cell.

        :param units: <class 'int'> Hidden units for the RNN cell.
        :param arbitrary_units: <class 'int'> Hidden units for another
            dense network that outputs a tensor at each timestep in the
            unrolling of the RNN.
        """
        super().__init__(**kwargs)
        # Save args
        self.units = units
        self.arbitrary_units = arbitrary_units
        # Standard recurrent cell
        self.gru = tf.keras.layers.GRUCell(units=self.units)
        # For use with the 'constants' kwarg in the 'call' method
        self.concatenate = tf.keras.layers.Concatenate()
        self.dense_proj = tf.keras.layers.Dense(units=self.units)
        # For arbitrary computation at timestep t
        self.other_output = tf.keras.layers.Dense(units=self.arbitrary_units)
        # Hidden state size (i.e., h_t)...
        # it's useful to know in general that this refers to the following:
        # 'gru_cell = tf.keras.layers.GRUCell(units=state_size)'
        # 'out_t, h_t = gru_cell(x_t, h_t_minus_1)'
        # 'h_t.shape' -> '(batch, state_size)'
        self.state_size = tf.TensorShape([self.units])
        # OUTPUT SIZE: PROBLEM SOLVED!!!!
        # This is the last dimension of the RNN sequence output.
        # Typically the last dimension matches the dimension of
        # self.state_size, and in fact the keras RNN will infer the
        # output size from the state size if the output size is not
        # specified. If an output size does not match the state size,
        # you have to specify it explicitly, and in list format if
        # multiple outputs occur per timestep in the RNN.
        self.output_size = [tf.TensorShape([self.units]), tf.TensorShape([self.arbitrary_units])]

    def call(self, input_at_t, states_at_t, constants):
        """Forward pass for the custom RNN cell.

        :param input_at_t: (batch, features) tensor, the slice at
            timestep t of the (batch, timesteps, features) inputs
        :param states_at_t: <class 'tuple'> that has 1 element if
            using GRUCell (h_t), or 2 elements if using LSTMCell (h_t, c_t)
        :param constants: <class 'tuple'> Unchanging tensors to be used
            in the unrolling of the RNN.
        :return: <class 'tuple'> with two elements.
            (1) <class 'list'> Both elements of this list are tensors
                that are tracked for each timestep in the unrolling of
                the RNN.
            (2) Tensor representing the hidden state passed to the next
                cell.

        In the brief graphic below, a_t denotes the arbitrary output
        at each timestep. y_t = h_t_plus_1. x_t is some input at
        timestep t.

                 a_t  y_t
                  ^    ^
                __|____|__
         h_t   |          |  h_t_plus_1
         ----> |          | ----------> .....
               |__________|
                    ^
                    |
                   x_t

        When all timesteps in x, where x = {x_t}_{t=1}^{T}, are
        processed by the RNN, the resulting shapes of the outputs,
        assuming there is only a single sample (batch = 1), would be:

            Y = (1, timesteps, units)
            A = (1, timesteps, arbitrary_units)
            h_t_plus_1 = (1, units)  # Last hidden state

        For a concrete example, see the end of this code block.
        """
        # Get the correct inputs -- by default these args are tuples...
        # so you must index 0 to get the relevant element.
        # Note, if you are using an LSTM, then the hidden states passed to
        # the next cell in the RNN will be a tuple with two elements,
        # i.e., (h_t, c_t) for the hidden and cell state, respectively.
        states_at_t = states_at_t[0]
        z_enc = constants[0]
        # Combine the states with z_enc
        combined = self.concatenate([states_at_t, z_enc])
        # Project to the dimensions expected by the GRU cell
        special_states_at_t = self.dense_proj(combined)
        # Standard GRU call
        output_at_t, states_at_t_plus_1 = self.gru(input_at_t, special_states_at_t)
        # Get another output at t
        arbitrary_output_at_t = self.other_output(input_at_t)
        # The outputs
        return [output_at_t, arbitrary_output_at_t], states_at_t_plus_1
# Dims
batch = 4
timesteps = 3
features = 12
latent = 8
hidden_units = 10
arbitrary_units = 15

# Data
inputs = tf.random.normal(shape=(batch, timesteps, features))
h_t = tf.zeros(shape=(batch, hidden_units))
z_enc = tf.random.normal(shape=(batch, latent))

# An RNN built from the cell to test multi-timestep outputs
custom_multistep_cell = CustomMultiTimeStepGRUCell(units=hidden_units, arbitrary_units=arbitrary_units)
custom_multistep_rnn = tf.keras.layers.RNN(custom_multistep_cell, return_sequences=True, return_state=True)

# Call the RNN
outputs, special_outputs, last_hidden = custom_multistep_rnn(inputs, initial_state=h_t, constants=z_enc)
print(outputs.shape)
print(special_outputs.shape)
print(last_hidden.shape)
>>> (4, 3, 10)
>>> (4, 3, 15)
>>> (4, 10)
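
As a quick sanity check, you can also call the cell directly for a single timestep (a minimal sketch, assuming the cell is invoked the same way tf.keras.layers.RNN invokes it, with states and constants passed as tuples); the per-timestep return structure then matches output_size:

# Single-timestep call of the cell itself
x_t = tf.random.normal(shape=(batch, features))
(out_t, a_t), h_next = custom_multistep_cell(x_t, (h_t,), constants=(z_enc,))
print(out_t.shape)  # (4, 10) -> matches output_size[0]
print(a_t.shape)    # (4, 15) -> matches output_size[1]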
