Simple TensorFlow LSTM network: ValueError: No gradients provided for any variable



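I am trying to create a simple LSTM example that predicts the next point in a sequence of floating-point numbers. To keep things simple, I chose linear sample data. I am using TensorFlow 2.3.1 and NumPy 1.18.5.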

Here is the setup for the input data, a basic rolling horizon of length 5:

import numpy as np

# Sample data -- linear function
linear_data = 0.05 + 0.05 * np.array(range(20))
# Size of window in rolling horizon
window = 5
# Tensorflow expects 3D: i,j,k = samples, window/sequence data,
# features (one since we're univariate)
rolling_x = np.zeros([len(linear_data) - window, window, 1])
rolling_y = np.zeros([len(linear_data) - window, 1])
# Populate the rolling horizon data
for idx in range(len(linear_data) - window):
    rolling_x[idx, :, 0] = linear_data[idx:idx + window]
    rolling_y[idx, :] = linear_data[idx + window]

print(rolling_x[:,:,0])
>>> [[0.05 0.1  0.15 0.2  0.25]
>>> [0.1  0.15 0.2  0.25 0.3 ]
>>> [0.15 0.2  0.25 0.3  0.35]
...
>>> [0.75 0.8  0.85 0.9  0.95]]

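Here rolling_y contains the next item in the sequence for each window. My understanding is that the data should be structured with the batch/sample dimension on the first axis, the sequence/window on the second axis, and the number of features on the last axis (1 in this case, since the data is univariate).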
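The (samples, window, features) layout described above can be checked with a small helper. This is only a sketch: make_windows is a hypothetical name that does not appear in the original post, and it uses plain NumPy indexing so that it should also run on older NumPy releases such as 1.18.

```python
import numpy as np


def make_windows(series, window):
    """Split a 1-D series into (samples, window, 1) inputs and (samples, 1) targets.

    Hypothetical helper for illustration; not part of the original post.
    """
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - window
    # Row i of idx holds the positions i, i+1, ..., i+window-1
    idx = np.arange(window)[None, :] + np.arange(n_samples)[:, None]
    x = series[idx][:, :, None]    # add the trailing feature axis
    y = series[window:][:, None]   # the point that follows each window
    return x, y


linear_data = 0.05 + 0.05 * np.arange(20)
rolling_x, rolling_y = make_windows(linear_data, window=5)
print(rolling_x.shape, rolling_y.shape)  # (15, 5, 1) (15, 1)
```

With 20 points and a window of 5 there are 15 samples, each paired with the single value that follows its window.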

From here I build a very simple model:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM

# Build the model
tf_model = Sequential()
tf_model.add(LSTM(
    units=32,
    input_shape=[window, 1]
))
tf_model.add(Dense(units=1))
tf_model.compile()

It compiles fine, but when I try to train it (tf_model.fit(rolling_x, rolling_y)), I get the following error:

    c:\users\____\documents\data_analytics\lstm-demo\env\lib\site-packages\tensorflow\python\keras\engine\training.py:806 train_function  *
        return step_function(self, iterator)
    c:\users\____\documents\data_analytics\lstm-demo\env\lib\site-packages\tensorflow\python\keras\engine\training.py:796 step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    c:\users\____\documents\data_analytics\lstm-demo\env\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:1211 run
        return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
    c:\users\____\documents\data_analytics\lstm-demo\env\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2585 call_for_each_replica
        return self._call_for_each_replica(fn, args, kwargs)
    c:\users\____\documents\data_analytics\lstm-demo\env\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2945 _call_for_each_replica
        return fn(*args, **kwargs)
    c:\users\____\documents\data_analytics\lstm-demo\env\lib\site-packages\tensorflow\python\keras\engine\training.py:789 run_step  **
        outputs = model.train_step(data)
    c:\users\____\documents\data_analytics\lstm-demo\env\lib\site-packages\tensorflow\python\keras\engine\training.py:756 train_step
        _minimize(self.distribute_strategy, tape, self.optimizer, loss,
    c:\users\____\documents\data_analytics\lstm-demo\env\lib\site-packages\tensorflow\python\keras\engine\training.py:2736 _minimize
        gradients = optimizer._aggregate_gradients(zip(gradients,  # pylint: disable=protected-access
    c:\users\____\documents\data_analytics\lstm-demo\env\lib\site-packages\tensorflow\python\keras\optimizer_v2\optimizer_v2.py:562 _aggregate_gradients
        filtered_grads_and_vars = _filter_grads(grads_and_vars)
    c:\users\____\documents\data_analytics\lstm-demo\env\lib\site-packages\tensorflow\python\keras\optimizer_v2\optimizer_v2.py:1270 _filter_grads
        raise ValueError("No gradients provided for any variable: %s." %

    ValueError: No gradients provided for any variable: ['lstm_3/lstm_cell_3/kernel:0', 'lstm_3/lstm_cell_3/recurrent_kernel:0', 'lstm_3/lstm_cell_3/bias:0', 'dense_3/kernel:0', 'dense_3/bias:0'].

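As @xdurch0 correctly pointed out, the training configuration (optimizer, loss, and metrics) must be specified when compiling the model. Calling compile() with no arguments leaves the loss as None, so there is nothing to differentiate and no gradients reach any of the variables, which is exactly what the error reports.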

For reference, compile accepts the following arguments for training:

compile(
    optimizer='rmsprop', loss=None, metrics=None, loss_weights=None,
    weighted_metrics=None, run_eagerly=None, **kwargs
)

For the benefit of the community, I am posting the complete working code here:

import tensorflow as tf
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM
# Sample data -- linear function
linear_data=0.05+0.05*np.array(range(20))
# Size of window in rolling horizon
window = 5
# Tensorflow expects 3D: i,j,k = samples, window/sequence data, 
# features (one since we're univariate)
rolling_x = np.zeros([len(linear_data)-window, window, 1])
rolling_y = np.zeros([len(linear_data)-window, 1])
# Populate the rolling horizon data
for idx in range(len(linear_data)-window):
    rolling_x[idx, :, 0] = linear_data[idx:idx+window]
    rolling_y[idx, :] = linear_data[idx+window]

print(rolling_x[:,:,0])
# Build the model
tf_model = Sequential()
tf_model.add(LSTM(
    units=32,
    input_shape=[window, 1]
))
tf_model.add(Dense(units=1))
tf_model.compile(optimizer='adam', loss='mae', metrics=['mae'])
tf_model.fit(rolling_x, rolling_y)

Output:

[[0.05 0.1  0.15 0.2  0.25]
[0.1  0.15 0.2  0.25 0.3 ]
[0.15 0.2  0.25 0.3  0.35]
[0.2  0.25 0.3  0.35 0.4 ]
[0.25 0.3  0.35 0.4  0.45]
[0.3  0.35 0.4  0.45 0.5 ]
[0.35 0.4  0.45 0.5  0.55]
[0.4  0.45 0.5  0.55 0.6 ]
[0.45 0.5  0.55 0.6  0.65]
[0.5  0.55 0.6  0.65 0.7 ]
[0.55 0.6  0.65 0.7  0.75]
[0.6  0.65 0.7  0.75 0.8 ]
[0.65 0.7  0.75 0.8  0.85]
[0.7  0.75 0.8  0.85 0.9 ]
[0.75 0.8  0.85 0.9  0.95]]
1/1 [==============================] - 0s 2ms/step - loss: 0.6889 - mae: 0.6889
<tensorflow.python.keras.callbacks.History at 0x7fa76e2b05f8>
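
The run above trains for a single epoch only, so the reported MAE of 0.6889 is still far from a useful fit. As a rough sketch of the next step (predicting the point that follows the last window), the snippet below trains for more epochs and calls predict. The epoch count of 50 and the use of tf.keras.Input as the first layer are assumptions made here for illustration; they do not come from the original answer.

```python
import numpy as np
import tensorflow as tf

window = 5
linear_data = 0.05 + 0.05 * np.arange(20)
n_samples = len(linear_data) - window

# Same rolling-horizon layout as above: (samples, window, features)
rolling_x = np.stack(
    [linear_data[i:i + window] for i in range(n_samples)]
)[:, :, None]
rolling_y = linear_data[window:][:, None]

model = tf.keras.Sequential([
    tf.keras.Input(shape=(window, 1)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1),
])
# An explicit loss is what makes gradients available during fit()
model.compile(optimizer="adam", loss="mae")

epochs = 50  # assumed value; tune for your own data
history = model.fit(rolling_x, rolling_y, epochs=epochs, verbose=0)

# Predict the value that follows the last observed window
last_window = linear_data[-window:].reshape(1, window, 1)
prediction = model.predict(last_window, verbose=0)
print(prediction.shape)  # (1, 1)
```

history.history["loss"] holds one value per epoch, which gives a quick way to check whether the loss is actually decreasing.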
