While the MirroredStrategy IndexError: pop from empty list
is by now notorious and has many possible causes, such as those described in the following questions:
- MirroredStrategy IndexError caused by K.clear_session()
- MirroredStrategy IndexError in AutoKeras
- MirroredStrategy IndexError when training from checkpoints
and so on, none of them apply to my use case.
In my use case, I am using a Keras Sequence object to generate the training input, since I am working with a large dataset (which does not fit in RAM) with a single known positive class and unknown negatives.
Following the tutorials provided in the Keras documentation and the TensorFlow documentation, my code looks like the following:
my_training_sequence = MySequenceObject()

if tf.config.list_physical_devices('GPU'):
    strategy = tf.distribute.MirroredStrategy(devices)
else:
    # Use the Default Strategy
    strategy = tf.distribute.get_strategy()

with strategy.scope():
    model = CreateMyKerasModel()
    # While in the TensorFlow documentation the compilation step
    # is shown OUTSIDE the scope, in the Keras one it happens
    # within the scope.
    # I have found out that it is NECESSARY to place it inside the scope,
    # as the Keras metrics need to be in the same strategy scope as the model
    # to work properly.
    model.compile(...)

# Then, OUTSIDE of the scope, run the fit,
# which causes the IndexError.
model.fit(my_training_sequence)
Any ideas on how to deal with this?
After much suffering, I realized that in the Keras documentation they use TensorFlow Dataset objects.
Now, normal inputs such as vectors are converted into Datasets during the fit and therefore do not cause problems, but at the moment Keras does not support the automatic conversion of Keras Sequences into Datasets under the hood. While I do not know why that is, fortunately it is relatively easy to write a method that converts a Sequence into a Dataset.
Unfortunately, it depends on the version of TensorFlow you are using: in more recent versions you want to use TensorSpec objects, while in older ones a combination of TensorFlow data types and TensorShape will do.
In the following example I show a high-level approach to writing a Keras Sequence class that can be converted into a Dataset. Afterwards, I will link all the Keras Sequences I have already implemented in this way, as examples for posterity (or for myself, once I forget some of the details of this devilish affair).
import tensorflow as tf
import numpy as np
from packaging import version
from validate_version_code import validate_version_code


def tensorflow_version_is_higher_or_equal_than(tensorflow_version: str) -> bool:
    """Return whether the installed TensorFlow version is higher than or equal to the provided one.

    Parameters
    ----------------------
    tensorflow_version: str,
        The version of TensorFlow to check against.

    Raises
    ----------------------
    ValueError,
        If the provided version code is not a valid one.

    Returns
    ----------------------
    Boolean representing if the installed TensorFlow version is higher than or equal to the given one.
    """
    if not validate_version_code(tensorflow_version):
        raise ValueError(
            (
                "The provided TensorFlow version code `{}` "
                "is not a valid version code."
            ).format(tensorflow_version)
        )
    return version.parse(tf.__version__) >= version.parse(tensorflow_version)
class ExampleSequence:
    """Keras Sequence convertible into a TensorFlow Dataset."""

    def __init__(
        self,
        batches_per_epoch: int,
        batch_size: int = 32,
        # Your other parameters go here
    ):
        """
        Parameters
        --------------------------------
        batches_per_epoch: int
            The number of batches within an epoch.
        batch_size: int = 32
            Size of the batches to generate,
            if the size is expected to be CONSTANT;
            otherwise use None if some batches have a different size.
        """
        self._batch_size = batch_size
        self._batches_per_epoch = batches_per_epoch
        # Initialize the index of the batch for the Dataset calls
        self._current_index = 0
        # Your other parameters go here

    def __call__(self):
        """Return next batch using an infinite generator model."""
        self._current_index = (self._current_index + 1) % self._batches_per_epoch
        return self[self._current_index]

    def into_dataset(self) -> tf.data.Dataset:
        """Return dataset generated out of the current sequence instance.

        Implementation details
        ---------------------------------
        This method handles the conversion of this Keras Sequence into
        a TensorFlow dataset, also handling the proper dispatching according
        to what version of TensorFlow is installed in this system.

        Returns
        ----------------------------------
        Dataset to be used for the training of a model
        """
        #################################################################
        # Handling kernel creation when TensorFlow is a modern version. #
        #################################################################
        if tensorflow_version_is_higher_or_equal_than("2.5.0"):
            return tf.data.Dataset.from_generator(
                self,
                output_signature=(
                    (
                        tf.TensorSpec(
                            shape=(self._batch_size, 10),
                            dtype=tf.uint32
                        ),
                    ),
                    tf.TensorSpec(
                        shape=(self._batch_size,),
                        dtype=tf.bool
                    )
                )
            )

        return tf.data.Dataset.from_generator(
            self,
            output_types=(
                (tf.uint32, ),
                tf.bool
            ),
            output_shapes=(
                (tf.TensorShape([self._batch_size, 10]),),
                tf.TensorShape([self._batch_size, ]),
            )
        )

    def __getitem__(self, idx: int):
        """Return batch corresponding to given index.

        Parameters
        ---------------
        idx: int,
            Index corresponding to batch to be returned.

        Returns
        ---------------
        Tuple containing X and Y numpy arrays corresponding to given batch index.
        """
        X = np.random.randint(low=0, high=100, size=(self._batch_size, 10), dtype=np.uint32)
        y = np.random.randint(low=0, high=2, size=(self._batch_size, )).astype(bool)
        # Please do observe that the return type
        # has multiple layers of tuple wrapping, and they are ALL needed!
        # It is weird, but it is the only way this thing worked.
        return (((X, ), y,),)
Then, when you run the fit, you can use:
model.fit(my_training_sequence.into_dataset())
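For completeness, here is a minimal end-to-end sketch of how the converted Dataset slots into the MirroredStrategy setup from the question. The create_model function, its layers and the compile/fit hyperparameters are hypothetical placeholders (stand-ins for CreateMyKerasModel above); only ExampleSequence and into_dataset come from the code shown previously.

import tensorflow as tf

def create_model() -> tf.keras.Model:
    # Hypothetical model matching the (batch_size, 10) uint32 features
    # and boolean labels produced by ExampleSequence above.
    inputs = tf.keras.layers.Input(shape=(10,), dtype=tf.uint32)
    # Cast the integer features to floats before the dense layers.
    hidden = tf.keras.layers.Lambda(lambda t: tf.cast(t, tf.float32))(inputs)
    hidden = tf.keras.layers.Dense(16, activation="relu")(hidden)
    outputs = tf.keras.layers.Dense(1, activation="sigmoid")(hidden)
    return tf.keras.Model(inputs=inputs, outputs=outputs)

my_training_sequence = ExampleSequence(batches_per_epoch=100)

if tf.config.list_physical_devices('GPU'):
    strategy = tf.distribute.MirroredStrategy()
else:
    # Use the Default Strategy
    strategy = tf.distribute.get_strategy()

with strategy.scope():
    model = create_model()
    # As discussed in the question, compile INSIDE the scope so that the
    # Keras metrics live in the same strategy scope as the model.
    model.compile(
        optimizer="adam",
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )

# The fit now receives a tf.data.Dataset instead of the raw Sequence,
# which is what avoids the IndexError under MirroredStrategy.
model.fit(
    my_training_sequence.into_dataset(),
    epochs=2,
)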