Training crashes when saving the model: "tensorflow.GraphDef was modified concurrently during serialization"



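I am currently trying to train a model, and my input pipeline is built as described in the answer here. I want to save my model after every epoch, but after training for a few epochs the training crashes. I have read that this happens because the input is added to the graph as a constant tensor. A solution suggested here is to use tf.placeholder. Unfortunately, that does not solve the problem for me. The input pipeline looks like this: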

....
filenames = [P_1]
dataset = tf.data.TFRecordDataset(filenames)

def _parse_function(example_proto):
    # Feature spec for one serialized example
    keys_to_features = {'data': tf.VarLenFeature(tf.float32)}
    parsed_features = tf.parse_single_example(example_proto, keys_to_features)
    return tf.sparse_tensor_to_dense(parsed_features['data'])

# Parse the record into tensors.
dataset = dataset.map(_parse_function)
# Shuffle the dataset
dataset = dataset.shuffle(buffer_size=1000)
# Repeat the input indefinitely
dataset = dataset.repeat()
# Generate batches
dataset = dataset.batch(Batch_size)
# Create a one-shot iterator
iterator = dataset.make_one_shot_iterator()
data = iterator.get_next()
....
for i in range(epochs):
    for ii in range(iteration):
        image = sess.run(data)
        ....
    # Save a checkpoint after each epoch
    saver.save(sess, 'filename')

The error message is as follows:

[libprotobuf FATAL external/protobuf_archive/src/google/protobuf/message_lite.cc:68] CHECK failed: (byte_size_before_serialization) == (byte_size_after_serialization): tensorflow.GraphDef was modified concurrently during serialization.
terminate called after throwing an instance of 'google::protobuf::FatalException'  
what():  CHECK failed: (byte_size_before_serialization) == (byte_size_after_serialization): tensorflow.GraphDef was modified concurrently during serialization.
Aborted

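The problem seems to be in _parse_function. Make sure the parser reads the records the same way they were written when the TFRecord file was created, for example that both sides use the same data types.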
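To illustrate what "the same way" means here, the following is a minimal sketch of a writer and a parser that agree on the feature key and data type. It assumes TensorFlow 2.x with eager execution (the question itself uses the 1.x session API), and the names FEATURE_KEY, write_records, parse_record and roundtrip are made up for this example. It only shows the matching-types idea and is not a confirmed fix for the crash above.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical feature key for this sketch; it mirrors 'data' in the question.
FEATURE_KEY = "data"


def write_records(path, rows):
    """Write each row of floats as one tf.train.Example with a float_list."""
    with tf.io.TFRecordWriter(path) as writer:
        for row in rows:
            feature = tf.train.Feature(float_list=tf.train.FloatList(value=row))
            example = tf.train.Example(
                features=tf.train.Features(feature={FEATURE_KEY: feature}))
            writer.write(example.SerializeToString())


def parse_record(example_proto):
    """Parse with the same key and dtype (float32) that the writer used."""
    spec = {FEATURE_KEY: tf.io.VarLenFeature(tf.float32)}
    parsed = tf.io.parse_single_example(example_proto, spec)
    return tf.sparse.to_dense(parsed[FEATURE_KEY])


def roundtrip(rows):
    """Write rows to a temporary TFRecord file and read them back."""
    path = os.path.join(tempfile.mkdtemp(), "sample.tfrecord")
    write_records(path, rows)
    dataset = tf.data.TFRecordDataset([path]).map(parse_record)
    return [tensor.numpy().tolist() for tensor in dataset]
```

Calling roundtrip([[1.0, 2.0], [3.5]]) writes two variable-length records and parses them back. If the file had instead been written with int64_list, the spec in parse_record would need tf.int64 rather than tf.float32.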
