Keras: How to save the training history attribute of the History object



In Keras, we can assign the return value of model.fit to a history variable, as follows:

history = model.fit(X_train, y_train,
                    batch_size=batch_size,
                    epochs=nb_epoch,  # this argument was named nb_epoch in Keras 1.x
                    validation_data=(X_test, y_test))

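Now, how do I save the history attribute of the History object to a file for later use (e.g., to plot accuracy or loss against epochs)?

What I use is the following:

import pickle

with open('/trainHistoryDict', 'wb') as file_pi:
    pickle.dump(history.history, file_pi)

This way I save the history as a dictionary in case I want to plot the loss or accuracy later. When you want to load the history again, you can use: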

with open('/trainHistoryDict', "rb") as file_pi:
    history = pickle.load(file_pi)
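
For anyone who wants to try the round trip without training a model, here is a minimal sketch. It assumes a plain dict with made-up numbers standing in for history.history, and it writes to a temporary directory rather than the filesystem root (both are my assumptions, not part of the answer above).

```python
import os
import pickle
import tempfile

# Stand-in for history.history; the numbers are made up for illustration.
history_dict = {
    "loss": [0.9, 0.6, 0.4],
    "val_loss": [1.0, 0.7, 0.65],
}

# Write to a temporary directory instead of the filesystem root.
path = os.path.join(tempfile.mkdtemp(), "trainHistoryDict")

with open(path, "wb") as file_pi:
    pickle.dump(history_dict, file_pi)

with open(path, "rb") as file_pi:
    loaded = pickle.load(file_pi)

# The loaded object is an ordinary dict of lists, one entry per epoch.
print(loaded == history_dict)
print(len(loaded["loss"]))
```

Note that what comes back is the dict itself, not a History object, so you index it as loaded['loss'] rather than loaded.history['loss'].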

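Why choose pickle over json?

A comment under this answer accurately points out:

[Storing the history as JSON] no longer works in TensorFlow Keras. I ran into the following problem: TypeError: Object of type 'float32' is not JSON serializable.

There are ways to tell json how to encode numpy objects, which you can learn about from other questions, so there is nothing wrong with using json in this case; it is just more complicated than simply dumping to a pickle file.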
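
One way to teach json about numpy values is a custom encoder. The sketch below is only an illustration: NumpyEncoder is a hypothetical name, and the history values are made up. It relies on the standard json.JSONEncoder.default hook, which json calls for any object it cannot serialize on its own.

```python
import json

import numpy as np


class NumpyEncoder(json.JSONEncoder):
    """Hypothetical helper: convert NumPy values to built-in Python types."""

    def default(self, obj):
        if isinstance(obj, np.floating):
            return float(obj)
        if isinstance(obj, np.integer):
            return int(obj)
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        # Anything else falls through to the standard TypeError.
        return super().default(obj)


# Made-up history with float32 values, which plain json.dumps rejects.
history_dict = {"loss": [np.float32(0.5), np.float32(0.25)]}

text = json.dumps(history_dict, cls=NumpyEncoder)
restored = json.loads(text)
print(restored)
```

After the round trip the values are ordinary Python floats, so the restored dict can be dumped again without the encoder.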

Another way:

Since history.history is a dict, you can also convert it to a pandas DataFrame object, which can then be saved to suit your needs.

Step by step:

import pandas as pd
# assuming you stored your model.fit results in a 'history' variable:
history = model.fit(x_train, y_train, epochs=10)
# convert the history.history dict to a pandas DataFrame:     
hist_df = pd.DataFrame(history.history) 
# save to json:  
hist_json_file = 'history.json' 
with open(hist_json_file, mode='w') as f:
    hist_df.to_json(f)
# or save to csv: 
hist_csv_file = 'history.csv'
with open(hist_csv_file, mode='w') as f:
    hist_df.to_csv(f)
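
To read the CSV back into a dict of lists later, something like the following sketch should work. It assumes made-up history values and uses an in-memory buffer instead of history.csv so it can run on its own. Because to_csv writes the DataFrame index as the first column, index_col=0 is passed when reading.

```python
import io

import pandas as pd

# Made-up stand-in for history.history.
history_dict = {"loss": [0.5, 0.25, 0.125], "accuracy": [0.5, 0.75, 0.875]}

# Write the CSV into an in-memory buffer (a file on disk works the same way).
buffer = io.StringIO()
pd.DataFrame(history_dict).to_csv(buffer)

# Read it back; the first column is the index (the epoch number) that to_csv wrote.
buffer.seek(0)
restored_df = pd.read_csv(buffer, index_col=0)

# Convert back to the dict-of-lists shape of history.history.
restored = restored_df.to_dict(orient="list")
print(restored)
```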

The simplest way:

Saving:

np.save('my_history.npy',history.history)

Loading:

history=np.load('my_history.npy',allow_pickle=True).item()

history is then a dictionary, and you can retrieve all the values you need using its keys.

The model history can be saved to a file as follows:

import json
hist = model.fit(X_train, y_train, epochs=5, batch_size=batch_size,validation_split=0.1)
with open('file.json', 'w') as f:
    json.dump(hist.history, f)

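The history attribute of the History object is a dictionary that holds the different training metrics for each training epoch. So, for example, history.history['loss'][99] returns the model's loss in the 100th training epoch. To save it, you can pickle this dictionary or simply save its individual lists to appropriate files.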
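
As a small illustration of that indexing, here is a sketch that finds the epoch with the lowest validation loss. The dict and its numbers are made up, and the 'val_loss' key is assumed to be present (it only exists when validation data is passed to fit).

```python
# Made-up stand-in for history.history.
history_dict = {
    "loss": [0.9, 0.6, 0.4, 0.3],
    "val_loss": [1.0, 0.7, 0.65, 0.8],
}

val_loss = history_dict["val_loss"]

# List positions are 0-based, so position i holds the value for epoch i + 1.
best_index = min(range(len(val_loss)), key=val_loss.__getitem__)
best_epoch = best_index + 1

print(best_epoch, val_loss[best_index])
```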

I ran into the problem that the values inside the lists Keras returns cannot be serialized to JSON. So I wrote these two handy functions for my own use.

import json, codecs
import numpy as np

def saveHist(path, history):
    # Copy history.history, converting NumPy values to plain Python types.
    new_hist = {}
    for key, values in history.history.items():
        if isinstance(values, np.ndarray):
            new_hist[key] = values.tolist()
        elif isinstance(values, list) and values and isinstance(values[0], np.floating):
            new_hist[key] = list(map(float, values))
        else:
            new_hist[key] = values

    print(new_hist)
    with codecs.open(path, 'w', encoding='utf-8') as file:
        json.dump(new_hist, file, separators=(',', ':'), sort_keys=True, indent=4)

def loadHist(path):
    with codecs.open(path, 'r', encoding='utf-8') as file:
        n = json.loads(file.read())
    return n

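saveHist just needs the path where the JSON file should be saved and the History object returned by the Keras fit or fit_generator method.

I'm sure there are many ways to do this, but I fiddled around and came up with a version of my own.

First, a custom callback that grabs and updates the history at the end of every epoch. In there I also have a callback to save the model. Both of these are handy.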

# assuming tf.keras; with standalone Keras, import from keras.callbacks instead
from tensorflow.keras.callbacks import Callback, ModelCheckpoint

class LossHistory(Callback):

    # https://stackoverflow.com/a/53653154/852795
    def on_epoch_end(self, epoch, logs = None):
        new_history = {}
        for k, v in (logs or {}).items(): # compile new history from logs
            new_history[k] = [float(v)] # convert values into lists of plain floats for JSON
        current_history = loadHist(history_filename) # load history from current training
        current_history = appendHist(current_history, new_history) # append the logs
        saveHist(history_filename, current_history) # save history from current training

model_checkpoint = ModelCheckpoint(model_filename, verbose = 0) # saves after every epoch by default
history_checkpoint = LossHistory()
callbacks_list = [model_checkpoint, history_checkpoint]

Second, here are some 'helper' functions that do exactly what their names say. They are all called from the LossHistory() callback.

# https://stackoverflow.com/a/54092401/852795
import json, codecs
import os

def saveHist(path, history):
    with codecs.open(path, 'w', encoding='utf-8') as f:
        json.dump(history, f, separators=(',', ':'), sort_keys=True, indent=4)

def loadHist(path):
    n = {} # set history to empty
    if os.path.exists(path): # reload history if it exists
        with codecs.open(path, 'r', encoding='utf-8') as f:
            n = json.loads(f.read())
    return n

def appendHist(h1, h2):
    if h1 == {}:
        return h2
    else:
        dest = {}
        for key, value in h1.items():
            dest[key] = value + h2[key]
        return dest

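After that, all you need is to set history_filename to something like data/model-history.json and model_filename to something like data/model.h5. One final tweak, to make sure you don't mess up your history at the end of training (assuming you stop and start, and keep the callbacks in place), is to do this:

new_history = model.fit(X_train, y_train,
                     batch_size = batch_size,
                     epochs = nb_epoch,  # this argument was named nb_epoch in Keras 1.x
                     validation_data=(X_test, y_test),
                     callbacks=callbacks_list)
history = appendHist(history, new_history.history)

Whenever you want, history = loadHist(history_filename) gets your history back.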

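The funkiness comes from the JSON and the lists, but I wasn't able to get it to work without converting the values by iterating. Anyway, I know this works because I have been cranking on it for days now. The pickle.dump answer at https://stackoverflow.com/a/444674337/852795 might be better, but I don't know what that is. If I missed anything here or you can't get it to work, let me know.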
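
One thing to watch with a merge like appendHist: it only walks the keys of the first dict, so a metric that is missing from the new logs raises a KeyError, and a metric that only appears in the new logs is dropped. Below is a sketch of a more tolerant variant. merge_histories is a hypothetical name, not part of the answer above, and the numbers are made up.

```python
def merge_histories(old, new):
    """Hypothetical variant of appendHist that keeps keys present in either dict."""
    # Copy the old lists so the inputs are not modified in place.
    merged = {key: list(values) for key, values in old.items()}
    for key, values in new.items():
        merged.setdefault(key, []).extend(values)
    return merged


# Made-up histories from two training runs.
first_run = {"loss": [0.9, 0.6]}
second_run = {"loss": [0.4], "val_loss": [0.5]}

combined = merge_histories(first_run, second_run)
print(combined)
```

The trade-off is that a metric which only shows up in a later run ends up with a shorter list than the others, so its positions no longer line up with epoch numbers.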

You can save the history attribute of tf.keras.callbacks.History in .txt form:

with open("./result_model.txt",'w') as f:
    for k in history.history.keys():
        print(k,file=f)
        for i in history.history[k]:
            print(i,file=f)
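
Reading that text file back takes a little parsing, because metric names and values share the same column. A rough sketch follows, assuming every line is either a metric name or a single number, and that no metric name itself parses as a number. load_history_txt is a hypothetical helper and the sample lines are made up.

```python
def load_history_txt(lines):
    """Hypothetical parser: rebuild a dict of lists from name/value lines."""
    history = {}
    current_key = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        try:
            value = float(line)
        except ValueError:
            # Not a number, so this line starts a new metric.
            current_key = line
            history[current_key] = []
        else:
            history[current_key].append(value)
    return history


# Made-up file content in the layout produced by the loop above.
sample_lines = ["loss", "0.5", "0.25", "accuracy", "0.75", "0.875"]
parsed = load_history_txt(sample_lines)
print(parsed)
```

With a real file you would pass the open file object, for example load_history_txt(open('./result_model.txt')).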

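The answers above are useful for saving the history at the end of the training process. If you want to save the history during training, the CSVLogger callback will help.

The code below saves the model weights and the training history as a table in the file log.csv.

import tensorflow as tf

# checkpoint_path, X_train and y_train are assumed to be defined elsewhere
model_cb = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path)
history_cb = tf.keras.callbacks.CSVLogger('./log.csv', separator=",", append=False)
history = model.fit(X_train, y_train, callbacks=[model_cb, history_cb])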
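
To turn the log back into a dict of lists for plotting, the standard csv module is enough. This sketch assumes the log has a header row with an epoch column followed by one column per metric. The sample text is made up and read from memory rather than from ./log.csv.

```python
import csv
import io

# Made-up log content; a real run would read ./log.csv instead.
sample_log = "epoch,accuracy,loss\n0,0.5,0.9\n1,0.75,0.6\n"

history = {}
for row in csv.DictReader(io.StringIO(sample_log)):
    for key, value in row.items():
        # Every cell is read as text, so convert it to a number.
        history.setdefault(key, []).append(float(value))

print(history["loss"])
```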

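Here is a callback that pickles the logs into a file. Provide the model file path when instantiating the callback object; this creates an associated file: given the model path '/home/user/model.h5', the pickle path is '/home/user/model_history_pickle'. After the model is reloaded, the callback continues from the epoch at which it left off.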
 
    import os
    import re
    import pickle
    #
    from tensorflow.keras.callbacks import Callback
    from tensorflow.keras import backend as K
    class PickleHistoryCallback(Callback):
        def __init__(self, path_file_model, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self.__path_file_model = path_file_model
            #
            self.__path_file_history_pickle = None
            self.__history = {}
            self.__epoch = 0
            #
            self.__setup()
        #
        def __setup(self):
            self.__path_file_history_pickle = re.sub(r'\.[^.]*$', '_history_pickle', self.__path_file_model)
            #
            if (os.path.isfile(self.__path_file_history_pickle)):
                with open(self.__path_file_history_pickle, 'rb') as fd:
                    self.__history = pickle.load(fd)
                    # Start from last epoch
                    self.__epoch = self.__history['e'][-1]
            #
            else:
                print("Pickled history file unavailable; the following pickled history file creation will occur after the first training epoch:\n\t{}".format(
                    self.__path_file_history_pickle))
        #
        def __update_history_file(self):
            with open(self.__path_file_history_pickle, 'wb') as fd:
                pickle.dump(self.__history, fd)
        #
        def on_epoch_end(self, epoch, logs=None):
            self.__epoch += 1
            logs = logs or {}
            #
            logs['e'] = self.__epoch
            logs['lr'] = K.get_value(self.model.optimizer.lr)
            #
            for k, v in logs.items():
                self.__history.setdefault(k, []).append(v)
            #
            self.__update_history_file()
