How to clear GPU memory when loading another dataset



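I am training a CNN and comparing two kinds of input data: audio spectrograms of 3-second and 30-second clips. As a result, the spectrogram size differs between experiments.

I use this to get the data: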

import tensorflow as tf

# DATA_PATH is a path object defined elsewhere in the project.
def get_data(data_type, batch_size):
    assert data_type in ['3s', '30s'], "data_type should be either 3s or 30s"
    if data_type == '3s':
        audio_dir = DATA_PATH / 'genres_3_seconds'
        max_signal_length_to_crop = 67_500
    elif data_type == '30s':
        audio_dir = DATA_PATH / 'genres_original'
        max_signal_length_to_crop = 660_000
    input_shape = (max_signal_length_to_crop, 1)
    train_ds, val_ds = tf.keras.utils.audio_dataset_from_directory(
        directory=audio_dir,
        batch_size=batch_size,
        validation_split=0.2,
        output_sequence_length=max_signal_length_to_crop,
        subset='both',
        label_mode='categorical'
    )
    # Split the validation set in half: one shard for testing, one for validation.
    test_ds = val_ds.shard(num_shards=2, index=0)
    val_ds = val_ds.shard(num_shards=2, index=1)
    return train_ds, val_ds, test_ds, input_shape

I use this function to create the model:

import kapre.composed
import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, Dense, Flatten, MaxPooling2D

def get_model(model_type, data_type, input_shape):
    if data_type == '3s':
        WIN_LENGTH = 1024 * 2
        FRAME_STEP = int(WIN_LENGTH / 4)  # hop is a quarter of the window (/ 4, not / 2)
    elif data_type == '30s':
        WIN_LENGTH = 1024 * 4
        FRAME_STEP = int(WIN_LENGTH / 2)  # hop is half of the window
    specrtogram_layer = kapre.composed.get_melspectrogram_layer(
        input_shape=input_shape, win_length=WIN_LENGTH, hop_length=FRAME_STEP)
    model = Sequential([
        specrtogram_layer,
        *model_dict[model_type],
        Dense(units=10, activation='softmax', name='last_dense')
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=START_LR),
        loss=tf.keras.losses.CategoricalCrossentropy(),
        metrics=['accuracy'],
    )
    return model

# Layer instances are created once, when this dictionary is defined.
model_dict = {
    'CNN_Basic': [
        Conv2D(filters=8, kernel_size=3, activation='relu'),
        MaxPooling2D(2),
        Conv2D(filters=16, kernel_size=3, activation='relu'),
        MaxPooling2D(2),
        Conv2D(filters=32, kernel_size=3, activation='relu'),
        MaxPooling2D(2),
        Flatten(),
        Dense(units=128, activation='relu'),
    ],
    ...
}

I run several experiments with different architectures in a loop. This is my training loop:

for data_type in ['3s', '30s']:
    train_ds, val_ds, test_ds, input_shape = get_data(data_type=data_type, batch_size=30)
    for model_type in ['CNN_Basic', ...]:
        model = get_model(model_type, input_shape=input_shape, data_type=data_type)
        model.fit(train_ds, epochs=epochs, validation_data=val_ds)

The error I get:

Traceback (most recent call last):
  File "...\lib\site-packages\tensorflow\python\trackable\base.py", line 205, in _method_wrapper
    result = method(self, *args, **kwargs)
  File "...\lib\site-packages\keras\utils\traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "...\lib\site-packages\tensorflow\python\framework\ops.py", line 1969, in _create_c_op
    raise ValueError(e.message)
ValueError: Exception encountered when calling layer "dense" (type Dense).
Dimensions must be equal, but are 17024 and 6272 for '{{node dense/MatMul}} = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false](Placeholder, dense/MatMul/ReadVariableOp)' with input shapes: [?,17024], [6272,128].
Call arguments received by layer "dense" (type Dense):
  • inputs=tf.Tensor(shape=(None, 17024), dtype=float32)

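I think this is caused by the dataset, because I only get this error when I run an experiment with the 3-second spectrograms after one with the 30-second spectrograms. I create a new model every time, and to load the data I call tf.keras.utils.audio_dataset_from_directory and assign the result to the same variables on the next loop iteration.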

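I figured it out. You can't create a model, or part of a model, once as in my example and expect it to behave like a brand-new one each time you take it out of the dictionary. I fixed it by writing a function that is called every time a new model is built and returns fresh layer instances. With the dictionary I did get the "right" layers, but they had already been used: after fitting, their state in memory had changed, and they expected one particular input shape when I tried to run them again on the other dataset.

Lesson learned: if you intend to reuse layer definitions across different models, don't store the layer instances in variables (lists, dicts, etc.). Here is a snippet of the fixed code: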
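The two sizes in the error message can be traced back to the two clip lengths with a little arithmetic. The sketch below is a hand calculation, not output from Keras or kapre. It assumes an STFT without end padding, 128 mel bins, and unpadded ('valid') 3x3 convolutions each followed by 2x2 max pooling; the helper names are made up for illustration. Under those assumptions it reproduces both numbers from the traceback, which suggests the Dense layer had its weights shaped for one clip length and was then fed the other.

```python
def stft_frames(n_samples, win_length, hop_length):
    # Number of STFT frames, assuming no padding at the end of the signal.
    return 1 + (n_samples - win_length) // hop_length


def after_conv_pool(size, n_blocks=3, kernel=3, pool=2):
    # One axis after n_blocks of: unpadded Conv2D(kernel), then MaxPooling2D(pool).
    for _ in range(n_blocks):
        size = (size - (kernel - 1)) // pool
    return size


def flatten_size(n_samples, win_length, hop_length, n_mels=128, filters=32):
    # Flatten() output = time steps * mel bins * channels of the last Conv2D.
    time_steps = after_conv_pool(stft_frames(n_samples, win_length, hop_length))
    mel_bins = after_conv_pool(n_mels)  # n_mels=128 is an assumed default
    return time_steps * mel_bins * filters


size_3s = flatten_size(67_500, win_length=2048, hop_length=512)
size_30s = flatten_size(660_000, win_length=4096, hop_length=2048)
print(size_3s, size_30s)  # 6272 17024
```

The kernel shape [6272, 128] in the traceback matches the 3-second input, while the incoming tensor of width 17024 matches the 30-second input.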

def get_internal_model(model_type):
    if model_type == 'CNN_Basic':
        internal_layers = [
            Conv2D(filters=8, kernel_size=3, activation='relu'),
            MaxPooling2D(2),
            Conv2D(filters=16, kernel_size=3, activation='relu'),
            MaxPooling2D(2),
            Conv2D(filters=32, kernel_size=3, activation='relu'),
            MaxPooling2D(2),
            Flatten(),
            Dense(units=128, activation='relu'),
        ]
    ...
    return internal_layers

def get_model(model_type, data_type, input_shape):
    specrtogram_layer = get_spectrogram_layer(input_shape, data_type)
    model = Sequential([
        specrtogram_layer,
        *get_internal_model(model_type),  # HERE: fresh layer instances on every call
        Dense(units=10, activation='softmax', name='last_dense')
    ])
    ...
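
The difference between the dictionary and the function can be reproduced without TensorFlow. The following is a toy sketch: `LazyDense` is a hypothetical stand-in that, like the layer in the traceback, fixes its weight shape on the first call. It is not the Keras implementation, and `SHARED`, `new_layers` and `run` are illustrative names only.

```python
class LazyDense:
    """Toy stand-in for a dense layer whose weights are shaped on first use."""

    def __init__(self, units):
        self.units = units
        self.kernel_shape = None  # unknown until the layer sees its first input

    def __call__(self, n_features):
        if self.kernel_shape is None:
            # First call: "build" the weights for this input width.
            self.kernel_shape = (n_features, self.units)
        elif self.kernel_shape[0] != n_features:
            raise ValueError(
                f"Dimensions must be equal, but are {n_features} "
                f"and {self.kernel_shape[0]}"
            )
        return self.units


# Instances created once and shared by every model that reads the dict.
SHARED = {'CNN_Basic': [LazyDense(128)]}


def new_layers(model_type):
    # Factory: every call returns brand-new, unbuilt instances.
    if model_type == 'CNN_Basic':
        return [LazyDense(128)]
    raise KeyError(model_type)


def run(layers, n_features):
    # Pass an input of width n_features through the layers in order.
    for layer in layers:
        n_features = layer(n_features)
    return n_features


run(SHARED['CNN_Basic'], 6272)       # builds the shared layer for width 6272
try:
    run(SHARED['CNN_Basic'], 17024)  # same instance, different width
    shared_error = None
except ValueError as err:
    shared_error = str(err)
print(shared_error)

run(new_layers('CNN_Basic'), 6272)
fresh_result = run(new_layers('CNN_Basic'), 17024)  # fresh instance, no error
print(fresh_result)
```

If a dictionary is still convenient as a registry, one option is to store callables that build the layers (for example functions or lambdas) instead of the layer instances themselves, and call them inside `get_model`.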
