Dependent hyperparameters with Keras Tuner



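My goal is to tune over the possible network architectures that meet the following criteria:

  1. Layer 1 can have any number of hidden units from this list: [32, 64, 128, 256, 512]

The number of hidden units to explore for each remaining layer should then always depend on the selection made in the layer above it, specifically:

  1. Layer 2 can have the same number of units as layer 1, or half as many
  2. Layer 3 can have the same number of units as layer 2, or half as many
  3. Layer 4 can have the same number of units as layer 3, or half as many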
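To make the intended search space concrete, the rules above can be enumerated in plain Python. This is only a sketch: the helper name `valid_architectures` is made up for illustration, it does not use Keras Tuner, and it leaves out the extra floor of 8 units that the question's code applies to layer 4.

```python
def valid_architectures(first_layer_options, max_layers):
    """Yield every tuple of layer widths allowed by the 'same or half' rule."""

    def extend(widths):
        # A network may stop at any depth from 1 to max_layers.
        yield tuple(widths)
        if len(widths) < max_layers:
            previous = widths[-1]
            # The next layer is either half as wide or equally wide.
            for next_width in (previous // 2, previous):
                yield from extend(widths + [next_width])

    for first in first_layer_options:
        yield from extend([first])


archs = list(valid_architectures([32, 64, 128, 256, 512], max_layers=4))
print(len(archs))                    # 75
print((256, 128, 64, 32) in archs)   # True
print((256, 64) in archs)            # False: 64 is a quarter of 256
```

Each first-layer width contributes 1 + 2 + 4 + 8 = 15 architectures (depths 1 to 4), which gives 75 in total. That is the space the tuner is expected to cover.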

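The way I currently have it implemented, the hp.Choice options for layers 2, 3 and 4 are never updated after they are established for the first time.

For example, suppose that on the first pass of the tuner num_layers = 4, which means all four layers get created. If layer 1 selects 256 hidden units, the options become:

Layer 2 --> [128, 256]

Layer 3 --> [64, 128]

Layer 4 --> [32, 64]

Layers 2, 3 and 4 stay stuck with these choices in every following iteration, rather than updating to adapt to future selections for layer 1.

This means that in later iterations, when the number of hidden units in layer 1 changes, the options for layers 2, 3 and 4 no longer meet the intended goal, in which each subsequent layer has either the same number of hidden units as the previous layer or half as many.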

    # Imports needed by the function below.
    import tensorflow as tf
    from tensorflow.keras import layers


    def build_and_tune_model(hp, train_ds, normalize_features, ohe_features, max_tokens, passthrough_features):

        all_inputs, encoded_features = get_all_preprocessing_layers(train_ds,
                                                                    normalize_features=normalize_features,
                                                                    ohe_features=ohe_features,
                                                                    max_tokens=max_tokens,
                                                                    passthrough=passthrough_features)

        # Possible values for the number of hidden units in layer 1.
        # Defining here because we will always have at least 1 layer.
        layer_1_hidden_units = hp.Choice('layer1_hidden_units', values=[32, 64, 128, 256, 512])
        # Possible number of layers to include
        num_layers = hp.Choice('num_layers', values=[1, 2, 3, 4])

        print("================= starting new round =====================")
        print(f"Layer 1 hidden units = {hp.get('layer1_hidden_units')}")
        print(f"Num layers is {hp.get('num_layers')}")

        all_features = layers.concatenate(encoded_features)

        x = layers.Dense(layer_1_hidden_units,
                         activation="relu")(all_features)

        if hp.get('num_layers') >= 2:

            with hp.conditional_scope("num_layers", [2, 3, 4]):

                # Layer 2 hidden units can either be half the layer 1 hidden units or the same.
                layer_2_hidden_units = hp.Choice('layer2_hidden_units',
                                                 values=[int(hp.get('layer1_hidden_units') / 2),
                                                         hp.get('layer1_hidden_units')])

                print("\n==========================================================")
                print("In layer 2")
                print(f"num_layers param = {hp.get('num_layers')}")
                print(f"layer_1_hidden_units = {hp.get('layer1_hidden_units')}")
                print(f"layer_2_hidden_units = {hp.get('layer2_hidden_units')}")
                print("==============================================================\n")
                x = layers.Dense(layer_2_hidden_units,
                                 activation="relu")(x)

        if hp.get('num_layers') >= 3:

            with hp.conditional_scope("num_layers", [3, 4]):

                # Layer 3 hidden units can either be half the layer 2 hidden units or the same.
                layer_3_hidden_units = hp.Choice('layer3_hidden_units',
                                                 values=[int(hp.get('layer2_hidden_units') / 2),
                                                         hp.get('layer2_hidden_units')])

                print("\n==========================================================")
                print("In layer 3")
                print(f"num_layers param = {hp.get('num_layers')}")
                print(f"layer_1_hidden_units = {hp.get('layer1_hidden_units')}")
                print(f"layer_2_hidden_units = {hp.get('layer2_hidden_units')}")
                print(f"layer_3_hidden_units = {hp.get('layer3_hidden_units')}")
                print("==============================================================\n")
                x = layers.Dense(layer_3_hidden_units,
                                 activation="relu")(x)

        if hp.get('num_layers') >= 4:

            with hp.conditional_scope("num_layers", [4]):

                # Layer 4 hidden units can either be half the layer 3 hidden units or the same.
                # Extra stipulation applied here, layer 4 hidden units can never be less than 8.
                layer_4_hidden_units = hp.Choice('layer4_hidden_units',
                                                 values=[max(int(hp.get('layer3_hidden_units') / 2), 8),
                                                         hp.get('layer3_hidden_units')])

                print("\n==========================================================")
                print("In layer 4")
                print(f"num_layers param = {hp.get('num_layers')}")
                print(f"layer_1_hidden_units = {hp.get('layer1_hidden_units')}")
                print(f"layer_2_hidden_units = {hp.get('layer2_hidden_units')}")
                print(f"layer_3_hidden_units = {hp.get('layer3_hidden_units')}")
                print(f"layer_4_hidden_units = {hp.get('layer4_hidden_units')}")
                print("==============================================================\n")
                x = layers.Dense(layer_4_hidden_units,
                                 activation="relu")(x)

        output = layers.Dense(1, activation='sigmoid')(x)

        model = tf.keras.Model(all_inputs, output)

        model.compile(optimizer=tf.keras.optimizers.Adam(),
                      metrics=['accuracy'],
                      loss='binary_crossentropy')

        print(">>>>>>>>>>>>>>>>>>>>>>>>>>>> End of round <<<<<<<<<<<<<<<<<<<<<<<<<<<<<")

        return model

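Does anyone know the correct way to tell Keras Tuner to explore all possible options for each layer's hidden units? The space to explore should meet these criteria: every layer after the first has either the same number of hidden units as the previous layer or half as many, and the first layer takes its number of hidden units from the list [32, 64, 128, 256, 512].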

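Once a hyperparameter has been generated, neither it nor its associated list of choices can be updated while the application is running. Let's consider an example to illustrate this:

Trial 1:

  • first_layer_units: [32, 64, 128, 256, 512]

Suppose the value 256 is randomly chosen as the unit count for first_layer_units. Then, based on this choice:

  • first_hidden_layer_units: [128, 256]

Suppose 128 is chosen for first_hidden_layer_units. Next:

  • second_hidden_layer_units: [64, 128]

Suppose 64 is chosen for second_hidden_layer_units. Finally:

  • third_hidden_layer_units: [32, 64]

Now let's move on to trial 2:

Trial 2:

  • first_layer_units: [32, 64, 128, 256, 512]

Suppose that this time the value 64 is randomly chosen as the unit count for first_layer_units. Ideally, we would expect the choices of the hidden-layer hyperparameters to update accordingly:

  • first_hidden_layer_units: [32, 64]

However, two problems arise with Keras Tuner. First, it does not update the choices of the hidden-layer hyperparameters to match the new value of first_layer_units; it keeps the options from trial 1. Second, the hyperparameters second_hidden_layer_units and third_hidden_layer_units remain active, even though they were generated in trial 1 and do not apply to trial 2.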
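The first problem can be reproduced without Keras Tuner using a toy registry. `ToyHyperParameters` below is a hypothetical stand-in, not the library's API. It assumes only the behaviour described above, namely that the first call registering a name fixes that name's list of choices for every later trial.

```python
class ToyHyperParameters:
    """Hypothetical stand-in, NOT the Keras Tuner API.

    It only mimics the behaviour described in the text: the first call
    that registers a name fixes its list of choices for all later trials.
    """

    def __init__(self):
        self.space = {}   # name -> choices fixed at first registration
        self.values = {}  # name -> value picked for the current trial

    def choice(self, name, values, pick=0):
        # Later calls with the same name ignore the new `values` argument.
        choices = self.space.setdefault(name, list(values))
        self.values[name] = choices[pick % len(choices)]
        return self.values[name]


def build(hp, first_pick):
    """Register two dependent hyperparameters, as the question's code does."""
    first = hp.choice("layer1_hidden_units", [32, 64, 128, 256, 512], pick=first_pick)
    hp.choice("layer2_hidden_units", [first // 2, first])
    return first, list(hp.space["layer2_hidden_units"])


hp = ToyHyperParameters()
trial_1 = build(hp, first_pick=3)  # layer 1 = 256
trial_2 = build(hp, first_pick=1)  # layer 1 = 64
print(trial_1)  # (256, [128, 256])
print(trial_2)  # (64, [128, 256]): stale options left over from trial 1
```

In trial 2 the requested choices are [32, 64], but the name `layer2_hidden_units` is already registered, so the old list wins. This is why reusing one hyperparameter name across scenarios cannot express the dependency.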

To solve the first problem, we need to generate a separate set of hyperparameters for each scenario. This can be done by generating the hyperparameter names dynamically from total_layer_count and previous_layer_index:

    current_layer_index = previous_layer_index - 1
    previous_units = hp.get(f'hidden_units_layer_{total_layer_count}_{previous_layer_index}')
    hidden_units = hp.Choice(f'hidden_units_layer_{total_layer_count}_{current_layer_index}',
                             values=[int(previous_units / 2), previous_units])

Because each distinct scenario gets its own hyperparameter, each scenario also gets its own, correct list of choices.

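To solve the second problem and deactivate the hyperparameters created for other scenarios, we can use conditional scopes to establish a parent-child relationship. A child hyperparameter is then active only when its parent hyperparameter is active and holds one of the listed values, which disables every hyperparameter generated for other scenarios. A conditional scope can be written as follows:

    with hp.conditional_scope(parent_hp_name, parent_hp_value):
        hidden_units = hp.Choice(child_hp_name, values=child_hp_value)

With this approach, the child hyperparameter is active only when the parent hyperparameter meets the specified condition.

Putting it together, the final code snippet that addresses both problems is structured as follows: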

    # List possible units
    possible_units = [32, 64, 128, 256, 512]
    possible_layer_units = []
    for index, item in enumerate(possible_units[:-1]):
        possible_layer_units.append([item, possible_units[index + 1]])
    # possible_layer_units = [[32, 64], [64, 128], [128, 256], [256, 512]]
    # Add first layer
    all_features = layers.concatenate(encoded_features)
    first_layer_units = hp.Choice('first_layer_units', values=possible_units)
    x = layers.Dense(first_layer_units, activation="relu")(all_features)
    # Get the number of hidden layers based on first layer unit count
    hidden_layer_count = possible_units.index(first_layer_units)
    if 0 < hidden_layer_count:
        iter_count = 0
        for hidden_layer_index in range(hidden_layer_count - 1, -1, -1):
            if iter_count == 0:
                # Collect HP 'units' details for the second layer
                parent_hp_name = 'first_layer_units'
                parent_hp_value = possible_layer_units[hidden_layer_index]
                child_hp_name = 'units_layer_' + str(hidden_layer_count) + str(hidden_layer_index)
                child_hp_value = parent_hp_value
            else:
                # Collect HP 'units' details for the next layers
                parent_hp_name = 'units_layer_' + str(hidden_layer_count) + str(hidden_layer_index + 1)
                parent_hp_value = possible_layer_units[hidden_layer_index + 1]
                child_hp_name = 'units_layer_' + str(hidden_layer_count) + str(hidden_layer_index)
                child_hp_value = possible_layer_units[hidden_layer_index]
            # Add and Activate child HP under parent HP using conditional scope
            with hp.conditional_scope(parent_hp_name, parent_hp_value):
                hidden_units = hp.Choice(child_hp_name, values=child_hp_value)

            # Add remaining NN layers one by one
            x = layers.Dense(hidden_units, activation="relu")(x)
            iter_count += 1

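By generating hyperparameters dynamically from the previous layer's unit count and using conditional scopes to control when they are active, both problems can be solved.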
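To see which hyperparameters the loop above would register for a given first-layer width, the naming and parent/child bookkeeping can be traced in plain Python. This is a sketch that mirrors the index arithmetic of the snippet without calling Keras Tuner; the function name `registered_hyperparameters` is made up for illustration.

```python
possible_units = [32, 64, 128, 256, 512]
# Adjacent pairs: [[32, 64], [64, 128], [128, 256], [256, 512]]
possible_layer_units = [list(pair) for pair in zip(possible_units, possible_units[1:])]


def registered_hyperparameters(first_layer_units):
    """Return (parent, parent_values, child, child_values) per hidden layer."""
    # As in the snippet, the hidden-layer count is the index of the first-layer width.
    hidden_layer_count = possible_units.index(first_layer_units)
    rows = []
    for step, index in enumerate(range(hidden_layer_count - 1, -1, -1)):
        child = f"units_layer_{hidden_layer_count}{index}"
        if step == 0:
            # The first hidden layer hangs off the first-layer hyperparameter.
            parent = "first_layer_units"
            parent_values = possible_layer_units[index]
            child_values = parent_values
        else:
            # Later hidden layers hang off the hidden layer created just before.
            parent = f"units_layer_{hidden_layer_count}{index + 1}"
            parent_values = possible_layer_units[index + 1]
            child_values = possible_layer_units[index]
        rows.append((parent, parent_values, child, child_values))
    return rows


for row in registered_hyperparameters(128):
    print(row)
# ('first_layer_units', [64, 128], 'units_layer_21', [64, 128])
# ('units_layer_21', [64, 128], 'units_layer_20', [32, 64])
```

Two properties of the snippet show up in the trace. The number of hidden layers is derived from the first-layer width rather than tuned separately. Each child's list of choices also depends on its position, not on the value sampled for its parent, so it is worth checking that the resulting combinations match the "same or half" rule you intend.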
