What is the correct way to pass custom model parameters to an RLlib model?



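I have a basic custom model that is essentially a copy-paste of the default RLlib fully connected model (https://github.com/ray-project/ray/blob/master/rllib/models/tf/fcnet.py), and I am passing custom model parameters through a config file with a "custom_model_config": {} dictionary. The config file looks like this: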

# Custom RLLib model
custom_model: test_model
# Custom options
custom_model_config:
  ## Default fully connected network settings
  # Nonlinearity for fully connected net (tanh, relu)
  "fcnet_activation": "tanh"
  # Number of hidden layers for fully connected net
  "fcnet_hiddens": [256, 256]
  # For DiagGaussian action distributions, make the second half of the model
  # outputs floating bias variables instead of state-dependent. This only
  # has an effect if using the default fully connected net.
  "free_log_std": False
  # Whether to skip the final linear layer used to resize the hidden layer
  # outputs to size `num_outputs`. If True, then the last hidden layer
  # should already match num_outputs.
  "no_final_linear": False
  # Whether layers should be shared for the value function.
  "vf_share_layers": True
  ## Additional settings
  # L2 regularization value for fully connected layers
  "l2_reg_value": 0.1

When I start training with this setup, RLlib gives me the following warning:

Custom ModelV2 should accept all custom options as **kwargs, instead of expecting them in config['custom_model_config']!

I understand what **kwargs does, but I am not sure how to implement it in a custom RLlib model to fix this warning. Any ideas?

TL;DR: add **customized_model_kwargs to your network's __init__, then read your custom configuration from it.


Here is how to avoid this warning.

When you use a custom network, you are probably creating it with something like this:

policy.target_q_model = ModelCatalog.get_model_v2(
    obs_space=obs_space,
    action_space=action_space,
    num_outputs=1,
    model_config=config["model"],
    framework="torch",
    name=Q_TARGET_SCOPE)

Ray instantiates the model like this (see ModelCatalog, https://docs.ray.io/en/master/_modules/ray/rllib/models/catalog.html):

instance = model_cls(obs_space, action_space, num_outputs,
                     model_config, name,
                     **customized_model_kwargs)
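
To see why a constructor without **kwargs triggers a warning like this, here is a toy stand-in for that instantiation step. It is not RLlib's code: `build_model`, `OldStyleModel` and `KwargsModel` are hypothetical names, and the retry on `TypeError` is an assumption about how such a factory could behave when a constructor rejects the extra keyword arguments.

```python
import warnings


class OldStyleModel:
    """Hypothetical model whose constructor has no **kwargs."""

    def __init__(self, obs_space, action_space, num_outputs, model_config, name):
        # Custom options are only reachable through model_config.
        self.l2_reg_value = model_config["custom_model_config"]["l2_reg_value"]


class KwargsModel:
    """Hypothetical model that accepts custom options as **kwargs."""

    def __init__(self, obs_space, action_space, num_outputs, model_config, name,
                 **customized_model_kwargs):
        self.l2_reg_value = customized_model_kwargs["l2_reg_value"]


def build_model(model_cls, model_config):
    """Toy factory: unpack custom_model_config into keyword arguments."""
    custom = model_config.get("custom_model_config", {})
    args = (None, None, 1, model_config, "toy")
    try:
        return model_cls(*args, **custom)
    except TypeError:
        # The constructor rejected the extra keywords: warn, then retry
        # without them (a real factory would inspect the error more carefully).
        warnings.warn("model should accept custom options as **kwargs")
        return model_cls(*args)


toy_config = {"custom_model_config": {"l2_reg_value": 0.1}}
```

In this sketch, `build_model(KwargsModel, toy_config)` succeeds on the first attempt, while `build_model(OldStyleModel, toy_config)` only succeeds on the retry and emits the warning.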

So you should declare your network like this:

def __init__(self, obs_space: gym.spaces.Space,
             action_space: gym.spaces.Space, num_outputs: int,
             model_config: ModelConfigDict, name: str, **customized_model_kwargs):
    TorchModelV2.__init__(self, obs_space, action_space, num_outputs,
                          model_config, name)
    nn.Module.__init__(self)

Note the customized_model_kwargs parameter.

You can then access your custom configuration with customized_model_kwargs["your_key"].

Note: the TF case is similar.
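
Applied to the configuration from the question, the constructor could name the options it expects and collect the rest in **customized_model_kwargs. The sketch below is framework-free so it runs without Ray; `ToyFCModel` is a hypothetical stand-in for a `TFModelV2` or `TorchModelV2` subclass, and the default values in the signature are illustrative.

```python
class ToyFCModel:
    """Hypothetical stand-in for a custom ModelV2 subclass (no Ray needed)."""

    def __init__(self, obs_space, action_space, num_outputs, model_config, name,
                 fcnet_activation="tanh", fcnet_hiddens=(256, 256),
                 l2_reg_value=0.0, **customized_model_kwargs):
        # A real model would call the ModelV2 base constructor here.
        self.fcnet_activation = fcnet_activation
        self.fcnet_hiddens = list(fcnet_hiddens)
        self.l2_reg_value = l2_reg_value
        # Options without a named parameter are still available by key.
        self.vf_share_layers = customized_model_kwargs.get("vf_share_layers", True)


# A subset of the keys from the question's custom_model_config.
custom_model_config = {
    "fcnet_activation": "tanh",
    "fcnet_hiddens": [256, 256],
    "vf_share_layers": True,
    "l2_reg_value": 0.1,
}
model = ToyFCModel(None, None, 2, {}, "toy", **custom_model_config)
```

Naming the options in the signature documents what the model expects and gives each one a default, while the trailing **customized_model_kwargs keeps the constructor tolerant of extra keys.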

You can pass custom model parameters by setting "custom_model_config", which is part of the model config. By default it is empty.

From the documentation:

# Name of a custom model to use
"custom_model": None,
# Extra options to pass to the custom classes. These will be available to
# the Model's constructor in the model_config field. Also, they will be
# attempted to be passed as **kwargs to ModelV2 models. For an example,
# see rllib/models/[tf|torch]/attention_net.py.
"custom_model_config": {},

The custom model's constructor has a model_config parameter. You can access the model parameters through model_config["custom_model_config"].


Example:

# setting custom params
config = ppo.DEFAULT_CONFIG.copy()
config["model"] = {
    "custom_model": MyModel,
    "custom_model_config": {
        "my_param": 42
    }
}
...
trainer = ppo.PPOTrainer(config=config, env=MyEnv)

Inside MyModel:

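class MyModel(TFModelV2):
    def __init__(self, obs_space, action_space, num_outputs, model_config, name, **kwargs):
        # Initialize the ModelV2 base class before setting attributes.
        super().__init__(obs_space, action_space, num_outputs, model_config, name)
        self.my_param = model_config["custom_model_config"]["my_param"]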
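
If a model should work whether the options arrive as keyword arguments or only inside model_config, a small helper can check both places. This is a sketch, not part of RLlib; `get_custom_option` is a hypothetical name, and the dictionaries below only imitate the arguments a model constructor would receive.

```python
def get_custom_option(key, model_config, kwargs, default=None):
    """Look up a custom option: **kwargs first, then custom_model_config."""
    if key in kwargs:
        return kwargs[key]
    # Fall back to the nested dictionary, tolerating a missing or None entry.
    custom = model_config.get("custom_model_config") or {}
    return custom.get(key, default)


# Example inputs shaped like the arguments a model constructor receives.
model_config = {"custom_model_config": {"my_param": 42}}
from_config = get_custom_option("my_param", model_config, kwargs={})
from_kwargs = get_custom_option("my_param", model_config, kwargs={"my_param": 7})
missing = get_custom_option("other", model_config, kwargs={}, default=-1)
```

Inside a constructor this would be called as get_custom_option("my_param", model_config, kwargs), so the model does not care which route delivered the value.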
