Ray [RLlib] custom action distribution (TorchDeterministic)



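We know that for a Box (continuous) action space, the corresponding action distribution is DiagGaussian (a probability distribution).

However, I want to use TorchDeterministic (an action distribution that returns the input values directly).

This is the code, taken from https://github.com/ray-project/ray/blob/a91ddbdeb98e81741beeeb5c17902cab1e771105/rllib/models/torch/torch_action_dist.py#L372: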

class TorchDeterministic(TorchDistributionWrapper):
    """Action distribution that returns the input values directly.

    This is similar to DiagGaussian with standard deviation zero (thus only
    requiring the "mean" values as NN output).
    """

    @override(ActionDistribution)
    def deterministic_sample(self) -> TensorType:
        return self.inputs

    @override(TorchDistributionWrapper)
    def sampled_action_logp(self) -> TensorType:
        return torch.zeros((self.inputs.size()[0], ), dtype=torch.float32)

    @override(TorchDistributionWrapper)
    def sample(self) -> TensorType:
        return self.deterministic_sample()

    @staticmethod
    @override(ActionDistribution)
    def required_model_output_shape(
            action_space: gym.Space,
            model_config: ModelConfigDict) -> Union[int, np.ndarray]:
        return np.prod(action_space.shape)

With the appropriate imports, I copied and pasted this class into a file named custom_action_dist.py.

I import it with

from custom_action_dist import TorchDeterministic

I registered my custom action distribution with ModelCatalog

ModelCatalog.register_custom_action_dist("my_custom_action_dist", TorchDeterministic)

and in the config I specify:

"custom_action_dist": "my_custom_action_dist"

However, I get the following error:

  File "/home/user/DRL/lib/python3.8/site-packages/ray/rllib/models/torch/torch_action_dist.py", line 38, in logp
    return self.dist.log_prob(actions)
AttributeError: 'TorchDeterministic' object has no attribute 'dist'

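It seems that I have to specify a probability distribution.

Can someone help me and tell me which one that is?

Thank you, I look forward to your reply!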

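Yes, in the __init__ constructor of your custom class: assign the distribution to self.dist there, since the inherited logp() reads self.dist.

Here is an example of a Torch class: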

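import torch
from ray.rllib.models.torch.torch_action_dist import TorchDistributionWrapper

class MultinomialTorchDistribution(TorchDistributionWrapper):
    def __init__(self, inputs, model):
        super().__init__(inputs, model)
        self.num_samples = 100
        # The inherited logp() calls self.dist.log_prob(), so set self.dist here.
        self.dist = torch.distributions.Multinomial(
            total_count=self.num_samples, logits=inputs)
        self.last_sample = torch.zeros_like(self.dist.probs)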
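
To see why assigning self.dist matters, here is a small dependency-free sketch. The classes WrapperStandIn, NoDist, PointMass and WithDist, and the helper try_logp, are hypothetical stand-ins invented for illustration and are not RLlib's real classes; only the body of logp mirrors the line shown in the traceback. PointMass is a toy object with a log_prob method, assumed here purely to make the example runnable without torch or ray.

```python
class WrapperStandIn:
    """Hypothetical stand-in for a wrapper base class (not RLlib's code)."""

    def __init__(self, inputs, model=None):
        self.inputs = inputs
        self.model = model

    def logp(self, actions):
        # Mirrors the traceback line: the inherited method reads self.dist.
        return self.dist.log_prob(actions)


class NoDist(WrapperStandIn):
    """Like the copied class: it never assigns self.dist."""

    def sample(self):
        return self.inputs


class PointMass:
    """Hypothetical toy distribution exposing log_prob (illustration only)."""

    def __init__(self, values):
        self.values = values

    def log_prob(self, actions):
        # 0.0 where the action equals the stored value, -inf elsewhere.
        return [0.0 if a == v else float("-inf")
                for a, v in zip(actions, self.values)]


class WithDist(WrapperStandIn):
    """Assigns self.dist in __init__, so the inherited logp() can use it."""

    def __init__(self, inputs, model=None):
        super().__init__(inputs, model)
        self.dist = PointMass(inputs)

    def sample(self):
        return self.inputs


def try_logp(dist_obj, actions):
    """Call logp() and return either its result or the AttributeError text."""
    try:
        return dist_obj.logp(actions)
    except AttributeError as exc:
        return f"AttributeError: {exc}"


if __name__ == "__main__":
    inputs = [0.5, -1.0]
    print(try_logp(NoDist(inputs), inputs))
    print(try_logp(WithDist(inputs), inputs))
```

NoDist fails inside the inherited logp() in the same way as the error in the question, while WithDist succeeds because self.dist exists by the time logp() runs. In real RLlib code the object assigned to self.dist would be a torch.distributions instance, as in the Multinomial example above. Another option might be to override logp() in the subclass so that it never touches self.dist, but whether that suits a given training algorithm is not something this sketch establishes.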
