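How to bound the output of a layer in PyTorch

I want my model to output a single value. How can I constrain that value to a range (a, b)? For example, my code is: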

import torch.nn as nn

class ActorCritic(nn.Module):
    def __init__(self, num_state_features):
        super(ActorCritic, self).__init__()
        # value
        self.critic_net = nn.Sequential(
            nn.Linear(num_state_features, 64),
            nn.ReLU(),
            nn.Linear(64, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 1)
        )
        # policy
        self.actor_net = nn.Sequential(
            nn.Linear(num_state_features, 64),
            nn.ReLU(),
            nn.Linear(64, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, state):
        value = self.critic_net(state)
        policy_mean = self.actor_net(state)
        return value, policy_mean

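I want the policy output to lie in the range (500, 3000). How can I do that?

(I tried torch.clamp(), but it does not work well: once the output is near a limit, the policy gets stuck there. For example, if the raw output is -1000000, the clamped output stays at 500 forever, or takes a very long time to change. The same is true of functions such as nn.Sigmoid().)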

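Use an activation function on the last layer that bounds the output to a fixed range, then rescale it to the range you want. For example, the sigmoid function bounds the output to the range [0, 1].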

output = torch.sigmoid(previous_layer_output) # in range [0,1]
output_normalized = output*(b-a) + a          # in range [a,b]
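
The rescaling above can also be packaged as a small module and appended to the actor network. The following is only a minimal sketch: `ScaledSigmoid` is a hypothetical helper name (not part of PyTorch), and the layer sizes and batch shape are illustrative assumptions.

```python
import torch
import torch.nn as nn


class ScaledSigmoid(nn.Module):
    """Hypothetical helper: squash with sigmoid, then rescale to [low, high]."""

    def __init__(self, low: float, high: float):
        super().__init__()
        if not low < high:
            raise ValueError("low must be smaller than high")
        self.low = low
        self.high = high

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # sigmoid(x) lies in [0, 1]; the affine map sends it to [low, high]
        return self.low + (self.high - self.low) * torch.sigmoid(x)


# Illustrative actor head: 4 state features (assumed), bounded to the 500..3000 range
actor_head = nn.Sequential(
    nn.Linear(4, 64),
    nn.ReLU(),
    nn.Linear(64, 1),
    ScaledSigmoid(500.0, 3000.0),
)

torch.manual_seed(0)
states = torch.randn(8, 4)        # batch of 8 made-up states
policy_mean = actor_head(states)  # shape (8, 1), every entry within the bounds
```

Keeping the bound inside the network means the caller never sees an out-of-range value, and the scaling constants live in one place.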

You can scale the sigmoid to the appropriate bounds with a fixed linear transformation:

import torch
# ...
def forward(self, state):
    value = self.critic_net(state)
    policy_mean = self.actor_net(state)
    # squash the raw policy output to [0, 1], then map it linearly onto the 500..3000 range
    policy_mean = 500 + 2500 * torch.sigmoid(policy_mean)
    return value, policy_mean
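
To illustrate why the question's torch.clamp() attempt gets stuck, here is a small sketch comparing gradients. The raw value -5.0 is an arbitrary assumption chosen for the demonstration. Outside the clamp range the gradient is exactly zero, while the scaled sigmoid still passes a nonzero gradient at this input.

```python
import torch

LOW, HIGH = 500.0, 3000.0  # bounds from the question

# Raw network output far below the lower bound (arbitrary demo value)
raw_clamped = torch.tensor([-5.0], requires_grad=True)
torch.clamp(raw_clamped, min=LOW, max=HIGH).sum().backward()

raw_squashed = torch.tensor([-5.0], requires_grad=True)
(LOW + (HIGH - LOW) * torch.sigmoid(raw_squashed)).sum().backward()

clamp_grad = raw_clamped.grad.item()     # zero: clamp blocks the gradient out of range
sigmoid_grad = raw_squashed.grad.item()  # positive: the sigmoid still lets learning proceed
print(clamp_grad, sigmoid_grad)
```

Note that the sigmoid gradient also shrinks toward zero for extremely large raw values such as the -1000000 mentioned in the question, so keeping the pre-activation at a moderate scale still matters.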
