How to bound the output of a layer in PyTorch



I want my model to output a single value. How can I constrain that value to the range (a, b)? For example, my code is:

class ActorCritic(nn.Module):
    def __init__(self, num_state_features):
        super(ActorCritic, self).__init__()
        # value
        self.critic_net = nn.Sequential(
            nn.Linear(num_state_features, 64),
            nn.ReLU(),
            nn.Linear(64, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 1)
        )
        # policy
        self.actor_net = nn.Sequential(
            nn.Linear(num_state_features, 64),
            nn.ReLU(),
            nn.Linear(64, 128),
            nn.ReLU(),
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, state):
        value = self.critic_net(state)
        policy_mean = self.actor_net(state)
        return value, policy_mean

I want the policy output to lie in the range (500, 3000). How can I do that?

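(I tried torch.clamp(), but it does not work well: once the output gets close to a limit, the policy stops changing. For example, if the raw output is -1000000, the result stays at 500 forever, or takes a very long time to change. The same is true of functions such as nn.Sigmoid().)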
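The sticking behaviour described here is consistent with how gradients flow through these functions. The following is a minimal sketch, not part of the original post, assuming bounds of 500 and 3000; the helper names `grad_wrt_input`, `clamp_bound`, and `sigmoid_bound` are made up for illustration. It compares the gradient that reaches the raw output when the value is far outside the range and when it is near the middle.

```python
import torch

LOW, HIGH = 500.0, 3000.0  # assumed bounds, taken from the question


def grad_wrt_input(fn, x0):
    """Return d fn(x) / dx evaluated at the scalar x0 (hypothetical helper)."""
    x = torch.tensor([x0], requires_grad=True)
    fn(x).sum().backward()
    return x.grad.item()


def clamp_bound(x):
    # Hard clipping: constant outside [LOW, HIGH], so no gradient flows there.
    return torch.clamp(x, min=LOW, max=HIGH)


def sigmoid_bound(x):
    # Smooth squashing: the gradient shrinks towards zero as |x| grows.
    return LOW + (HIGH - LOW) * torch.sigmoid(x)


far_below = -1000000.0
clamp_grad_far = grad_wrt_input(clamp_bound, far_below)      # zero: output is flat here
sigmoid_grad_far = grad_wrt_input(sigmoid_bound, far_below)  # numerically zero: saturated
sigmoid_grad_mid = grad_wrt_input(sigmoid_bound, 0.0)        # large: unsaturated region

print(clamp_grad_far, sigmoid_grad_far, sigmoid_grad_mid)
```

With a zero or vanishing gradient, an optimizer step barely moves the raw output, which would explain why the bounded value appears frozen at 500. Near the middle of the range the sigmoid version still passes a useful gradient.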

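Apply an activation function to the last layer that bounds the output to a fixed range, then rescale the result to the range you want. For example, the sigmoid function bounds the output to the range [0, 1].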

output = torch.sigmoid(previous_layer_output) # in range [0,1]
output_normalized = output*(b-a) + a          # in range [a,b]
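The two lines above can be wrapped in a small module so that the rescaling lives inside an nn.Sequential. This is only a sketch under assumptions: `ScaledSigmoid` is a hypothetical name, not a PyTorch class, and the bounds 500 and 3000 and the 64-feature input are illustrative.

```python
import torch
import torch.nn as nn


class ScaledSigmoid(nn.Module):
    """Hypothetical helper: squashes any real input into the range [a, b]."""

    def __init__(self, a: float, b: float):
        super().__init__()
        if not a < b:
            raise ValueError("expected a < b")
        self.a = a
        self.b = b

    def forward(self, x):
        # sigmoid(x) lies in [0, 1]; scale by (b - a) and shift by a.
        return self.a + (self.b - self.a) * torch.sigmoid(x)


# Illustrative output head: a final linear layer followed by the rescaling.
torch.manual_seed(0)
actor_head = nn.Sequential(nn.Linear(64, 1), ScaledSigmoid(500.0, 3000.0))
out = actor_head(torch.randn(8, 64))
print(out.min().item(), out.max().item())
```

Because the module has no parameters, it changes only the range of the output and not what the preceding layers learn.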

You can use a fixed linear transformation to scale the sigmoid to the appropriate bounds:

import torch
# ...
def forward(self, state):
    value = self.critic_net(state)
    policy_mean = self.actor_net(state)
    # Squash to (0, 1) with a sigmoid, then map linearly onto (500, 3000).
    policy_mean = 500 + (3000 - 500) * torch.sigmoid(policy_mean)
    return value, policy_mean
