What is the correct way to update the value of a leaf tensor (for example, in the update step of gradient descent)?



Toy example

Consider this very simple gradient descent implementation, where I'm trying to fit a linear regression (mx + b) to some toy data.

import torch
# Make some data
torch.manual_seed(0)
X = torch.rand(35) * 5
Y = 3 * X + torch.rand(35)
# Initialize m and b
m = torch.rand(size=(1,), requires_grad=True)
b = torch.rand(size=(1,), requires_grad=True)
# Pass 1
yhat = X * m + b    #  Calculate yhat
loss = torch.sqrt(torch.mean((yhat - Y)**2)) # Calculate the loss
loss.backward()     # Reverse mode differentiation
m = m - 0.1*m.grad  # update m
b = b - 0.1*b.grad  # update b
m.grad = None       # zero out m gradient
b.grad = None       # zero out b gradient
# Pass 2
yhat = X * m + b    #  Calculate yhat
loss = torch.sqrt(torch.mean((yhat - Y)**2)) # Calculate the loss
loss.backward()     # Reverse mode differentiation
m = m - 0.1*m.grad  # ERROR

The first pass runs fine, but the second pass errors out on the last line, m = m - 0.1*m.grad.

The error

/usr/local/lib/python3.7/dist-packages/torch/_tensor.py:1013: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (Triggered internally at  aten/src/ATen/core/TensorBody.h:417.)
return self._grad

My understanding of why this happens is that during Pass 1, the line

m = m - 0.1*m.grad

copies m into a brand-new tensor (i.e., a completely separate block of memory). So m goes from being a leaf tensor to a non-leaf tensor.

# Pass 1
...
print(f"{m.is_leaf}")  # True
m = m - 0.1*m.grad  
print(f"{m.is_leaf}")  # False

So, how should the update be performed?

I've seen it mentioned that you can use something like m.data = m - 0.1*m.grad, but I haven't seen much discussion of that technique.

Your observations are correct. To perform the update, you should:

  1. Apply the modification with an in-place operator.

  2. Wrap the call in the torch.no_grad context manager.

For example:

with torch.no_grad():
    m -= 0.1*m.grad  # update m
    b -= 0.1*b.grad  # update b
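Putting it together, here is a minimal sketch of the full toy example from your question with the update step rewritten this way (same data, names, and learning rate as above). Because the in-place updates never rebind m or b, they stay leaf tensors and both passes run cleanly:

import torch
# Same toy data as in the question
torch.manual_seed(0)
X = torch.rand(35) * 5
Y = 3 * X + torch.rand(35)
# Initialize m and b as leaf tensors
m = torch.rand(size=(1,), requires_grad=True)
b = torch.rand(size=(1,), requires_grad=True)

for _ in range(2):  # two passes, as in the question
    yhat = X * m + b                              # calculate yhat
    loss = torch.sqrt(torch.mean((yhat - Y)**2))  # calculate the loss
    loss.backward()                               # reverse-mode differentiation
    with torch.no_grad():
        m -= 0.1 * m.grad  # in-place update: m stays a leaf tensor
        b -= 0.1 * b.grad
    m.grad = None          # zero out gradients for the next pass
    b.grad = None

This is essentially the pattern torch.optim implements for you: optimizer.step() mutates the parameters in place with gradient tracking disabled, and optimizer.zero_grad() resets their .grad fields.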
