Does the weight update also include the derivative of the sigmoid function?

Let's look at how the following line is used in the code block below: L1_delta = L1_error * nonlin(L1,True) # line 36

import numpy as np #line 1

# sigmoid function
def nonlin(x,deriv=False):
    if(deriv==True):
        # x is expected to already be a sigmoid output,
        # so the slope of the sigmoid is x*(1-x)
        return x*(1-x)
    return 1/(1+np.exp(-x))

# input dataset
X = np.array([  [0,0,1],
                [0,1,1],
                [1,0,1],
                [1,1,1] ])

# output dataset
y = np.array([[0,0,1,1]]).T

# seed random numbers to make calculation
# deterministic (just a good practice)
np.random.seed(1)

# initialize weights randomly with mean 0
syn0 = 2*np.random.random((3,1)) - 1

for iter in range(1000):
    # forward propagation
    L0 = X
    L1 = nonlin(np.dot(L0,syn0))

    # how much did we miss?
    L1_error = y - L1

    # multiply how much we missed by the
    # slope of the sigmoid at the values in L1
    L1_delta = L1_error * nonlin(L1,True) # line 36

    # update weights
    syn0 += np.dot(L0.T,L1_delta)

print("Output After Training:")
print(L1)

I would like to know: is this line required? Why do we need the factor of the sigmoid's derivative?

I have seen many similar logistic regression examples in which the derivative of the sigmoid is not used. For example: https://github.com/chayankathuria/LogReg01/blob/master/GradientDescent.py

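Yes, this line is indeed required. You need the derivative of the activation function (in this case the sigmoid) because your final output depends on the weights only implicitly, through the activation. That is why you have to apply the chain rule, and that is where the sigmoid derivative comes from.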
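To make the chain-rule argument concrete, here is a minimal sketch that compares the gradient with and without the sigmoid-derivative factor against a finite-difference estimate. It assumes the network in the question is minimizing the squared error E = 1/2 * sum((y - L1)^2), which the question's code does not state explicitly; the helper names (`sigmoid`, `squared_error_loss`, `analytic_gradient`, `numerical_gradient`) are made up for this illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def squared_error_loss(w, X, y):
    # Assumed loss: E(w) = 1/2 * sum((y - sigmoid(X w))^2)
    out = sigmoid(X @ w)
    return 0.5 * np.sum((y - out) ** 2)

def analytic_gradient(w, X, y):
    # Chain rule: dE/dw = -X^T [ (y - out) * out * (1 - out) ]
    # The factor out * (1 - out) is the sigmoid derivative.
    out = sigmoid(X @ w)
    delta = (y - out) * out * (1.0 - out)
    return -X.T @ delta

def numerical_gradient(w, X, y, eps=1e-6):
    # Central finite differences, one weight at a time
    grad = np.zeros_like(w)
    for i in range(w.shape[0]):
        step = np.zeros_like(w)
        step[i, 0] = eps
        loss_plus = squared_error_loss(w + step, X, y)
        loss_minus = squared_error_loss(w - step, X, y)
        grad[i, 0] = (loss_plus - loss_minus) / (2.0 * eps)
    return grad

# Same toy data as in the question
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([[0, 0, 1, 1]], dtype=float).T

rng = np.random.default_rng(1)
w = 2 * rng.random((3, 1)) - 1

grad_num = numerical_gradient(w, X, y)
grad_with = analytic_gradient(w, X, y)

# Same expression with the sigmoid derivative dropped
out = sigmoid(X @ w)
grad_without = -X.T @ (y - out)

max_diff_with = np.max(np.abs(grad_with - grad_num))
max_diff_without = np.max(np.abs(grad_without - grad_num))
print("max difference with derivative factor:   ", max_diff_with)
print("max difference without derivative factor:", max_diff_without)
```

Under that assumption, only the version that keeps the derivative factor agrees with the finite-difference gradient. The update `syn0 += np.dot(L0.T, L1_delta)` in the question is a step along the negative of that gradient with a step size of 1.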

I suggest you take a look at this post on backpropagation: https://datascience.stackexchange.com/questions/28719/a-good-reference-for-the-back-propagation-algorithm

It explains the math behind backpropagation very well.
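As for logistic regression examples that show no sigmoid derivative: one common reason is that they minimize the cross-entropy (log) loss rather than the squared error. I have not checked which loss the linked script uses, so treat that as an assumption. With cross-entropy the chain rule still applies, but the sigmoid derivative p*(1-p) cancels against a 1/(p*(1-p)) term coming from the loss, leaving just X^T (p - y). A small sketch, with hypothetical helper names, that checks this numerically:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy_loss(w, X, y):
    # Assumed loss: -sum( y*log(p) + (1-y)*log(1-p) ), with p = sigmoid(X w)
    p = sigmoid(X @ w)
    return -np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def cross_entropy_gradient(w, X, y):
    # After the chain rule, p*(1-p) cancels and no explicit
    # sigmoid derivative remains: dE/dw = X^T (p - y)
    p = sigmoid(X @ w)
    return X.T @ (p - y)

def numerical_gradient(loss, w, X, y, eps=1e-6):
    # Central finite differences, one weight at a time
    grad = np.zeros_like(w)
    for i in range(w.shape[0]):
        step = np.zeros_like(w)
        step[i, 0] = eps
        grad[i, 0] = (loss(w + step, X, y) - loss(w - step, X, y)) / (2.0 * eps)
    return grad

# Same toy data as in the question
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([[0, 0, 1, 1]], dtype=float).T

rng = np.random.default_rng(1)
w = 2 * rng.random((3, 1)) - 1

ce_grad = cross_entropy_gradient(w, X, y)
ce_grad_num = numerical_gradient(cross_entropy_loss, w, X, y)
ce_max_diff = np.max(np.abs(ce_grad - ce_grad_num))
print("max difference for cross-entropy gradient:", ce_max_diff)
```

So whether the derivative factor appears in the update depends on the loss being minimized, not on whether the chain rule is used.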
