Octave code for gradient descent using vectorization is not updating the cost function correctly



I have implemented the following gradient descent code using vectorization, but it seems the cost function is not decrementing correctly. Instead, the cost function increases with every iteration.

Assume theta is an (n+1)-vector, y is an m-vector, and X is the m*(n+1) design matrix:

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
  m = length(y);      % number of training examples
  n = length(theta);  % number of features
  J_history = zeros(num_iters, 1);
  error = ((theta' * X')' - y) * (alpha/m);
  descent = zeros(size(theta), 1);
  for iter = 1:num_iters
    for i = 1:n
      descent(i) = descent(i) + sum(error .* X(:,i));
      i = i + 1;
    end
    theta = theta - descent;
    J_history(iter) = computeCost(X, y, theta);
    disp("the value of cost function is : "), disp(J_history(iter));
    iter = iter + 1;
  end

The cost function is computed as:

function J = computeCost(X, y, theta)
m = length(y);
J = 0;
for i = 1:m,
   H = theta' * X(i,:)';
   E = H - y(i);
   SQE = E^2;
   J = (J + SQE);
   i = i+1;
end;
J = J / (2*m);

You can vectorize it further. Note that in your version the `error` term is computed once, outside the loop, and `descent` keeps accumulating across iterations instead of being recomputed from the current `theta`; that is why the cost grows instead of shrinking:

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
    m = length(y); 
    J_history = zeros(num_iters, 1);
    for iter = 1:num_iters
       delta = (theta' * X'-y')*X;
       theta = theta - alpha/m*delta';
       J_history(iter) = computeCost(X, y, theta);
    end
end
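As a sanity check, the same idea can be sketched in plain Python (toy data and names are hypothetical, not from the question): the residuals are recomputed from the current `theta` on every iteration, and the cost then decreases monotonically instead of growing.

```python
# Minimal sketch of gradient descent where the error is recomputed
# each iteration from the current theta (toy data, hypothetical names).

def compute_cost(X, y, theta):
    m = len(y)
    total = 0.0
    for i in range(m):
        h = sum(X[i][j] * theta[j] for j in range(len(theta)))
        total += (h - y[i]) ** 2
    return total / (2 * m)

def gradient_descent(X, y, theta, alpha, num_iters):
    m, n = len(y), len(theta)
    J_history = []
    for _ in range(num_iters):
        # recompute the residuals from the CURRENT theta every pass;
        # computing them once outside the loop (as in the question)
        # makes the cost blow up
        errors = [sum(X[i][j] * theta[j] for j in range(n)) - y[i]
                  for i in range(m)]
        grad = [sum(errors[i] * X[i][j] for i in range(m)) / m
                for j in range(n)]
        theta = [theta[j] - alpha * grad[j] for j in range(n)]
        J_history.append(compute_cost(X, y, theta))
    return theta, J_history

X = [[1, 1], [1, 2], [1, 3], [1, 4]]   # bias column plus one feature
y = [2, 4, 6, 8]                       # y = 2x, so theta tends toward [0, 2]
theta, J = gradient_descent(X, y, [0.0, 0.0], 0.1, 50)
```

With a small enough `alpha` the cost shrinks on every iteration; if it grows instead, either the error term is stale or the learning rate is too large.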

You can vectorize it even further, as follows:

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
  m = length(y);
  J_history = zeros(num_iters, 1);
  for iter = 1:num_iters
     theta=theta-(alpha/m)*((X*theta-y)'*X)';
     J_history(iter) = computeCost(X, y, theta);
  end;
end;

The computeCost function can be written as:

function J = computeCost(X, y, theta)
  m = length(y);
  J = 1/(2*m)*sum((X*theta-y).^2);
end;

Note the element-wise square `.^2`: plain `^2` is a matrix power in Octave and errors out on the non-square residual vector `X*theta-y`.
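To confirm that summing element-wise squared residuals (the `.^2` form) is the same computation as the original loop, here is a small pure-Python comparison (toy data and names are hypothetical):

```python
# Sketch checking that the vectorized cost (sum of element-wise
# squared residuals) matches the loop-based cost from the question
# (toy data, hypothetical names).

def cost_loop(X, y, theta):
    m = len(y)
    J = 0.0
    for i in range(m):
        h = sum(X[i][j] * theta[j] for j in range(len(theta)))
        J += (h - y[i]) ** 2
    return J / (2 * m)

def cost_vectorized(X, y, theta):
    m = len(y)
    residuals = [sum(X[i][j] * theta[j] for j in range(len(theta))) - y[i]
                 for i in range(m)]
    # this is the `.^2` in Octave: square each residual, then sum
    return sum(r * r for r in residuals) / (2 * m)

X = [[1, 1], [1, 2], [1, 3]]
y = [1, 2, 3]
theta = [0.5, 0.5]
```

Both forms agree to machine precision; the vectorized one simply avoids the explicit loop.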
