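I am trying to predict a single output value y from two input features. I have read that regression models usually do not use any activation function, and that when activations are used they are applied mainly to the hidden layers. However, whether I use no activation at all or use one only on the hidden layer, my predicted values are far from the actual values.
Here is my MATLAB function that computes the cost and the gradients via backpropagation.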
function [J grad] = nnCostFunction1(nn_params, ...
input_layer_size, ...
hidden_layer_size, ...
num_labels, ...
X, y, lambda)
% Reshape nn_params back into the parameters Theta1 and Theta2, the weight matrices
% for 2 layer neural network
Theta1 = reshape(nn_params(1:hidden_layer_size * (input_layer_size + 1)), ...
hidden_layer_size, (input_layer_size + 1));
Theta2 = reshape(nn_params((1 + (hidden_layer_size * (input_layer_size + 1))):end), ...
num_labels, (hidden_layer_size + 1));
%Initialising the variables
m = size(X, 1);
X = [ones(m,1) X];
J = 0;
Theta1_grad = zeros(size(Theta1));
Theta2_grad = zeros(size(Theta2));
%feed forward
z_1 = X*Theta1';
A_1 = tanh(z_1);
A_1 = [ones(m, 1) A_1];   % prepend the bias column to the tanh activations, not to z_1
z_2 = (A_1*Theta2');
J = J + sum(((z_2 - y).^2),1);
J = J/(2*m);
%Regularizing the cost function
J = J + (lambda/(2*m))*(sum((sum((Theta1(:,2:size(Theta1,2)).^ 2),1)),2) + sum((sum((Theta2(:,2:size(Theta2,2)).^ 2),1)),2));
%Backpropagation
delta_3 = z_2-y;
% Propagate the error back through the tanh hidden layer (bias column of Theta2 excluded)
delta_2 = (delta_3 * Theta2(:,2:end)) .* tanhGradient(z_1);
Delta_1 = delta_2' * X;
Delta_2 = delta_3' * A_1;
Theta1_grad = Delta_1/m;
Theta2_grad = Delta_2/m;
Theta1_grad(:,2:end) = Theta1_grad(:,2:end) + (lambda/m)*Theta1(:,2:end);
Theta2_grad(:,2:end) = Theta2_grad(:,2:end) + (lambda/m)*Theta2(:,2:end);
grad = [Theta1_grad(:);Theta2_grad(:)];
end
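One way to confirm that the backpropagation gradients agree with the forward pass is a numerical gradient check. The snippet below is only a sketch: `toyCost` is a hypothetical stand-in cost with a known gradient, used so the example runs on its own; it is not part of the code above.

```matlab
% Gradient check sketch using central differences.
% toyCost is a made-up cost whose exact gradient is theta itself.
toyCost  = @(theta) 0.5 * sum(theta.^2);
theta    = [0.5; -1.2; 2.0];
analytic = theta;                      % exact gradient of toyCost at theta

step    = 1e-4;
numeric = zeros(size(theta));
for k = 1:numel(theta)
    e = zeros(size(theta));
    e(k) = step;
    % central difference along coordinate k
    numeric(k) = (toyCost(theta + e) - toyCost(theta - e)) / (2 * step);
end

% Relative difference between the numerical and analytic gradients
relDiff = norm(numeric - analytic) / norm(numeric + analytic);
```

To apply the same check to the network, `toyCost` could be replaced by a handle that calls `nnCostFunction1` with fixed data, and `analytic` by the second output of that call at the same parameter vector. A large `relDiff` would suggest that the forward pass and the backward pass do not describe the same network.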
Here is the code for my tanhGradient function:
function g = tanhGradient(z)
% Derivative of tanh, evaluated element-wise at z
g = 1 - tanh(z).^2;
end
This is how I run the learning algorithm.
clear
data = load("data.txt");
X = data(:,1:2);
y = data(:,3);
input_layer_size = 2;
hidden_layer_size = 16;
num_labels = 1;
%initialising the weights
initial_Theta1 = randInitializeWeights(input_layer_size, hidden_layer_size);
initial_Theta2 = randInitializeWeights(hidden_layer_size, num_labels);
initial_nn_params = [initial_Theta1(:) ; initial_Theta2(:)];
%Learning the weights using fmincg
options = optimset('MaxIter', 100);
lambda = 1;
% Create "short hand" for the cost function to be minimized
costFunction = @(p) nnCostFunction1(p, input_layer_size, hidden_layer_size, num_labels, X, y, lambda);
[nn_params, ~] = fmincg(costFunction, initial_nn_params, options);
My code for making predictions:
function p = predict(Theta1, Theta2, X)
m = size(X, 1);
num_labels = size(Theta2, 1);
p = zeros(size(X, 1), 1);
h1 = tanh([ones(m, 1) X] * Theta1');
p = ([ones(m, 1) h1] * Theta2');
end
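The training script returns a flat vector `nn_params`, while `predict` expects `Theta1` and `Theta2`, so the vector has to be unrolled the same way `nnCostFunction1` does it. The following is a minimal sketch of that step; the random `nn_params` and the three rows of `X` are made-up stand-ins so the snippet runs on its own.

```matlab
% Sketch: unroll a flat parameter vector into Theta1 and Theta2.
% Layer sizes mirror the training script; nn_params is random stand-in data
% (in practice it would be the vector returned by fmincg).
input_layer_size  = 2;
hidden_layer_size = 16;
num_labels        = 1;
n1 = hidden_layer_size * (input_layer_size + 1);   % number of entries in Theta1
n2 = num_labels * (hidden_layer_size + 1);         % number of entries in Theta2
nn_params = rand(n1 + n2, 1);

Theta1 = reshape(nn_params(1:n1), hidden_layer_size, input_layer_size + 1);
Theta2 = reshape(nn_params(n1 + 1:end), num_labels, hidden_layer_size + 1);

% Same forward pass as predict(): tanh hidden layer, linear output
X  = [0.1 0.2; 0.3 0.4; 0.5 0.6];                  % three made-up examples
m  = size(X, 1);
h1 = tanh([ones(m, 1) X] * Theta1');
p  = [ones(m, 1) h1] * Theta2';
```

With the real parameters, `p = predict(Theta1, Theta2, X);` followed by `[p y]` would show the predicted and actual values side by side.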
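In the output I get, the first column is the predicted value and the second column is the actual value.
My dataset looks like this: the first two columns contain the input features and the last column is the output. The dataset contains 850 examples.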
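Linear regression models do not use activation functions. They have the form y = sum_i(w_i * x_i) + b, where x and w have the same size as your feature vector (so there is one weight per input feature). This is not a deep learning model.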
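As a small illustration of that form, here is a sketch of a plain linear regression fit with two features. The data values, `w_true`, and `b_true` are invented for the example, and the least-squares solve via the backslash operator is just one convenient way to obtain the weights.

```matlab
% Linear regression sketch on made-up data: y = X*w + b, no activation anywhere.
X = [1 2; 2 1; 3 4; 4 3; 5 5];     % 5 examples, 2 features (illustrative values)
w_true = [2; -1];
b_true = 0.5;
y = X * w_true + b_true;           % noiseless targets generated from the linear model

Xb    = [ones(size(X, 1), 1) X];   % bias column so that b is learned as theta(1)
theta = Xb \ y;                    % least-squares solution
b = theta(1);
w = theta(2:end);                  % one weight per input feature
y_hat = X * w + b;                 % predictions from the fitted model
```

Because the targets here are exactly linear in the features, the fit recovers `w_true` and `b_true` up to rounding error; real data would leave a residual.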
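If you are talking about hidden layers, though, you are probably talking about a deep learning model. If you are building a dense (fully connected) neural network for a regression problem, you almost certainly want to use activation functions on the hidden layers: without them, the stacked layers reduce to a single linear map, however many layers there are.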
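To make the role of the hidden-layer activation concrete, the sketch below compares two stacked layers without an activation against a single equivalent linear layer, and then against the same network with tanh on the hidden layer. All weights and inputs are made-up values chosen only for illustration.

```matlab
% Two stacked linear layers versus one equivalent linear layer.
W1 = [1 2; 3 4; 5 6];   b1 = [0.1; 0.2; 0.3];   % 2 inputs -> 3 hidden units
W2 = [1 -1 2];          b2 = 0.5;               % 3 hidden units -> 1 output
X  = [1 0; 0 1; 2 3];                           % three made-up examples
m  = size(X, 1);

% No activation on the hidden layer
z1          = X * W1' + repmat(b1', m, 1);
y_two_layer = z1 * W2' + b2;

% Single linear layer with the merged weights
W_eq        = W2 * W1;
b_eq        = W2 * b1 + b2;
y_one_layer = X * W_eq' + b_eq;
gap_linear  = max(abs(y_two_layer - y_one_layer));   % zero up to rounding error

% Same weights, but with tanh on the hidden layer
y_tanh   = tanh(z1) * W2' + b2;
gap_tanh = max(abs(y_tanh - y_one_layer));           % no longer matches the linear model
```

The first comparison shows that the hidden layer adds nothing when it is linear; only the version with tanh can represent something a linear regression cannot.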