Implementation of Logistic Regression

Submitted by: @import:stackexchange-codereview

Problem

Is this kind of vectorized operation the most efficient way to do this in MATLAB? Any criticism of my code? Am I doing something wrong? (I tested it several times and I think it works.) Note that I use J_history to store the history of the cost function so I can see how well it is converging (by plotting a graph, for instance).

main function

function [theta, J_history] = logRegGradientDescent(X, y, theta, alpha, num_iters)
% Given a matrix X whose rows are training examples and whose columns are
% features, and a column vector y of 0/1 targets, apply gradient descent to
% minimize the logistic cost function. alpha is the learning rate (step
% size) and num_iters is the number of gradient steps to take.

m = length(y);                     % number of training examples
J_history = zeros(num_iters, 1);   % column vector, not num_iters-by-num_iters

for iter = 1:num_iters

    % Gradient of the logistic (cross-entropy) cost with respect to theta.
    dLogisticCostFunction = (1/m) * X' * (logisticFunction(X,theta) - y);

    % Learning step
    theta = theta - alpha * dLogisticCostFunction;

    % Save the cost function for convergence analysis
    J_history(iter) = logRegCostFunction(X,y,theta);
end
end


logistic function

function h = logisticFunction(X,theta)
% Compute the logistic function.
% If X is an m-by-n matrix such as:
%
%    x1_ x2_ x3_ .. xn_;
%  [ x11 x12 x13 .. x1n;
%    x21 x22 x23 .. x2n;
%    ..  ..  ..  .. .. ;
%    xm1 xm2 xm3 .. xmn; ]
%
% and theta' is a vector:
%  [ t1, t2, t3 .. tn ]
%
% We calculate the logistic function for each row x of X:
% 1 / ( 1 + e^(-x*theta) )

h = 1 ./ ( 1  + exp(-X*theta) );
end


logistic cost function

function J = logRegCostFunction(X,y,theta)
% Compute the convex cost function for logistic regression, where
% if y = 1 and the logistic function predicts y = 0, cost -> inf
% and if y = 0 and the logistic function predicts y = 1, cost -> inf

% Number of training examples
m = length(y);

% Calculates the case where if y = 1, Cost = -log(h(x))
ify1 = log(logisticFunction(X,theta)).*y;

% Calculates the case where if y = 0, Cos

Solution

Your code looks fine and is vectorized! You could have written the logistic cost function in a single line, but I believe your approach is more readable. Good job!
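For illustration, the single-line form could look roughly like the sketch below. It assumes the usual cross-entropy cost with 1/m averaging (consistent with the gradient in the question, whose cost function is cut off in this copy), and the names `sigmoid` and `costOneLine` are made up for the example rather than taken from the original code.

```matlab
% Sketch only: sigmoid and costOneLine are illustrative names,
% not part of the original post.
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Cross-entropy cost as one expression.
% X is m-by-n, y is an m-by-1 vector of 0/1 labels, theta is n-by-1.
costOneLine = @(X, y, theta) ...
    -mean(y .* log(sigmoid(X*theta)) + (1 - y) .* log(1 - sigmoid(X*theta)));

% Tiny example: with theta = 0 every prediction is 0.5,
% so the cost equals log(2) regardless of the labels.
X = [1 0.5; 1 -1.2; 1 2.0; 1 -0.3];
y = [1; 0; 1; 0];
J0 = costOneLine(X, y, zeros(2, 1));
```

The trade-off is the one noted above: the expression is compact, but the two-step version with named intermediate terms is easier to read and debug.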

I don't think there are many more optimizations you can make to the basic form of logistic regression.

A possible addition, however, is regularization. It helps against overfitting, where a high-order polynomial fits the training data so closely that the resulting decision boundary no longer makes sense.

Regularization tries to prevent this by reducing the magnitude of the parameters θ (penalizing the parameters). You end up with a smoother curve that still fits the data and gives a better hypothesis.

To accomplish this you will have to introduce another parameter, λ (regularization parameter).
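As a rough sketch of what that could look like, the snippet below adds an L2 penalty of the form (λ/2m)·Σθⱼ² to the cost and the matching term to the gradient. This is one common convention, not necessarily the one the linked material uses. It assumes the first column of X is an intercept column of ones, so θ₁ is left unpenalized; `lambda`, `regCost`, `regGrad` and the toy data are invented for the example.

```matlab
% Sketch only: lambda, regCost and regGrad are illustrative names,
% not part of the original post.
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Assumes column 1 of X is the intercept column of ones, so theta(1)
% is excluded from the penalty (a common convention, assumed here).
regCost = @(X, y, theta, lambda) ...
    -mean(y .* log(sigmoid(X*theta)) + (1 - y) .* log(1 - sigmoid(X*theta))) ...
    + (lambda / (2 * length(y))) * sum(theta(2:end) .^ 2);

regGrad = @(X, y, theta, lambda) ...
    (1 / length(y)) * (X' * (sigmoid(X*theta) - y) + lambda * [0; theta(2:end)]);

% Small made-up data set with an intercept column; the classes overlap.
X = [ones(6, 1), [-2.0; -1.0; -0.5; 0.5; 1.0; 2.0]];
y = [0; 0; 1; 0; 1; 1];

alpha = 0.1;        % learning rate
num_iters = 500;    % number of gradient steps
lambda = 1;         % regularization strength (lambda = 0 disables it)
theta = zeros(2, 1);
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    theta = theta - alpha * regGrad(X, y, theta, lambda);
    J_history(iter) = regCost(X, y, theta, lambda);
end
```

Setting `lambda = 0` recovers the unregularized update from the question, and larger values pull the non-intercept parameters harder toward zero. Plotting `J_history` shows convergence in the same way as before.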

I suggest you look further into this yourself.

I hope this helps a bit.

Context

StackExchange Code Review Q#80128, answer score: 7