
Neural Network in Julia

Tags: neural-network, julia

Problem

I am currently trying to implement a neural net in Julia, with the goal of eventually implementing a stacked autoencoder. My code seems to work, but I would appreciate any constructive criticism. If there exists a style guide for Julia, I am not concerned with that; any other comments, however, would be very much welcome. I would also like to be able to write an implementation that can be extended to more complicated architectures without significant alterations to the basics of the code. The code below is not that in any way, but ideas on how to get there would be very helpful.

```
type ANN2

    #
    # Neural Network type...
    #

    # define vars
    weights::Dict
    bias::Dict
    As::Dict
    Ns::Dict
    Fs::Dict
    Ss::Dict
    weightdelta::Dict
    biasdelta::Dict
    shape::Array{Int64,1}
    numlayers::Int64
    averror::Float64

    # define methods
    forward::Function
    calcuate_deltas::Function    # (sic -- see the spelling note in the review)
    init::Function
    setshape::Function
    sgm::Function
    updateone::Function
    updateepoch::Function
    calculate_error::Function

    # Constructor
    function ANN2()
        this = new()

        this.weights = Dict{Int64,Any}()
        this.bias = Dict{Int64,Any}()
        this.As = Dict{Int64,Any}()
        this.Ns = Dict{Int64,Any}()
        this.Fs = Dict{Int64,Any}()
        this.weightdelta = Dict{Int64,Any}()
        this.biasdelta = Dict{Int64,Any}()
        this.Ss = Dict{Int64,Any}()
        this.numlayers = 0

        # Set the shape of the network
        this.setshape = function(shape)
            this.shape = shape
            this.numlayers = size(this.shape)[1] - 1
            return nothing
        end

        # initialise weights and bias
        this.init = function()
            for (ind,(a,b)) in enumerate(zip(this.shape[1:end-1],this.shape[2:end]))
                this.weights[ind] = rand(b,a)
                this.bias[ind] = rand(b)
            end
            return nothing
        end

        # Calculate output of network given one input
        this.forward = function (input::Array{Float64,1})
            this.As[0] = input
            for i = 1:this.numlayers
                this.Ns[i] = this.weights[i]*this.As[i-1] + this.bias[i]
                # (reconstructed -- the archive truncates the listing here;
                #  the review below confirms forward also fills in the
                #  activations As[i] and the cached sigmoid derivative Fs[i])
                this.As[i] = this.sgm(this.Ns[i])
                this.Fs[i] = this.As[i].*(1 - this.As[i])
            end
            return nothing
        end

        # (the remaining closures -- sgm, calcuate_deltas, updateone,
        #  updateepoch, calculate_error -- are lost to truncation; the
        #  calcuate_deltas loop is quoted in the review below)

        return this
    end
end
```

Solution

Coding Style

  • "Methods", as in Functions which are part of a type are not Julianic.



  • Instead use Functions with typed parameters



  • Rather than ann.forward(input::Array{Float64,1})



  • use forward(ann::ANN2, input::Array{Float64,1})



  • Functions which mutate there inputs should end with a bang (!)



  • So infact: forward!(ann::ANN2, input::Array{Float64,1})



  • It Vector{T} is a type alias for Array{T,1} it is cleaner to read



  • So forward!(ann::ANN2, input::Vector{Float64})
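
A minimal sketch of that style, assuming the ANN2 fields from the question and a free-function sgm sigmoid:

```
# forward! as a free function: the bang signals that it mutates ann's caches.
# Sketch only -- the field names (weights, bias, As, Ns, numlayers) are the
# question's; sgm is assumed to be defined as a free function.
function forward!(ann::ANN2, input::Vector{Float64})
    ann.As[0] = input
    for i = 1:ann.numlayers
        ann.Ns[i] = ann.weights[i]*ann.As[i-1] + ann.bias[i]
        ann.As[i] = sgm(ann.Ns[i])
    end
    ann.As[ann.numlayers]   # return the output layer's activations
end
```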



-
Don't over-specify the types. The input doesn't really have to be Float64; it could be any kind of number. So instead use:

  • forward!{N<:Number}(ann::ANN2, input::Vector{N})

  • The most important thing this will allow you to do is gradient checking with ForwardDiff.jl (sketched just below).

  • Gradient-checking your neural networks is very important, as the back-propagation algorithm is hard and fiddly, and it kind of works even if you have it wrong -- making the mistake hard to notice. People don't appreciate just how important doing a gradient check is.
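
A rough sketch of such a check, assuming a hypothetical scalar loss over a flat parameter vector and a hand-written backprop_gradient to compare against (none of these names are from the original):

```
using ForwardDiff

# Hypothetical scalar loss; because the network code is not hard-typed to
# Float64, ForwardDiff can push its dual numbers through it.
loss(w) = sum(abs2, predict(w, x) - y)   # predict, x, y: assumed to exist

g_ad = ForwardDiff.gradient(loss, w0)    # automatic-differentiation gradient
g_bp = backprop_gradient(w0)             # your hand-derived backprop gradient

maximum(abs, g_ad - g_bp)                # should be ~0 if backprop is right
```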



-
You do not need return statements at the end of functions; functions implicitly return their last expression. So rather than
return 1./(1+exp(-x)), you just write 1./(1+exp(-x)) as the last line of the sigmoid function.
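
For instance, a sketch of the sigmoid as a free function (the elementwise 1./ syntax follows the question's Julia version):

```
# The last (and only) expression is returned implicitly -- no `return` needed.
function sgm(x)
    1./(1 + exp(-x))
end

# Equivalently, as a one-line definition:
sgm(x) = 1./(1 + exp(-x))
```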

-
Returning nothing is not normal practice in Julia.

  • Functions which mutate (modify) one of their inputs normally return the modified version of that input, for fluent programming; e.g. sort!(xs) will sort xs in place and then return (the now sorted) xs.

  • Functions which don't modify their inputs normally have some output to return (otherwise, why were they called?). The exception is logging functions and that kind of thing, but they can just be allowed to return the return value of the last function they called, which will likely be nothing. (A sketch follows this list.)
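
Applied to the question's init, that convention looks something like this (a sketch, reusing the question's field names):

```
# Mutating initialiser: fills in weights and biases, then returns the
# network itself so calls chain fluently, e.g. forward!(init!(ann), input).
function init!(ann::ANN2)
    for (ind,(a,b)) in enumerate(zip(ann.shape[1:end-1], ann.shape[2:end]))
        ann.weights[ind] = rand(b, a)
        ann.bias[ind] = rand(b)
    end
    ann   # like sort!(xs) returning xs
end
```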



-
Use clearer names. As, Ns, Fs, Ss -- what are these? I don't know.

  • Ws and bs could be understood to be weights and biases, as that is very common notation -- but you don't use those; yours are named in full.

  • If you are using naming from a particular paper, you should add a link to that paper in a comment, and have a comment saying what each name is for.



  • I can guess As is the activations of each layer.

  • On reading closer, it looks like Fs[i] is the derivative of sigmoid(As[i]). Why are you calculating that during feedforward? It is part of the back-propagation step.

  • Ss is the error signal, I think.

  • Ns is the input to each sigmoid.

  • Working that out took me over 10 minutes.

  • Int-keyed Dicts are a bit of a code smell, particularly when they are consecutively indexed.

  • I see why you are using them -- you want zero-indexed lists.

  • I would rethink that, and if you still decide it is the best way, at least leave a comment so your future self knows they are actually lists. (One possible rework is sketched after this list.)
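
One possible rework (my sketch, not from the original answer): store the input activation in its own field and use ordinary 1-based Vectors for the per-layer values, so that index i always means layer i:

```
# Hypothetical reworked fields -- plain 1-based vectors instead of
# Int-keyed Dicts pretending to be zero-indexed lists.
type ANN3
    input::Vector{Float64}            # what used to be As[0]
    As::Vector{Vector{Float64}}       # As[i]: activations of layer i
    Ns::Vector{Vector{Float64}}       # Ns[i]: pre-sigmoid input to layer i
    weights::Vector{Matrix{Float64}}
    bias::Vector{Vector{Float64}}
end

# A layer's previous activation is then: i == 1 ? ann.input : ann.As[i-1]
```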



-
It would be clearer (and might be marginally faster) if you gave those Dicts (by which I mean lists) a fixed value type rather than
Any. They do have a consistent element type (I think they are all Vector{Float64}).

  • This would also allow the type checker to catch some logic errors (see the sketch below).
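
For example (a sketch; note the weights Dict actually holds matrices, since rand(b,a) makes a Matrix):

```
weights = Dict{Int64,Matrix{Float64}}()   # was Dict{Int64,Any}()
bias    = Dict{Int64,Vector{Float64}}()

weights[1] = rand(3, 2)   # fine
bias[1]    = rand(3)      # fine
# bias[2] = rand(3, 3)    # now an error right at the assignment, instead of
#                         # a silent Any that blows up somewhere downstream
```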



-
calcuate_deltas is misspelled; the correct spelling is calculate_deltas.

-
Rather than having if statements inside a for-loop that check whether this is the first iteration and then do almost entirely different code, you could instead peel that iteration out of the loop and iterate over only the remaining indexes.
So rather than:

for i in reverse(1:this.numlayers)
    if i == this.numlayers
        this.Ss[i] = this.Fs[i].*(this.As[i] - target)
        if avg
            this.weightdelta[i] = this.weightdelta[i]+rate.*(this.Ss[i]*this.As[i-1]')
            this.biasdelta[i] = this.biasdelta[i]+rate.*this.Ss[i]
        else
            this.weightdelta[i] = rate.*(this.Ss[i]*this.As[i-1]')
            this.biasdelta[i] = rate.*this.Ss[i]
        end
    else
        this.Ss[i] = this.Fs[i].*(this.weights[i+1]'*this.Ss[i+1])
        if avg
            this.weightdelta[i] = this.weightdelta[i]+rate.*(this.Ss[i]*this.As[i-1]')
            this.biasdelta[i] = this.biasdelta[i]+rate.*this.Ss[i]
        else
            this.weightdelta[i] = rate.*(this.Ss[i]*this.As[i-1]')
            this.biasdelta[i] = rate.*this.Ss[i]
        end
    end
end


do

this.Ss[this.numlayers] = this.Fs[this.numlayers].*(this.As[this.numlayers] - target)
if avg
    this.weightdelta[this.numlayers] = this.weightdelta[this.numlayers]+rate.*(this.Ss[this.numlayers]*this.As[this.numlayers-1]')
    this.biasdelta[this.numlayers] = this.biasdelta[this.numlayers]+rate.*this.Ss[this.numlayers]
else
    this.weightdelta[this.numlayers] = rate.*(this.Ss[this.numlayers]*this.As[this.numlayers-1]')
    this.biasdelta[this.numlayers] = rate.*this.Ss[this.numlayers]
end

for i in this.numlayers-1 :-1: 1
    this.Ss[i] = this.Fs[i].*(this.weights[i+1]'*this.Ss[i+1])
    if avg
        this.weightdelta[i] = this.weightdelta[i]+rate.*(this.Ss[i]*this.As[i-1]')
        this.biasdelta[i] = this.biasdelta[i]+rate.*this.Ss[i]
    else
        this.weightdelta[i] = rate.*(this.Ss[i]*this.As[i-1]')
        this.biasdelta[i] = rate.*this.Ss[i]
    end
end

Context

StackExchange Code Review Q#69628, answer score: 4
