Academic implementation of artificial neural network

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Problem

With some free time, I decided to study artificial neural networks as an academic exercise (not homework). Over the course of my studies, I decided to write a Python application that would allow me to create and train arbitrary feed forward networks so I could see the concrete math. So far, the test harness only includes simple AND, OR, NOT, and (slightly less simple) XOR with perfect inputs.

Quite honestly, in 8 years of professional development, I've never had my code reviewed. What I am looking for in code review is commentary on:

- my use of Python language features (not the decision to use Python, mind)
- the data structures and logical flows, especially if I have a logical flaw somewhere
- general program design

I'd also gladly welcome expert advice on ANN in general.

perceptron.py

```
"""
Equations from http://ecee.colorado.edu/~ecen4831/lectures/NNet3.html

t[i] = desired output of unit i
a[i] = actual output of unit i
a[i] = f(u[i]) = 1 / (1 + exp(-u[i])) # or experiment with other sigmoid functions
u[i] = sum(j=0..numinputs, W[i][j]*a[j]) + theta[i]

delta[i] = t[i] - a[i]
E = sum(i, pow(delta[i], 2)) # only over output units because we don't know what the activation of hidden units should be

dW[i][j] proportional to -(derivative(E) / derivative(W[i][j]))
. # apply chain rule
. = -(derivative(E)/derivative(a[i]))(derivative(a[i])/derivative(W[i][j]))
. = 2(t[i] - a[i])a[i](1-a[i])*a[j]
. (NOTE: df[i] / du[i] = f[i](1 - f[i]) = a[i](1 - a[i]))

if unit i is output:
. change_rate[i] = (t[i] - a[i]) * a[i] * (1-a[i])
if unit i is hidden, we don't know the target activation, so we estimate it:
. change_rate[i] = (sum(k, change_rate[k] * W[k][i])) * a[i] * (1-a[i])
. ASSUME: sum(k, change_rate[k] * W[k][i]) is approximately (t[i] - a[i])
. (The sum over k is the sum over the units that receive input from the ith unit.)

dW[i][j] = learning_rate * change_rate[i] * a[j] + a
```

Solution

Consider breaking the training logic out of the network so that there can be instances of non-trainable perceptrons. Presumably there would then be a mechanism for saving and loading weight data (using pickle, for example), which becomes easier once the network is no longer polluted with training data.
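A minimal sketch of that separation, assuming a hypothetical `Perceptron` class that only holds weights and evaluates inputs. The class name, the `save_weights`/`load_weights` helpers, and the hand-picked AND weights are illustrative and not taken from the original code:

```python
import os
import pickle
import tempfile


class Perceptron:
    """Hypothetical inference-only unit: holds weights, no training state."""

    def __init__(self, weights):
        # The last weight is the bias weight; a constant 1 is appended
        # to the inputs when evaluating.
        self.weights = list(weights)

    def evaluate(self, inputs):
        u = sum(x * w for x, w in zip(list(inputs) + [1], self.weights))
        return 0 if u < 0 else 1


def save_weights(perceptron, path):
    # Only the weights are persisted; no training data is involved.
    with open(path, "wb") as fh:
        pickle.dump(perceptron.weights, fh)


def load_weights(path):
    with open(path, "rb") as fh:
        return Perceptron(pickle.load(fh))


# Hand-picked weights that implement AND (an assumption for the demo).
and_gate = Perceptron([1, 1, -2])
path = os.path.join(tempfile.mkdtemp(), "and_gate.pkl")
save_weights(and_gate, path)

restored = load_weights(path)
results = [restored.evaluate((x, y)) for x in (0, 1) for y in (0, 1)]
print(results)
```

Because the saved object is just a list of numbers, the file stays small, and any format other than pickle (JSON, for instance) would work equally well.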

The TrainingStrategy "enum" exists only to control program flow. There should be a more organic way of doing this, such as separate train_pattern(...) and train_epoch(...) methods.
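One way this could look, as a sketch rather than the original design: a hypothetical `Trainer` class exposes `train_pattern` and `train_epoch` directly, so the caller selects the strategy by choosing which method to call. The perceptron learning rule, the step activation, and the `learning_rate` default are stand-ins for whatever update rule the real code uses.

```python
class Trainer:
    """Hypothetical trainer for a single step-activation unit."""

    def __init__(self, weights, learning_rate=1):
        self.weights = list(weights)  # last entry is the bias weight
        self.learning_rate = learning_rate

    def predict(self, inputs):
        u = sum(x * w for x, w in zip(list(inputs) + [1], self.weights))
        return 0 if u < 0 else 1

    def train_pattern(self, inputs, target):
        """Update the weights from one pattern and return its error."""
        error = target - self.predict(inputs)
        for i, x in enumerate(list(inputs) + [1]):
            self.weights[i] += self.learning_rate * error * x
        return error

    def train_epoch(self, training_set):
        """Visit every pattern once; return how many were wrong when visited."""
        return sum(abs(self.train_pattern(i, t)) for i, t in training_set)


AND_SET = [((x, y), x & y) for x in (0, 1) for y in (0, 1)]

trainer = Trainer([0, 0, 0])
for epoch in range(100):  # the cap of 100 epochs is arbitrary
    if trainer.train_epoch(AND_SET) == 0:
        break

predictions = [trainer.predict(inputs) for inputs, _ in AND_SET]
print(predictions)
```

Per-pattern and per-epoch training then differ only in which method the caller invokes, so no flag has to be threaded through the training loop.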

The training sets include the node index with each expected output. This prevents using the same training set on different configurations, or at least severely complicates it. It would be more flexible for the expected outputs to be a tuple that the training logic interprets based on the network configuration.
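As a sketch of that idea, the training set below stores expected outputs as plain positional tuples, and a hypothetical `bind_targets` helper pairs them with whatever output node indices the current configuration has. The helper name and the two example configurations are assumptions for illustration.

```python
BITS = (0, 1)
# Expected outputs are positional tuples with no node indices baked in.
TSET_XOR = [((x, y), (x ^ y,)) for x in BITS for y in BITS]


def bind_targets(output_indices, targets):
    """Pair each output node index (in ascending order) with its target."""
    output_indices = sorted(output_indices)
    if len(output_indices) != len(targets):
        raise ValueError(
            "network has {} outputs but the pattern has {} targets".format(
                len(output_indices), len(targets)))
    return dict(zip(output_indices, targets))


# The same training set serves two configurations whose output node differs.
small_net_outputs = [3]  # e.g. inputs 0-1, hidden 2, output 3
wide_net_outputs = [5]   # e.g. inputs 0-1, hidden 2-4, output 5

for inputs, targets in TSET_XOR:
    print(inputs,
          bind_targets(small_net_outputs, targets),
          bind_targets(wide_net_outputs, targets))
```

The length check makes a mismatch between a training set and a network configuration fail loudly instead of silently training against the wrong node.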

Here is an example that uses a graph to implement node connections (it doesn't include training logic):

```python
class Node:
    """Base class: a node knows which downstream inputs it feeds."""

    def __init__(self, node_index):
        self.node_index = node_index
        self.output_map = []  # (target node, input position) pairs

    def map_output(self, node, position):
        self.output_map.append((node, position))

    def __repr__(self):
        return '{}:{}'.format(self.node_index, repr(self.output_map))


class InputNode(Node):
    def __init__(self, node_index):
        super().__init__(node_index)
        self.input = 0

    def evaluate(self):
        # Push the raw input value to every connected neuron.
        for node, position in self.output_map:
            node.input[position] = self.input


class Neuron(Node):
    def __init__(self, node_index, num_inputs):
        super().__init__(node_index)
        self.weights = [0] * (num_inputs + 1)  # last weight is the bias
        self.input = [0] * num_inputs + [1]    # constant 1 feeds the bias
        self.u = 0
        self.a = 0

    def evaluate(self):
        # Weighted sum of the inputs followed by a step activation.
        self.u = sum(x * w for x, w in zip(self.input, self.weights))
        self.a = 0 if self.u < 0 else 1

        for node, position in self.output_map:
            node.input[position] = self.a


def configure(config):
    # config = (input node indices, [(neuron index, source node indices), ...])
    in_idx, map_output = config
    nodes = [InputNode(idx) for idx in in_idx]
    for node_index, mapping in map_output:
        node = Neuron(node_index, len(mapping))
        nodes.append(node)
        for position, in_node_index in enumerate(mapping):
            nodes[in_node_index].map_output(node, position)
    return nodes


def test(nodes, tset):
    input_nodes = [node for node in nodes if isinstance(node, InputNode)]
    output_nodes = [node for node in nodes if not node.output_map]
    first = min(node.node_index for node in output_nodes)

    for inputs, targets in tset:
        # Feed every input node, not just the first two.
        for node, value in zip(input_nodes, inputs):
            node.input = value

        for node in nodes:
            node.evaluate()

        for node in output_nodes:
            idx = node.node_index
            tn = targets[idx - first]
            print('V{0} = {1}, A{0} = {2}, T{0} = {3}, D{0} = {4}'.format(
                idx, node.input, node.a, tn, tn - node.a))


tmnodes = configure(((0, 1, 2, 3),
                     [(4, (0, 1, 2, 3)), (5, (0, 1, 2, 3)),
                      (6, (4, 5)), (7, (4, 5)), (8, (4, 5)), (9, (4, 5))]))
tnodes = configure(((0, 1), [(2, (0, 1)), (3, (0, 2, 1))]))

# The same topology as tnodes, wired by hand.
mnodes = [InputNode(0), InputNode(1), Neuron(2, 2), Neuron(3, 3)]
mnodes[0].map_output(mnodes[2], 0)
mnodes[0].map_output(mnodes[3], 0)
mnodes[1].map_output(mnodes[2], 1)
mnodes[1].map_output(mnodes[3], 2)
mnodes[2].map_output(mnodes[3], 1)

print(tnodes)
print(mnodes)
print(tmnodes)

BITS = (0, 1)
TSET_XOR = [((x, y), (x ^ y,)) for x in BITS for y in BITS]
TSET_ENCODER = [((w, x, y, z), (w, x, y, z))
                for w in BITS for x in BITS for y in BITS for z in BITS]

test(tnodes, TSET_XOR)
test(mnodes, TSET_XOR)
test(tmnodes, TSET_ENCODER)
```
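Because every weight starts at zero and the step function returns 1 when u is 0, each neuron above outputs 1 until its weights are set. The sketch below is a compact stand-in for the classes above (not the same code): it evaluates the `tnodes` topology with hand-picked weights, chosen here purely for illustration, that make node 3 compute XOR.

```python
def step(u):
    # Same threshold as in Neuron.evaluate: 0 below zero, otherwise 1.
    return 0 if u < 0 else 1


def neuron(inputs, weights):
    # The trailing 1 is the bias input, so the last weight is the bias weight.
    return step(sum(x * w for x, w in zip(list(inputs) + [1], weights)))


# Hand-picked weights (an assumption, not trained values).
W2 = [1, 1, -2]      # node 2 sees (in0, in1) and acts as AND
W3 = [1, -2, 1, -1]  # node 3 sees (in0, node2, in1)


def xor_net(x, y):
    hidden = neuron((x, y), W2)
    return neuron((x, hidden, y), W3)


for x in (0, 1):
    for y in (0, 1):
        print((x, y), xor_net(x, y))
```

Assigning the same lists to `tnodes[2].weights` and `tnodes[3].weights` should give the graph version the same behaviour, since `Neuron` keeps the bias weight last as well.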

Context

StackExchange Code Review Q#18164, answer score: 3
