
ANN - Backpropagation with multiple output neurons

Submitted by: @import:stackexchange-cs·Mar 10, 2026·


machine-learning artificial-intelligence cs stackoverflow algorithms neural-networks


Problem

Can I utilize the backpropagation algorithm in a layered, feed-forward ANN in instances where there are multiple output neurons? If so, how? Links to (somewhat) comprehensible resources would be greatly appreciated, if nothing else.

The background is that I'm working on a simple C program that can create, propagate, and train dynamically sized, layered feed-forward ANNs. For the most part, everything's been going swimmingly. However, I'm having a little trouble wrapping my head around backpropagation. My main concern right now is how to use the backpropagation method for training a network that has multiple output neurons. All the examples/explanations I've found only use one output neuron. I'm assuming this is for the sake of simplicity. But I'm wondering if, perhaps, this is because the backpropagation algorithm is only designed to work with one output neuron at a time. In other words, it can only be applied relative to one output neuron every time a feed-forward ANN is propagated. This would make sense, as it is known to be among the simplest ANN training methods.

Here are links to my aforesaid resources (or at least, the ones I found intuitive):

http://home.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html

https://www.youtube.com/watch?v=GlcnxUlrtek

https://www.youtube.com/watch?v=IruMm7mPDdM (I'm actually not entirely certain whether this explanation accounted for multiple output neurons; if it did, it must've gone over my head)

Thanks!

Solution

Backpropagation works for feed-forward ANNs with multiple output neurons.

For an output neuron $z_r$ in an ANN, the delta function is

$$
\delta_r = \frac{\partial f_r(a_r)}{\partial a_r} \, e_r,
$$
where $f_r(a_r)$ is the activation function of $z_r$, $a_r$ is the weighted sum of the inputs of $z_r$, and $e_r = z_r - y_r$ is the error of $z_r$ ($y_r$ is the target value). Each output neuron gets its own delta, computed from its own error.
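Since the question mentions a C program, here is a minimal sketch of the output-layer formula above. It assumes a logistic sigmoid activation, for which the derivative can be written through the neuron's output as $z_r (1 - z_r)$; the answer itself does not fix an activation. The function name `output_deltas` and the flat-array layout are hypothetical, chosen only for illustration.

```c
#include <stddef.h>

/*
 * Output-layer deltas for n_out output neurons:
 *   delta[r] = f'(a_r) * (z[r] - y[r])
 * ASSUMPTION: f is the logistic sigmoid, so f'(a_r) = z[r] * (1 - z[r]).
 * z holds the network outputs, y the target values.
 */
void output_deltas(const double *z, const double *y, double *delta, size_t n_out)
{
    for (size_t r = 0; r < n_out; ++r) {
        double error = z[r] - y[r];          /* e_r */
        double fprime = z[r] * (1.0 - z[r]); /* sigmoid derivative */
        delta[r] = fprime * error;
    }
}
```

The loop body never looks at any other output neuron, so going from one output to several only changes `n_out`.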

For a hidden neuron $z_k$, the delta function is the weighted sum of the deltas of the neurons in the next layer (in the forward direction), i.e. the neurons that $z_k$ feeds into, written $post(k)$. So, for a neuron in the layer directly before the output layer, the sum runs over all output neurons, and the delta function is:

$$
\delta_k = \frac{\partial f_k(a_k)}{\partial a_k} \sum_{r \in \operatorname{post}(k)} \delta_r w_{kr},
$$

where $w_{kr}$ is the weight between neurons $z_k$ and $z_r$. I suggest having a look at the Wikipedia article about Backpropagation, especially the subsection 'finding the derivative of the error'. The link at the end of that section (https://www4.rgu.ac.uk/files/chapter3%20-%20bp.pdf) may also help; the first five pages or so give a good introduction.
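The hidden-layer formula can be sketched in C as follows. The names `hidden_deltas` and `update_weights`, the row-major weight layout, and the learning rate `eta` are assumptions made for illustration, not part of the answer. The weight update is the usual gradient-descent step for a squared-error loss, $w_{kr} \leftarrow w_{kr} - \eta \, \delta_r z_k$, which the answer does not spell out.

```c
#include <stddef.h>

/*
 * Hidden-layer deltas:
 *   delta[k] = f'(a_k) * sum over r of delta_next[r] * w[k][r]
 * w is stored row-major with n_next columns: w[k * n_next + r] is the
 * weight from neuron k to neuron r in the next layer (assumed layout).
 * fprime[k] holds f'_k(a_k), precomputed by the caller for whatever
 * activation function is in use.
 */
void hidden_deltas(const double *fprime, const double *w,
                   const double *delta_next, double *delta,
                   size_t n_hidden, size_t n_next)
{
    for (size_t k = 0; k < n_hidden; ++k) {
        double sum = 0.0;
        for (size_t r = 0; r < n_next; ++r)
            sum += delta_next[r] * w[k * n_next + r];
        delta[k] = fprime[k] * sum;
    }
}

/*
 * Gradient-descent step (assumed update rule, squared-error loss):
 *   w[k][r] -= eta * delta_next[r] * z[k]
 * z[k] is the output of neuron k, eta the learning rate.
 */
void update_weights(double *w, const double *z, const double *delta_next,
                    double eta, size_t n_hidden, size_t n_next)
{
    for (size_t k = 0; k < n_hidden; ++k)
        for (size_t r = 0; r < n_next; ++r)
            w[k * n_next + r] -= eta * delta_next[r] * z[k];
}
```

With several output neurons, `n_next` is simply greater than one for the last hidden layer, and the inner loop adds one term per output neuron. Deltas should be computed for all layers before any weights are changed, so that the sums use the weights from the forward pass.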

Context

StackExchange Computer Science Q#35266, answer score: 4
