CUDA Kernel - Neural Net

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

neural, kernel, cuda, net

Problem

I'm building a spiking neural net (recurrent, integrate-and-fire), and I'd like to know how to reduce the warp divergence (and any other problems) I may have.
Here's an example with a few hand-placed neurons and synapses to make it easier to follow. I uploaded the whole code to a Git repo for faster access; run `make tui` and then `./cudasnn` to try it.

The very basic workflow is to execute the 4 kernels (explanation in the comments) one after another.

After 5000 cycles, here's the time in ms for each kernel in order:

1 - 0.196ms

2 - 3.558ms

3 - 0.038ms

4 - 4.416ms to 6.278ms

My code is split into three files, whose names are self-explanatory.

NN.hpp (which contains my structures)

```
#ifndef NN_HPP
# define NN_HPP

# include

/* Window Setting -- for SFML, no need here */
# define WIDTH 1280
# define HEIGHT 720
# define XOFFSET -0
# define YOFFSET -95

/* Network Settings */
# define STIMULUS 1
# define INHIBITION -1
# define STIMULUS_RATIO 80
# define SPIKE_BUFFER 4

# define INPUT 0
# define OUTPUT 1
# define HIDDEN 2

typedef struct s_neuron_info
{
float x, y, z;
char gid;
unsigned char group; // hidden, input, output
} t_neuron_info;

typedef struct s_neuron
{
short in_time;
float in_value;
int next_time;
float action_potential;
float threshold; // fire when threshold reached
float weight;
char type; // stimulus, inhibition
char carry;
} t_neuron;

typedef struct s_synapse
{
int id_in;
int id_out;
int axonal_delay; // in timestep
} t_synapse;

typedef struct s_spike
{
int syn_id;
int id_out;
int start_t, end_t; /
```

Solution

As a first step, remove as many conditional branches as possible by replacing them with branch-free arithmetic, so that every thread in a warp executes the same instructions.
You also added a lot of conditional returns for error checking; these can be removed if your arrays are set up to accommodate all inputs.
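One possible reading of "set up to accommodate all inputs" is to pad each array up to a multiple of the block size, so that every launched thread owns a valid slot and a bounds-check return is no longer needed. The sketch below is host-side C++ under that assumption; `padded_count` and `make_padded` are hypothetical helper names, not part of the original code.

```cpp
#include <cstddef>
#include <vector>

// Round `count` up to the next multiple of `block_size` (block_size > 0).
// Hypothetical helper: sizes an array so that grid_size * block_size
// threads all index valid memory.
inline std::size_t padded_count(std::size_t count, std::size_t block_size) {
    return (count + block_size - 1) / block_size * block_size;
}

// Copy `src` and append value-initialized (zeroed) padding entries up to
// the padded length. Assumption: a zeroed entry is inert for the kernels
// that will process the array.
template <typename T>
std::vector<T> make_padded(const std::vector<T>& src, std::size_t block_size) {
    std::vector<T> out(src);
    out.resize(padded_count(src.size(), block_size));
    return out;
}
```

With 1000 neurons and 256 threads per block, `padded_count(1000, 256)` gives 1024, which is exactly four full blocks. This only works if the padding entries can never fire or emit spikes, so whether a zeroed neuron is actually inert has to be checked against the kernel logic.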

Example of converting a branch into branch-free arithmetic:

if (n[idx].carry)
{
  n[idx].action_potential = 0.0f;
  n[idx].carry = 0;
}

becomes (this is only equivalent if `carry` is always exactly 0 or 1):

n[idx].action_potential = n[idx].action_potential - (n[idx].carry * n[idx].action_potential);
n[idx].carry = 0;
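To check that the two forms really agree, they can be put side by side in a small host-testable sketch. `NeuronState`, `reset_branchy`, `reset_branchless`, `check_forms_agree` and the `HOST_DEVICE` macro are hypothetical names introduced for illustration. The struct keeps only the two fields involved, and `carry` is assumed to hold only 0 or 1.

```cpp
#include <cassert>

// Expands to CUDA qualifiers under nvcc and to nothing in a plain C++
// build, so the same helpers can be checked on the host.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical trimmed-down neuron: only the two fields used here.
struct NeuronState {
    float action_potential;
    char  carry;  // assumed to be exactly 0 or 1
};

// Original form: branch on the flag.
HOST_DEVICE inline void reset_branchy(NeuronState& n) {
    if (n.carry) {
        n.action_potential = 0.0f;
        n.carry = 0;
    }
}

// Branch-free form: subtract carry * potential, which is the whole
// potential when carry == 1 and zero when carry == 0.
HOST_DEVICE inline void reset_branchless(NeuronState& n) {
    n.action_potential = n.action_potential - n.carry * n.action_potential;
    n.carry = 0;
}

// Host-side check that both forms agree for carry in {0, 1}
// and a finite potential.
inline void check_forms_agree(float potential) {
    for (int carry = 0; carry <= 1; ++carry) {
        NeuronState a{potential, static_cast<char>(carry)};
        NeuronState b = a;
        reset_branchy(a);
        reset_branchless(b);
        assert(a.action_potential == b.action_potential);
        assert(a.carry == b.carry);
    }
}
```

Calling `check_forms_agree(0.75f)` or `check_forms_agree(-2.5f)` passes without tripping an assertion. The arithmetic form breaks down if `carry` can take other nonzero values or if the potential is NaN or infinite. For a branch this short the compiler may already emit predicated instructions instead of a divergent jump, so it is worth profiling both versions before committing to the rewrite.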

Context

StackExchange Code Review Q#95874, answer score: 2
