patterncMinor

Mandelbrot fractal with MPI

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

fractals codereview c stackoverflow multiprocessing mpi performance

fractalwithmpimandelbrot

Problem

I'm not very familiar with MPI. I've written this little piece of code to draw the Mandelbrot fractal processing each row of the image in parallel. It works but it's really slow. I've written code to perform the same task with OpenMP and with no parallelization, and both run instantly with the default parameters, whereas the MPI version runs in about 2s in my computer. Not only that, the OpenMP version used 33% less code and less memory. I suppose some of this improvements come from the fact that using threads for this task might be better (the workload distribution was different with threads too, not one line per thread), but I was wondering if I could improve my MPI code in any way.

```
/*
* mandelbrot.c
*
* Draws the Mandelbrot fractal using MPI. Optionally write it to a csv file.
*
* Compiling instructions:
* mpicc mandelbrot.c -std=c99 -o output_file
*
* Running instructions:
* mpirun -np NUM_RANKS output_file [FNAME] [W] [H] [MAX_ITER] [X0] [Y0] [XN]
* [YN]
*
* For every integer point in [(0, 0), (W, H)], check if its mapping to
* [(X0, Y0), (XN, YN)] falls in the Mandelbrot set with at most MAX_ITER.
* Write the results to a CSV file called FNAME only if the argument FNAME
is passed and valid (invalid e.g. on nix: "/"). For defaults, see main().
*
* Note: NUM_RANKS must be >= 2
*
* Note: W, H and MAX_ITER are unsigned but bounded by the max value of int.
* X0, Y0, XN and YN are floats.
*
*/

#include
#include
#include
#include "mpi.h"

const int MASTER = 0, TAG = 1;
int STOP = -1;

unsigned int bailout(float complex z, unsigned int max_iters) {
/*
* Returns i = 3) w = (unsigned)atoi(argv[2]);
if (argc >= 4) h = (unsigned)atoi(argv[3]);
if (argc >= 5) max_iters = (unsigned)atoi(argv[4]);
if (argc >= 6) x0 = atof(argv[5]);
if (argc >= 7) y0 = atof(argv[6]);
if (argc >= 8) xn = atof(argv[7]);
if (argc >= 9) yn = atof(argv[8]);
float scale_x = (xn - x0) / (float)w;
float scale_y = (yn - y0) / (float)h;
unsigned

Solution

I can see two reasons why your program might be slow.

The first one is that for each row you allocate memory and then free it. If you move the allocation outside of the while loop and reuse the memory then this will get rid of a whole bunch of unnecessary memory allocation operations.

The second one is probably more important: MPI stands for Message Passing Interface. This means that communication between nodes is done by passing messages around. The easiest to imagine an MPI program is to assume that each node is a computer and that the program is running on all these computers in parallel and that the programs can exchange messages over a network.

Hence one goal for a performant MPI program is to minimize the number of message exchanges while maximising the work each individual node is performing.

What you have done is to split your problem in very small chunks which you are then distributing to the workers. What you should do instead: If you have N workers then split up your program into N parts and send each part to a worker. This is essentially the minimal number of messages to exchange if you want all N workers to participate in the solving of the problem.

So rather than sending one row at a time send height / N rows to each worker, let them compute all of them and gather the result back.

Context

StackExchange Code Review Q#78124, answer score: 5

Revisions (0)

No revisions yet.