Are there absolute reasons to prefer row/column-major memory ordering?

Submitted by: @import:stackexchange-cs

Problem

I've heard it said that "Fortran uses column-major ordering because it's faster", but I'm not sure that's true. Certainly, matching column-major data to a column-major implementation will outperform a mixed setup, but I'm curious whether there's any absolute reason to prefer row- or column-major ordering. To illustrate the idea, consider the following thought experiment about three of the most common (mathematical) array operations:

Vector-vector inner products

We want to compute the inner product of two equal-length vectors, a and x:
$$
b = \sum_i a_i x_i.
$$
In this case, both a and x are "flat"/one-dimensional and accessed sequentially, so there's really no row- or column-major consideration.

Conclusion: Memory ordering doesn't matter.
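
As a minimal sketch (Python, with a hypothetical helper name `dot`), the inner product reaches both operands through a single index, so there is no second dimension whose layout could matter:

```python
def dot(a, x):
    """Inner product of two equal-length flat sequences."""
    if len(a) != len(x):
        raise ValueError("vectors must have equal length")
    total = 0
    # A single index walks both operands front to back.
    for i in range(len(a)):
        total += a[i] * x[i]
    return total


b = dot([1, 2, 3], [4, 5, 6])  # 1*4 + 2*5 + 3*6 = 32
```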

Matrix-vector inner products

$$
b_i = \sum_j A_{ij} x_j
$$

The naive multiplication algorithm traverses "across" A and "down" x. Again, x is already flat, so its sequential elements are always adjacent; in A, it is the elements adjacent within a row that are most often accessed together (and I suspect this is likely true for more sophisticated matrix multiplication algorithms such as the Strassen or Coppersmith-Winograd algorithms).

Conclusion: Row-major ordering is preferred.

(If you let vectors have transposes you can define a left-multiplication of matrices, $x^T A$, in which case column-major does become preferable, but I think it's conceptually simpler to keep vectors transposeless and define this as $A^T x$.)
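
To make the access pattern concrete, here is a small sketch in Python. It assumes A is stored in one flat list and uses two hypothetical index functions, `row_major` and `col_major`, to map (i, j) to an offset. The naive loop is the same in both cases; only the order in which offsets of A are touched differs.

```python
def row_major(i, j, n_rows, n_cols):
    # Elements of one row are adjacent.
    return i * n_cols + j


def col_major(i, j, n_rows, n_cols):
    # Elements of one column are adjacent.
    return j * n_rows + i


def matvec(flat_a, x, n_rows, n_cols, offset):
    """Naive b_i = sum_j A_ij x_j; also returns the offsets of A touched, in order."""
    b = [0] * n_rows
    touched = []
    for i in range(n_rows):
        for j in range(n_cols):
            k = offset(i, j, n_rows, n_cols)
            touched.append(k)
            b[i] += flat_a[k] * x[j]
    return b, touched


# A = [[1, 2, 3], [4, 5, 6]] stored both ways.
a_rows = [1, 2, 3, 4, 5, 6]
a_cols = [1, 4, 2, 5, 3, 6]
x = [1, 0, 2]

b_r, t_r = matvec(a_rows, x, 2, 3, row_major)  # t_r == [0, 1, 2, 3, 4, 5]
b_c, t_c = matvec(a_cols, x, 2, 3, col_major)  # t_c == [0, 2, 4, 1, 3, 5]
```

Both calls produce the same b; the row-major run walks the buffer sequentially, while the column-major run jumps by `n_rows` on every inner iteration.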

Matrix-matrix inner products

$$
B_{ik} = \sum_{j} A_{ij} X_{jk}
$$

One more time, the schoolbook algorithm traverses across A and down X, so one of those traversals will always be misaligned with the memory layout.

Conclusion: Memory ordering doesn't matter.
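
The symmetry can be checked with a short Python sketch. It assumes square n x n operands that share one layout and looks only at the inner j loop of the schoolbook algorithm; `inner_loop_strides` is a hypothetical helper, not part of any library.

```python
def inner_loop_strides(n, layout):
    """Return (stride in A, stride in X) between consecutive iterations of
    the inner j loop of B_ik = sum_j A_ij X_jk, for n x n operands."""
    if layout == "row":
        def offset(r, c):
            return r * n + c
    elif layout == "col":
        def offset(r, c):
            return c * n + r
    else:
        raise ValueError("layout must be 'row' or 'col'")

    i, j, k = 0, 0, 0
    stride_a = offset(i, j + 1) - offset(i, j)  # A_ij -> A_i(j+1): across A
    stride_x = offset(j + 1, k) - offset(j, k)  # X_jk -> X_(j+1)k: down X
    return stride_a, stride_x


row_strides = inner_loop_strides(4, "row")  # (1, 4): A sequential, X strided
col_strides = inner_loop_strides(4, "col")  # (4, 1): A strided, X sequential
```

Under either layout, exactly one operand is read with unit stride and the other with stride n, which is the sense in which neither ordering wins for this loop.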

Additional consideration: strings & text

ASCII (or similar) strings are most frequently read across-and-down. There's a lot more to consider since a multidimensional array of characters could be ragged (different length rows, e.g. in storing the lines of

Solution

Whether row-major or column-major order is more efficient depends on the storage access patterns of the specific application.

The underlying principle is that accessing storage at sequential locations tends to be the most efficient pattern, whereas accessing storage at disparate locations incurs an overhead in seeking to the data on each iteration. Organising the storage to suit the algorithms an application typically performs on its data can therefore result in a performance gain.
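
A rough way to see this without timing anything is to count how often consecutive accesses fall into a different fixed-size block of storage. The sketch below is a toy model in Python: the block size `LINE` and the helper `line_switches` are assumptions made for illustration, not measurements of real hardware.

```python
LINE = 8  # hypothetical: elements per block (e.g. a cache line); an assumption


def line_switches(offsets, line=LINE):
    """Count how many consecutive accesses land in a different block."""
    switches = 0
    for prev, cur in zip(offsets, offsets[1:]):
        if prev // line != cur // line:
            switches += 1
    return switches


n = 16  # n x n matrix stored row-major in one flat buffer
by_rows = [i * n + j for i in range(n) for j in range(n)]  # sequential walk
by_cols = [i * n + j for j in range(n) for i in range(n)]  # stride-n walk

sequential = line_switches(by_rows)  # 31: one switch per block boundary
strided = line_switches(by_cols)     # 255: every access changes block
```

In this model, traversing the same data against its layout changes block on every single step, which is the overhead the paragraph above describes.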

It's also worth considering what we mean by rows and columns. By a "row" we typically mean a set of fields that relate to one logical/conceptual entity - a row contains fields (in a hierarchical relationship). By a "column", we typically mean a set of fields that share a common meaning or type, but where each field relates to separate logical/conceptual entities - a column is a cross-cut of fields taken from multiple logical entities.

I suspect row-major ordering tends more often to be the default because it is more common for algorithms to want to access the related fields of the same logical entity at once than to access fields with the same meaning across different entities at once.

I suspect also, given the definition of rows and columns above, that row-major aligns with how programmers are most readily inclined to think about accessing data - it's most likely to accord with their mental model of how data is organised. Deviating to column-major is something you then do for a specific performance or algorithmic reason, not by default.
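
The row/column distinction above can be sketched with a tiny table in Python. The records and field names are invented for illustration. Each layout makes one kind of access a contiguous slice and the other a strided one.

```python
# Hypothetical records: (name, age, city) for three entities.
rows = [
    ("alice", 30, "paris"),
    ("bob", 41, "oslo"),
    ("carol", 25, "lima"),
]
n_rows, n_fields = len(rows), 3

# Row-major: the fields of one entity are adjacent in the flat buffer.
row_major = [field for record in rows for field in record]

# Column-major: the values of one field across entities are adjacent.
col_major = [record[f] for f in range(n_fields) for record in rows]

# All fields of entity 1: a contiguous slice in row-major ...
entity_1 = row_major[1 * n_fields:2 * n_fields]
# ... but a strided read in column-major.
entity_1_strided = col_major[1::n_rows]

# The "age" field (index 1) of every entity: contiguous in column-major ...
ages = col_major[1 * n_rows:2 * n_rows]
# ... but a strided read in row-major.
ages_strided = row_major[1::n_fields]
```

Which of the two contiguous reads an application performs more often is what decides the better layout for it.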

Context

StackExchange Computer Science Q#153475, answer score: 4
