patternMinor
Why is memory one dimensional?
Viewed 0 times
onewhydimensionalmemory
Problem
Addresses in system programming languages like C are one dimensional (i.e. one number). This forces the programmer to make a decision whether matrices are stored "row major" or "column major" causing the opposite access to be slower.
This is somewhat surprising, since I think I remember that physical RAM is actually a 2 dimensional structure. And looking at the free part of this article. It seems like there exists a burst mode for accessing multiple elements from the same row.
Why does this burst mode not exist for columns, and why do programming languages not allow for true two dimensional storage of data? This would allow for "row major"/"column major" agnostic designs, which would probably speed up a lot of linear algebra libraries. Which would then trickle down to statistics and machine learning.
(Crosspost from StackOverflow)
This is somewhat surprising, since I think I remember that physical RAM is actually a 2 dimensional structure. And looking at the free part of this article. It seems like there exists a burst mode for accessing multiple elements from the same row.
Why does this burst mode not exist for columns, and why do programming languages not allow for true two dimensional storage of data? This would allow for "row major"/"column major" agnostic designs, which would probably speed up a lot of linear algebra libraries. Which would then trickle down to statistics and machine learning.
(Crosspost from StackOverflow)
Solution
Conventional RAM is organized in rows and columns. The CPU selects one row, then the RAM hardware gathers all cells in columns within that row and delivers them to the CPU.
If you were able to also select a column and gather all cells within that column, you would get a different set of bits. No guarantee that they are ordered in any useful way for you. No guarantee that for example a 64 bit integer would come out in sequential bits, no guarantee even for a byte. And the hardware needed is at least doubled, which likely means half the memory for the same price. And I can’t see how this would help you with a 100x100 array.
Where I have seen something similar is in graphics cards. Much drawing is done in a small rectangular area, so you want a cache line not to contain a long stripe, but a rectangular area of pixels. You start with x and y coordinates, then the lower 4 bit of x and y address bytes within a 256 byte cache line, and the other bits address cache lines. As a result, a 16 pixel long line in any direction can fit into a cache line, unlike normal arrangement where for a vertical line each pixel would be in its own cache line.
If you were able to also select a column and gather all cells within that column, you would get a different set of bits. No guarantee that they are ordered in any useful way for you. No guarantee that for example a 64 bit integer would come out in sequential bits, no guarantee even for a byte. And the hardware needed is at least doubled, which likely means half the memory for the same price. And I can’t see how this would help you with a 100x100 array.
Where I have seen something similar is in graphics cards. Much drawing is done in a small rectangular area, so you want a cache line not to contain a long stripe, but a rectangular area of pixels. You start with x and y coordinates, then the lower 4 bit of x and y address bytes within a 256 byte cache line, and the other bits address cache lines. As a result, a 16 pixel long line in any direction can fit into a cache line, unlike normal arrangement where for a vertical line each pixel would be in its own cache line.
Context
StackExchange Computer Science Q#123565, answer score: 2
Revisions (0)
No revisions yet.