patternMajor
In algorithms for matrix multiplication (eg Strassen), why do we say n is equal to the number of rows and not the number of elements in both matrices?
Viewed 0 times
rowswhythenumberstrassenmultiplicationequalelementssayalgorithms
Problem
In Strassen's algorithm, we calculate the time complexity based on n being the number of rows of the square matrix. Why don't we take n to be the total number of entries in the matrices (so if we were multiplying two 2x2 matrices, we would have n = 8 for the four entries in each matrix)?
Then, using the naïve method of multiplying matrices, we would end up with only n multiplications and n/2 additions.
For instance, multiplying [1 2, 3 4] by [5 6, 7 8] yields [15+27 16+28, 35+47 36+48]. Here, n = 8 and we are doing n = 8 multiplications and n/2 = 4 additions.
So even a naïve multiplication algorithm would yield a time complexity of O(n).
Of course, this reasoning is wrong because the time complexity cannot be linear but I don't understand why.
I would appreciate any input. Thank you!
Then, using the naïve method of multiplying matrices, we would end up with only n multiplications and n/2 additions.
For instance, multiplying [1 2, 3 4] by [5 6, 7 8] yields [15+27 16+28, 35+47 36+48]. Here, n = 8 and we are doing n = 8 multiplications and n/2 = 4 additions.
So even a naïve multiplication algorithm would yield a time complexity of O(n).
Of course, this reasoning is wrong because the time complexity cannot be linear but I don't understand why.
I would appreciate any input. Thank you!
Solution
Here, n = 8 and we are doing n = 8 multiplications and n/2 = 4 additions. So even a naïve multiplication algorithm would yield a time complexity of O(n).
That is wrong. It might work for a small example like $n = 8$, but for bigger ones it doesn't.
E.g. if you have two square matrices with a total of $n = 200$ elements (two $10 \times 10$ matrices) you have already 1,000 multiplications, for $n=20\,000$ you have $1\,000\,000$.
Now to the actual question. It is possible to give the number time complexity in terms of the number of matrix elements, but it is not very practical.
One problem is, that the complexity differs a lot depending on the shape of the matrix:
So it's a lot harder to argue about time complexities when you only know the number of elements.
And it's also easier to talk about the matrices in general.
It feels more natural.
E.g. saying "I've doubled the number of rows of the first matrix" makes sense to everybody (and you immediately know that the algorithm will run about twice as slow, "I've doubled the both dimensions of both matrices" makes sense.
But saying "I've double the number of elements" will confuse anybody, how does the matrix look like afterwards. Did you increase both dimensions by a factor of 1.41? Did the matrix change shape? Or what happen? And what impact does it have on the runtime...
That is wrong. It might work for a small example like $n = 8$, but for bigger ones it doesn't.
E.g. if you have two square matrices with a total of $n = 200$ elements (two $10 \times 10$ matrices) you have already 1,000 multiplications, for $n=20\,000$ you have $1\,000\,000$.
Now to the actual question. It is possible to give the number time complexity in terms of the number of matrix elements, but it is not very practical.
One problem is, that the complexity differs a lot depending on the shape of the matrix:
- A multiplication of a $1 \times N$ matrix with a $N \times 1$ matrix has time complexity $O(N)$ (linear in the number of matrix elements).
- But a multiplication of a $N \times 1$ matrix with a $1 \times N$ matrix takes $O(N^2)$ (quadratic in the number of elements).
So it's a lot harder to argue about time complexities when you only know the number of elements.
And it's also easier to talk about the matrices in general.
It feels more natural.
E.g. saying "I've doubled the number of rows of the first matrix" makes sense to everybody (and you immediately know that the algorithm will run about twice as slow, "I've doubled the both dimensions of both matrices" makes sense.
But saying "I've double the number of elements" will confuse anybody, how does the matrix look like afterwards. Did you increase both dimensions by a factor of 1.41? Did the matrix change shape? Or what happen? And what impact does it have on the runtime...
Context
StackExchange Computer Science Q#156883, answer score: 20
Revisions (0)
No revisions yet.