HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

What's the complexity of Spearman's rank correlation coefficient computation?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
therankwhatcomputationcoefficientspearmancomplexitycorrelation

Problem

I've been studying the Spearman's rank correlation coefficient

$\qquad \displaystyle \rho = \frac{\sum_i(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_i (x_i-\bar{x})^2 \sum_i(y_i-\bar{y})^2}}$.

for two lists $x_1, \dots, x_n$ and $y_1, \dots, y_n$. What's the complexity of the algorithm?

Since the algorithm should just compute $n$ subtractions, is it possible to be $O(n)$ ?

Solution

You have to compute

  • two averages,



  • $2n$ differences,



  • three sums with $n$ summands -- which can be computed in constant time -- each and



  • one division, one multiplication and one square root.



All of this can be done in linear time if we assume elementary arithmetic operations run in constant time, therefore total time in $O(n)$ is certainly possible. Note that computing the root might screw things up.

Regarding space, you have several options:

  • Store only the averages, that is two numbers ($O(\log M)$ with $M$ the maximum number). You have to recompute all differences, that is perform a total of $6n$ subtractions.



  • Store the average and the differences, that is $2n+2$ numbers ($O(n \cdot \log M)$). This saves you $4n$ subtractions.



Which is preferable depends on your context.

Context

StackExchange Computer Science Q#2604, answer score: 9

Revisions (0)

No revisions yet.