patternMinor
Why Is KD-Tree-based Nearest Neighbor Exponential in K?
Viewed 0 times
neighborwhynearestexponentialbasedtree
Problem
I've read in many papers on higher-dimensional nearest neighbor search that KD-Trees are exponential in K, but I can't seem to determine why.
What I'm looking for is a solid runtime-complexity analysis which explains this aspect of the problem.
What I'm looking for is a solid runtime-complexity analysis which explains this aspect of the problem.
Solution
kNN tends to be exponential because the search space increases with $2^k$. Imagine you partition the space around your search point into quadrants. For k=1 you just have to search two 'quadrants' (higher and lower values), for k=2 it's 4 quadrants, for k=3 it's 8 quadrants, i.e. exponential growth of search space. That is what the kD-tree suffers from, because it has to search $2^k$ sub-branches.
Other trees perform much better, for example the CoverTree. I also found that the PH-Tree works quite well, it seems to consistently take twice as long as the CoverTree for datasets between k=8 and k=27 (I didn't have datasets with higher k).
Other trees perform much better, for example the CoverTree. I also found that the PH-Tree works quite well, it seems to consistently take twice as long as the CoverTree for datasets between k=8 and k=27 (I didn't have datasets with higher k).
Context
StackExchange Computer Science Q#53406, answer score: 2
Revisions (0)
No revisions yet.