HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

How approximate are "approximate" nearest neighbor (ANN) search algorithms?

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
neighborapproximatesearcharenearestannalgorithmshow

Problem

Starting to use nanoflann to do some point cloud nearest neighbor searching and it got me thinking about just how "approximate" ANN methods are.

If I have a (more or less) randomly distributed point cloud what is the likelihood that I get the exact nearest neighbor given a target point within the clouds bounding box? I know that it is dataset dependent... but does anyone have a good numerical study somewhere that shows trends?

Solution

nanoflann is a library for building KD-trees. For the best bin first (BBF) approximate nearest neighbor algorithm (which is based on KD-trees), the authors report the following performance results:


In 12 dimensions, for example, BBF recovers the closest neighbor 94%
of the time (versus 59% for restricted search) while on average
examining only 200 of the 100,000 leaf nodes. For this same case, the
“exact” search has to examine over 2400 leaves. [Even] when the method
does not discover the exact nearest neighbor, it does not fail by
much. Even up to dimension 20, the average distance of recovered
neighbors is only 2% greater than the true NN distances.

However, that paper does not contain a formal analysis of the error bounds in this particular $\varepsilon$-approximation.

Context

StackExchange Computer Science Q#12310, answer score: 2

Revisions (0)

No revisions yet.