patternMinor

High Dimensional Data Structures

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

Viewed 0 times

big-data cs stackoverflow data-structures

dimensionalhighstructuresdata

Problem

I have a 20-dimensional dataset, with a large amount of data points. I would like to have each dimension discretized into bins. Per bin, I would like to be able to access two neighbours per dimension (i.e. +1 and -1 per dimension). Basically I want to be able to easily access 2*d (where d is the dimensionality) neighbours. For lower dimensions, this would be quite easy to do using a multidimensional array (i.e. for point data[0][1][2] I would access its neighbours data[0][1][3] and data[0][1][1] for the third dimension). However, when this approach is scaled up to higher dimensions, memory becomes an issue.

What kind of data structures would be suitable to use, where the most important criterium is the quick and easy access to its +1 and -1 neighbours?

Solution

I would suggest you to use a hierarchical K-means, or a hierarchical vocabulary tree approach. For example:

http://gecco.org.chemie.uni-frankfurt.de/hkmeans/H-k-means.pdf
http://www.vlfeat.org/overview/hikm.html

CVPR Paper with application:
http://www.vis.uky.edu/~stewe/publications/nister_stewenius_cvpr2006.pdf

FLANN also has some kind of hierarchical capabilities.

Context

StackExchange Computer Science Q#26435, answer score: 2

Revisions (0)

No revisions yet.