patternsqlMinor
Quick search the most similar objects in the n-dimensional space
Viewed 0 times
thespacesearchobjectsquickdimensionalsimilarmost
Problem
Lets assume that we have a points in the n-dimensional space. So we have a n coords(n columns) which can describe location of the each point.
We need to implement a table which can be used for a quick searching the most similar points, i.e. points which have the smallest distance to the desired point.
E.g. points in the db:
If we want to find the best matching for point with coords:
We will get points with id 1 and 2.
We also have mean coordinate for each dimension(column) and vector for each point in which first element - number of the dimension in which point has the largest difference from the mean coordinate in this dimension, and last - number of the dimension in which point has the smallest difference. Maybe it can be used for the more rapid filtering points which have the biggest distance to the desired point.
So how can I do something like this using MySQL?
I think the composite index and
Any help will be very useful!
We need to implement a table which can be used for a quick searching the most similar points, i.e. points which have the smallest distance to the desired point.
E.g. points in the db:
id c1 c2 c3 c4 c5
1 5 19 42 12 16
2 3 23 38 15 12
3 14 21 32 33 1
4 12 29 21 24 5If we want to find the best matching for point with coords:
c1 c2 c3 c4 c5
4 20 40 14 15We will get points with id 1 and 2.
We also have mean coordinate for each dimension(column) and vector for each point in which first element - number of the dimension in which point has the largest difference from the mean coordinate in this dimension, and last - number of the dimension in which point has the smallest difference. Maybe it can be used for the more rapid filtering points which have the biggest distance to the desired point.
So how can I do something like this using MySQL?
I think the composite index and
order by abs(cx - $mycx) can be a good solution, but I can't use it because I will have more then 16 columns which I need to include in the one index.Any help will be very useful!
Solution
PostgreSQL using
Using PostgreSQL this becomes pretty easy with the
Finding distance...
Outputs.
cubeUsing PostgreSQL this becomes pretty easy with the
cube extension which I explained on your other question.. Sample data.CREATE TABLE foo AS
SELECT *
FROM ( VALUES
(1, cube(ARRAY[5 , 19, 42, 12, 16]) ),
(2, cube(ARRAY[3 , 23, 38, 15, 12]) ),
(3, cube(ARRAY[14, 21, 32, 33, 1 ]) ),
(4, cube(ARRAY[12, 29, 21, 24, 5 ]) )
) AS t(id, c);Finding distance...
SELECT
id,
c,
c2,
cube_distance(c,c2),
rank() OVER (ORDER BY cube_distance(c,c2))
FROM foo
CROSS JOIN ( SELECT cube(ARRAY[4,20,40,14,15]) )
AS t(c2)
ORDER BY cube_distance(c, c2);Outputs.
id | c | c2 | cube_distance | rank
----+---------------------+---------------------+------------------+------
1 | (5, 19, 42, 12, 16) | (4, 20, 40, 14, 15) | 3.3166247903554 | 1
2 | (3, 23, 38, 15, 12) | (4, 20, 40, 14, 15) | 4.89897948556636 | 2
4 | (12, 29, 21, 24, 5) | (4, 20, 40, 14, 15) | 26.5706605111728 | 3
3 | (14, 21, 32, 33, 1) | (4, 20, 40, 14, 15) | 26.8700576850888 | 4Code Snippets
CREATE TABLE foo AS
SELECT *
FROM ( VALUES
(1, cube(ARRAY[5 , 19, 42, 12, 16]) ),
(2, cube(ARRAY[3 , 23, 38, 15, 12]) ),
(3, cube(ARRAY[14, 21, 32, 33, 1 ]) ),
(4, cube(ARRAY[12, 29, 21, 24, 5 ]) )
) AS t(id, c);SELECT
id,
c,
c2,
cube_distance(c,c2),
rank() OVER (ORDER BY cube_distance(c,c2))
FROM foo
CROSS JOIN ( SELECT cube(ARRAY[4,20,40,14,15]) )
AS t(c2)
ORDER BY cube_distance(c, c2);id | c | c2 | cube_distance | rank
----+---------------------+---------------------+------------------+------
1 | (5, 19, 42, 12, 16) | (4, 20, 40, 14, 15) | 3.3166247903554 | 1
2 | (3, 23, 38, 15, 12) | (4, 20, 40, 14, 15) | 4.89897948556636 | 2
4 | (12, 29, 21, 24, 5) | (4, 20, 40, 14, 15) | 26.5706605111728 | 3
3 | (14, 21, 32, 33, 1) | (4, 20, 40, 14, 15) | 26.8700576850888 | 4Context
StackExchange Database Administrators Q#158653, answer score: 3
Revisions (0)
No revisions yet.