HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

How does Yelp efficiently calculate distance in the database?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
efficientlytheyelpdistancedatabasecalculatedoeshow

Problem

For example, say I have a table:

Business(BusinessID, Lattitude, Longitude)


All are indexed of course. Also there are 1 million records

Say I want to find businesses closest to 106,5, for example, how would I do so?

If I do

SELECT *
FROM Business
WHERE (Some formula to compute distance here) < 2000


for example, or if I do

SELECT *
FROM Business
TOP 20


In theory the computer will have to compute distance for all biz while in practice only those with lattitude and longitude within a certain range that should be computed.

So how can I do what I want in PhP, or SQL, for example?

I am grateful with the answer so far. I am using mysql and they do not have anything more efficient than the obvious solution. MySQL spatial do not have compute distance function either.

Solution

If I understand the question correctly (and I'm not sure I do), you are worried about computing "(Some formula to compute distance here)" for every row in the table each time you do a query?

This can be mitigated to a degree by using the indexes on latitude and longitude so we only have to compute the distance for a 'box' of points containing the circle we actually want:

select * from business
where (latitude>96 and latitude-5 and longitude<15) and 
      (Some formula to compute distance here) < 2000


Where 96, 116 etc are chosen to match the unit of the value '2000' and the point on the globe you are calculating distances from.

How precisely this uses indexes will depend on your RDBMS and the choices its planner makes.

In general terms, this is a primitive way of optimising a kind of nearest neighbour search. If your RDBMS supports GiST indexes, like postgres then you should consider using them instead.

Code Snippets

select * from business
where (latitude>96 and latitude<116) and 
      (longitude>-5 and longitude<15) and 
      (Some formula to compute distance here) < 2000

Context

StackExchange Database Administrators Q#4210, answer score: 8

Revisions (0)

No revisions yet.