HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

PostgreSQL Geo Spatial Query is slow

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqlqueryslowgeospatial

Problem

After given up with MySQL i have tried Elasticsearch and now wan't to see if I could go with PostgreSQL/PostGIS that could allow me to go with PostgreSQL only.

I need to fetch records from a Table by distance (must not be exactly) and sort by distance. The table has 10 million records.

As I got for my Query on PostgreSQL slower speed then on MySQL I think I must do something wrong.

What I can do better?

Table:

id | hash_id | town | geo_pt2 

geo_pt2 is geography


Index:

CREATE INDEX geo_pt2_gix ON public.member_profile USING gist (geo_pt2)


Query:

SELECT hash_id, town
     , ST_Distance(t.x, geo_pt2) AS dist
FROM   member_profile, (SELECT ST_GeographyFromText('POINT(47.4667 8.3167)')) AS t(x)
WHERE  ST_DWithin(t.x, geo_pt2, 250000)
ORDER  BY dist
limit 100 offset 1000;


Explain:

Limit  (cost=9.08..9.08 rows=1 width=53)
  ->  Sort  (cost=9.07..9.08 rows=1 width=53)
        Sort Key: (_st_distance('0101000020E610000088855AD3BCBB474052499D8026A22040'::geography, member_profile.geo_pt2, '0'::double precision, true))
        ->  Index Scan using geo_pt2_gix on member_profile  (cost=0.42..9.06 rows=1 width=53)
              Index Cond: (geo_pt2 && '0101000020E610000088855AD3BCBB474052499D8026A22040'::geography)
              Filter: (('0101000020E610000088855AD3BCBB474052499D8026A22040'::geography && _st_expand(geo_pt2, '250000'::double precision)) AND _st_dwithin('0101000020E610000088855AD3BCBB474052499D8026A22040'::geography, geo_pt2, '250000'::double precision, true))


I'm using PostgreSQL 10 on a modern Server with high IOPS (NVMe) and the Query need 35 sec.

After suggestions of @Evan Carroll much better performance:

```
EXPLAIN ANALYZE SELECT hash_id, town
, ST_Distance(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2) AS dist
FROM member_profile
WHERE ST_DWithin(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2, 250000)
ORDER BY ST_MakePoint(47.4667, 8.3167)::geography geo_pt2
OFFSET 10000
FETCH NEXT 100 ROWS ONL

Solution

First, use EXPLAIN ANALYZE (not mere EXPLAIN) and show the result of \d on the table.
(psql). As a first point this,

ST_GeographyFromText('POINT(47.4667 8.3167)')


should be written as ST_MakePoint(47.4667, 8.3167)::geography

Your problem here is this pattern,

SELECT ST_Distance( ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2) AS dist
...
ORDER  BY dist
LIMIT 100 OFFSET 1000;


Every time you do that, you must at least calculate the distance to 1100 rows. That said, it shouldn't be slow. It's slow because in order to that like that you're having to calculate the ST_Distance on all rows. We can stop that using KNN there using the ` operator. MySQL doesn't support KNN.

SELECT hash_id, town
     , ST_Distance(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2) AS dist
FROM   member_profile
WHERE  ST_DWithin(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2, 250000)
ORDER  BY ST_MakePoint(47.4667, 8.3167)::geography  geo_pt2
OFFSET 1000
FETCH NEXT 100 ROWS ONLY;


As a style critique, I personally prefer OFFSET/FETCH (the standardized method limit/offset).

Pagination

I'm not sure this will work. but, it may be worth trying (keep us updated).

SELECT hash_id, town
     , ST_Distance(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2) AS dist
     , ST_MakePoint(47.4667, 8.3167)::geography  geo_pt2 AS myknn
FROM member_profile
WHERE ST_DWithin(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2, 250000)
  AND ST_MakePoint(47.4667, 8.3167)::geography  geo_pt2 > OLD_VALUE
ORDER BY ST_MakePoint(47.4667, 8.3167)::geography  geo_pt2
FETCH NEXT 100 ROWS ONLY;


So the first time you run this you save the value of last value of
myknn then the second time you run this, you can replay that value back in this clause as OLD_VALUE,

AND ST_MakePoint(47.4667, 8.3167)::geography  geo_pt2 > OLD_VALUE


So each time you run it you're saving the new point from which to continue at, and using
FETCH NEXT x ROWS ONLY.

myknn and dist` may be the same for you, if so you can just remove one of them.

Code Snippets

ST_GeographyFromText('POINT(47.4667 8.3167)')
SELECT ST_Distance( ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2) AS dist
...
ORDER  BY dist
LIMIT 100 OFFSET 1000;
SELECT hash_id, town
     , ST_Distance(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2) AS dist
FROM   member_profile
WHERE  ST_DWithin(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2, 250000)
ORDER  BY ST_MakePoint(47.4667, 8.3167)::geography <-> geo_pt2
OFFSET 1000
FETCH NEXT 100 ROWS ONLY;
SELECT hash_id, town
     , ST_Distance(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2) AS dist
     , ST_MakePoint(47.4667, 8.3167)::geography <-> geo_pt2 AS myknn
FROM member_profile
WHERE ST_DWithin(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2, 250000)
  AND ST_MakePoint(47.4667, 8.3167)::geography <-> geo_pt2 > OLD_VALUE
ORDER BY ST_MakePoint(47.4667, 8.3167)::geography <-> geo_pt2
FETCH NEXT 100 ROWS ONLY;
AND ST_MakePoint(47.4667, 8.3167)::geography <-> geo_pt2 > OLD_VALUE

Context

StackExchange Database Administrators Q#214782, answer score: 5

Revisions (0)

No revisions yet.