patternsqlMinor
PostgreSQL Geo Spatial Query is slow
Viewed 0 times
postgresqlqueryslowgeospatial
Problem
After given up with MySQL i have tried Elasticsearch and now wan't to see if I could go with PostgreSQL/PostGIS that could allow me to go with PostgreSQL only.
I need to fetch records from a Table by distance (must not be exactly) and sort by distance. The table has 10 million records.
As I got for my Query on PostgreSQL slower speed then on MySQL I think I must do something wrong.
What I can do better?
Table:
Index:
Query:
Explain:
I'm using PostgreSQL 10 on a modern Server with high IOPS (NVMe) and the Query need 35 sec.
After suggestions of @Evan Carroll much better performance:
```
EXPLAIN ANALYZE SELECT hash_id, town
, ST_Distance(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2) AS dist
FROM member_profile
WHERE ST_DWithin(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2, 250000)
ORDER BY ST_MakePoint(47.4667, 8.3167)::geography geo_pt2
OFFSET 10000
FETCH NEXT 100 ROWS ONL
I need to fetch records from a Table by distance (must not be exactly) and sort by distance. The table has 10 million records.
As I got for my Query on PostgreSQL slower speed then on MySQL I think I must do something wrong.
What I can do better?
Table:
id | hash_id | town | geo_pt2
geo_pt2 is geographyIndex:
CREATE INDEX geo_pt2_gix ON public.member_profile USING gist (geo_pt2)Query:
SELECT hash_id, town
, ST_Distance(t.x, geo_pt2) AS dist
FROM member_profile, (SELECT ST_GeographyFromText('POINT(47.4667 8.3167)')) AS t(x)
WHERE ST_DWithin(t.x, geo_pt2, 250000)
ORDER BY dist
limit 100 offset 1000;Explain:
Limit (cost=9.08..9.08 rows=1 width=53)
-> Sort (cost=9.07..9.08 rows=1 width=53)
Sort Key: (_st_distance('0101000020E610000088855AD3BCBB474052499D8026A22040'::geography, member_profile.geo_pt2, '0'::double precision, true))
-> Index Scan using geo_pt2_gix on member_profile (cost=0.42..9.06 rows=1 width=53)
Index Cond: (geo_pt2 && '0101000020E610000088855AD3BCBB474052499D8026A22040'::geography)
Filter: (('0101000020E610000088855AD3BCBB474052499D8026A22040'::geography && _st_expand(geo_pt2, '250000'::double precision)) AND _st_dwithin('0101000020E610000088855AD3BCBB474052499D8026A22040'::geography, geo_pt2, '250000'::double precision, true))I'm using PostgreSQL 10 on a modern Server with high IOPS (NVMe) and the Query need 35 sec.
After suggestions of @Evan Carroll much better performance:
```
EXPLAIN ANALYZE SELECT hash_id, town
, ST_Distance(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2) AS dist
FROM member_profile
WHERE ST_DWithin(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2, 250000)
ORDER BY ST_MakePoint(47.4667, 8.3167)::geography geo_pt2
OFFSET 10000
FETCH NEXT 100 ROWS ONL
Solution
First, use
(psql). As a first point this,
should be written as
Your problem here is this pattern,
Every time you do that, you must at least calculate the distance to 1100 rows. That said, it shouldn't be slow. It's slow because in order to that like that you're having to calculate the
EXPLAIN ANALYZE (not mere EXPLAIN) and show the result of \d on the table.(psql). As a first point this,
ST_GeographyFromText('POINT(47.4667 8.3167)')should be written as
ST_MakePoint(47.4667, 8.3167)::geographyYour problem here is this pattern,
SELECT ST_Distance( ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2) AS dist
...
ORDER BY dist
LIMIT 100 OFFSET 1000;Every time you do that, you must at least calculate the distance to 1100 rows. That said, it shouldn't be slow. It's slow because in order to that like that you're having to calculate the
ST_Distance on all rows. We can stop that using KNN there using the ` operator. MySQL doesn't support KNN.
SELECT hash_id, town
, ST_Distance(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2) AS dist
FROM member_profile
WHERE ST_DWithin(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2, 250000)
ORDER BY ST_MakePoint(47.4667, 8.3167)::geography geo_pt2
OFFSET 1000
FETCH NEXT 100 ROWS ONLY;
As a style critique, I personally prefer OFFSET/FETCH (the standardized method limit/offset).
Pagination
I'm not sure this will work. but, it may be worth trying (keep us updated).
SELECT hash_id, town
, ST_Distance(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2) AS dist
, ST_MakePoint(47.4667, 8.3167)::geography geo_pt2 AS myknn
FROM member_profile
WHERE ST_DWithin(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2, 250000)
AND ST_MakePoint(47.4667, 8.3167)::geography geo_pt2 > OLD_VALUE
ORDER BY ST_MakePoint(47.4667, 8.3167)::geography geo_pt2
FETCH NEXT 100 ROWS ONLY;
So the first time you run this you save the value of last value of myknn then the second time you run this, you can replay that value back in this clause as OLD_VALUE,
AND ST_MakePoint(47.4667, 8.3167)::geography geo_pt2 > OLD_VALUE
So each time you run it you're saving the new point from which to continue at, and using FETCH NEXT x ROWS ONLY.
myknn and dist` may be the same for you, if so you can just remove one of them.Code Snippets
ST_GeographyFromText('POINT(47.4667 8.3167)')SELECT ST_Distance( ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2) AS dist
...
ORDER BY dist
LIMIT 100 OFFSET 1000;SELECT hash_id, town
, ST_Distance(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2) AS dist
FROM member_profile
WHERE ST_DWithin(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2, 250000)
ORDER BY ST_MakePoint(47.4667, 8.3167)::geography <-> geo_pt2
OFFSET 1000
FETCH NEXT 100 ROWS ONLY;SELECT hash_id, town
, ST_Distance(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2) AS dist
, ST_MakePoint(47.4667, 8.3167)::geography <-> geo_pt2 AS myknn
FROM member_profile
WHERE ST_DWithin(ST_MakePoint(47.4667, 8.3167)::geography, geo_pt2, 250000)
AND ST_MakePoint(47.4667, 8.3167)::geography <-> geo_pt2 > OLD_VALUE
ORDER BY ST_MakePoint(47.4667, 8.3167)::geography <-> geo_pt2
FETCH NEXT 100 ROWS ONLY;AND ST_MakePoint(47.4667, 8.3167)::geography <-> geo_pt2 > OLD_VALUEContext
StackExchange Database Administrators Q#214782, answer score: 5
Revisions (0)
No revisions yet.