HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Postgresql subquery speed much slower than individual queries

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqlmuchindividualthanslowersubqueryqueriesspeed

Problem

In Postgres 9.1. I am attempting to do a single query to get the timezone of a given postal code. I have data loaded in my database, and I am using PostGIS to store the coordinates of each postal code. Here are the two queries that I need to do in order to get the data I need:

SELECT coord FROM public.postal_code WHERE postal_code = 'T1K0T4' LIMIT 1


query time: 6ms

This gives me the coordinate of the center of the postal_code area. Then I use this coordinate to find which timezone it intersects:

SELECT *
FROM public.timezones as tz
WHERE ST_Intersects(ST_GeomFromText('POINT(-112 49)',4326),geom)


query time: 36ms

Now when I combine the 2 queries, the query time jumps to 7-8 seconds. This is my query:

SELECT *
FROM public.timezones as tz
WHERE ST_Intersects((SELECT coord FROM taduler.postal_code
                     WHERE postal_code = 'T1K0T4' LIMIT 1),geom)


I have a spatial index on the coord column in the postal_code table, and also on the geom column in the timezone table, but it seems like it isn't being used for the subquery.

Does anyone know of a better way to optimize this query? I have tried several variations of this query, like joining the tables and such, but they have all resulted in the same query speed.

Output from EXPLAIN ANALYZE:

Seq Scan on timezones tz  (cost=8.37..167.47 rows=136 width=335547) (actual time=4606.136..7274.428 rows=1 loops=1)
  Filter: st_intersects($0, (geom)::geography)
  InitPlan 1 (returns $0)
    ->  Limit  (cost=0.00..8.37 rows=1 width=128) (actual time=0.011..0.011 rows=1 loops=1)
          ->  Index Scan using postal_code_idx on postal_code  (cost=0.00..8.37 rows=1 width=128) (actual time=0.010..0.010 rows=1 loops=1)
                Index Cond: ((postal_code)::text = 'T1K0T4'::text)
Total runtime: 7274.448 ms


EXPLAIN ANALYZE for the following query:

```
SELECT *
FROM public.timezones as tz
JOIN taduler.postal_code as pc on ST_Intersects(pc.coord, tz.geom)
WHERE pc.postal

Solution

You are running in a weakness of the query planner: That index cannot be used in a plain join. Had a similar problem here:

  • Algorithm for finding the longest prefix (Chapter "Failed attempt with text_pattern_ops")



In Postgres 9.3 or later use a LATERAL join instead:

SELECT *
FROM  (
    SELECT coord
    FROM   taduler.postal_code
    WHERE  postal_code = 'T1K0T4'
    LIMIT  1
    ) pc
LEFT JOIN LATERAL (
    SELECT *
    FROM   public.timezones tz
    WHERE  ST_Intersects(pc.coord, tz.geom)
   ) tz ON true;


See:

  • What is the difference between LATERAL JOIN and a subquery in PostgreSQL?



In Postgres 9.1, it might help to encapsulate the first query in a CTE, but I am not sure:

WITH pc AS (
    SELECT coord
    FROM   taduler.postal_code
    WHERE  postal_code = 'T1K0T4'
    LIMIT  1
    )
SELECT *
FROM   pc
JOIN   public.timezones tz ON ST_Intersects(pc.coord, tz.geom);


A PL/pgSQL function encapsulating two separate queries certainly does the trick:

CREATE OR REPLACE FUNCTION f_get_tz(_pc text)
  RETURNS SETOF public.timezones
  LANGUAGE plpgsql AS
$func$
DECLARE
   _coord geom;
BEGIN
   SELECT coord
   INTO  _coord
   FROM   taduler.postal_code
   WHERE  postal_code = _pc
   LIMIT  1;

   RETURN QUERY
   SELECT *
   FROM   public.timezones tz
   WHERE  ST_Intersects(_coord, tz.geom);
END
$func$;


Call:

SELECT * FROM f_get_tz('T1K0T4');

Code Snippets

SELECT *
FROM  (
    SELECT coord
    FROM   taduler.postal_code
    WHERE  postal_code = 'T1K0T4'
    LIMIT  1
    ) pc
LEFT JOIN LATERAL (
    SELECT *
    FROM   public.timezones tz
    WHERE  ST_Intersects(pc.coord, tz.geom)
   ) tz ON true;
WITH pc AS (
    SELECT coord
    FROM   taduler.postal_code
    WHERE  postal_code = 'T1K0T4'
    LIMIT  1
    )
SELECT *
FROM   pc
JOIN   public.timezones tz ON ST_Intersects(pc.coord, tz.geom);
CREATE OR REPLACE FUNCTION f_get_tz(_pc text)
  RETURNS SETOF public.timezones
  LANGUAGE plpgsql AS
$func$
DECLARE
   _coord geom;
BEGIN
   SELECT coord
   INTO  _coord
   FROM   taduler.postal_code
   WHERE  postal_code = _pc
   LIMIT  1;

   RETURN QUERY
   SELECT *
   FROM   public.timezones tz
   WHERE  ST_Intersects(_coord, tz.geom);
END
$func$;
SELECT * FROM f_get_tz('T1K0T4');

Context

StackExchange Database Administrators Q#61646, answer score: 4

Revisions (0)

No revisions yet.