patternsqlMinor
Postgres WITH not performed first
Viewed 0 times
performedpostgreswithfirstnot
Problem
I am working in PostgreSQL 9.5 on a SQL query that pulls from two views. Individually, the views take 93 msec and 530 msec to perform, resulting in ~1500 and ~6200 rows, respectively. This seems reasonable given the processing going into the view, which includes some spatial intersections with PostGIS. However, when I perform the query below, it takes ~5 seconds to perform.
From what I understand, using
If I link to the two views through ODBC/MS Access and use MS Access to perform the query without
No other CTEs in the views, just subqueries.
Is there a better way to do this, or a hack to force the order of execution?
Execution plan of full query:
`"Hash Join (cost=963.49..1007.02 rows=9 width=120) (actual time=40.508..54.209 rows=45392 loops=1)"
" Hash Cond: (sr.sample_pk = eu.sample_pk)"
" CTE eu"
" -> Nested Loop (cost=41.81..50.67 rows=1 width=24) (actual time=1.296..8.307 rows=1538 loops=1)"
" Join Filter: (lil.locations_pk = l.locations_pk)"
" -> Nested Loop (cost=41.67..48.22 rows=1 width=28) (actual time=1.289..6.582 rows=1538 loops=1)"
" -> Nested Loop (cost=41.53..47.92 rows=1 width=32) (actual time=1.283..4.852 rows=1538 loops=1)"
" -> Hash Join (cost=41.38..47.66 rows=1 width=28) (actual time=1.275..3.079 rows=1480 loops=1)"
From what I understand, using
WITH should force the subqueries to be executed once each prior to evaluation in the query below, which should result in a total time under one second rather than five seconds.If I link to the two views through ODBC/MS Access and use MS Access to perform the query without
WITH, it executes in less than a second. This makes me think that the subqueries are not being performed once each prior to joining when I execute them in pgAdmin.No other CTEs in the views, just subqueries.
Is there a better way to do this, or a hack to force the order of execution?
WITH
eu AS (
SELECT
e.sample_pk,
.....
FROM
public.samples_exp_unit AS e
),
sr AS (
SELECT
r.sample_pk,
.....
FROM
public.sample_results AS r
)
SELECT
sr.sample_pk,
eu.inventory_pk,
.....
FROM sr INNER JOIN eu
ON sr.sample_pk = eu.sample_pkExecution plan of full query:
`"Hash Join (cost=963.49..1007.02 rows=9 width=120) (actual time=40.508..54.209 rows=45392 loops=1)"
" Hash Cond: (sr.sample_pk = eu.sample_pk)"
" CTE eu"
" -> Nested Loop (cost=41.81..50.67 rows=1 width=24) (actual time=1.296..8.307 rows=1538 loops=1)"
" Join Filter: (lil.locations_pk = l.locations_pk)"
" -> Nested Loop (cost=41.67..48.22 rows=1 width=28) (actual time=1.289..6.582 rows=1538 loops=1)"
" -> Nested Loop (cost=41.53..47.92 rows=1 width=32) (actual time=1.283..4.852 rows=1538 loops=1)"
" -> Hash Join (cost=41.38..47.66 rows=1 width=28) (actual time=1.275..3.079 rows=1480 loops=1)"
Solution
Since
You can rewrite to be simpler:
and test if this changes the plan and improves efficiency.
Another thing to test is to avoid CTEs. Especially because they are an optimization fence, they sometimes result in less efficient plans (and sometimes in more efficient plans!) Rewrite and test without a CTE:
Also try the queries through psql. It will show if PgAdmin is the issue.
It could be that pgAdmin is taking a long time to render the results of the query, or is talking over a slow network and so takes a long time to fetch the results. It is also possible pgAdmin is doing something that changes the execution plan, like silently setting some parameters to non-default values, or preparing the query into a cursor before execution.
public.samples_exp_unit and public.sample_results are the views, you have added a level of indirection in your queries.You can rewrite to be simpler:
WITH
eu AS (TABLE public.samples_exp_unit),
sr AS (TABLE public.sample_results)
SELECT
sr.sample_pk,
eu.inventory_pk,
.....
FROM sr INNER JOIN eu
ON sr.sample_pk = eu.sample_pk ;and test if this changes the plan and improves efficiency.
Another thing to test is to avoid CTEs. Especially because they are an optimization fence, they sometimes result in less efficient plans (and sometimes in more efficient plans!) Rewrite and test without a CTE:
SELECT
sr.sample_pk,
eu.inventory_pk,
.....
FROM public.sample_results AS sr
INNER JOIN
public.samples_exp_unit AS eu
ON sr.sample_pk = eu.sample_pk ;Also try the queries through psql. It will show if PgAdmin is the issue.
It could be that pgAdmin is taking a long time to render the results of the query, or is talking over a slow network and so takes a long time to fetch the results. It is also possible pgAdmin is doing something that changes the execution plan, like silently setting some parameters to non-default values, or preparing the query into a cursor before execution.
Code Snippets
WITH
eu AS (TABLE public.samples_exp_unit),
sr AS (TABLE public.sample_results)
SELECT
sr.sample_pk,
eu.inventory_pk,
.....
FROM sr INNER JOIN eu
ON sr.sample_pk = eu.sample_pk ;SELECT
sr.sample_pk,
eu.inventory_pk,
.....
FROM public.sample_results AS sr
INNER JOIN
public.samples_exp_unit AS eu
ON sr.sample_pk = eu.sample_pk ;Context
StackExchange Database Administrators Q#174472, answer score: 7
Revisions (0)
No revisions yet.