patternsqlMinor
manual execution of specific remote query with postgresql_fdw
Viewed 0 times
withquerymanualspecificremoteexecutionpostgresql_fdw
Problem
In 9.4b2,
It would obviously be much more efficient to send
Is there a way to perform this optimization manually? I would be satisfied with something as low-level as (hypothetically speaking)
although of course a higher-level construct would be preferred.
I'm aware that I could do something like this with
EDIT: Here's the query plan for Erwin Brandstetter's suggestion:
This is better, in that it fetches only `c
postgresql_fdw doesn't know how to "push down" aggregate queries on remote tables, e.g.> explain verbose select max(col1) from remote_tables.table1;
QUERY PLAN
---------------------------------------------------------------------------------------------
Aggregate (cost=605587.30..605587.31 rows=1 width=4)
Output: max(col1)
-> Foreign Scan on remote_tables.table1 (cost=100.00..565653.20 rows=15973640 width=4)
Output: col1, col2, col3
Remote SQL: SELECT col1 FROM public.table1It would obviously be much more efficient to send
SELECT max(col1) FROM public.table1 to the remote server and just pull the one row back.Is there a way to perform this optimization manually? I would be satisfied with something as low-level as (hypothetically speaking)
EXECUTE 'SELECT max(col1) FROM public.table1' ON remote RETURNING (col1 INTEGER);although of course a higher-level construct would be preferred.
I'm aware that I could do something like this with
dblink, but that would involve rewriting a large body of code that already uses foreign tables, so I'd prefer not to.EDIT: Here's the query plan for Erwin Brandstetter's suggestion:
=> explain verbose select col1 from remote_tables.table1
-> order by col1 desc nulls last limit 1;
QUERY PLAN
---------------------------------------------------------------------------------------------------
Limit (cost=645521.40..645521.40 rows=1 width=4)
Output: url
-> Sort (cost=645521.40..685455.50 rows=15973640 width=4)
Output: col1
Sort Key: table1.col1
-> Foreign Scan on remote_tables.table1 (cost=100.00..565653.20 rows=15973640 width=4)
Output: col1
Remote SQL: SELECT col1 FROM public.table1This is better, in that it fetches only `c
Solution
For the time being, it seems that the best available option is to create a view on the remote server that encapsulates the query needing to be "pushed down".
on the remote, and a
the existing pushdown optimization will send the WHERE constraint to the remote, and the remote will scan only one table (making use of indexes if possible) and send back a single-row result.
postgres_fdw is happy to define and use foreign tables backed by views on the remote, and regular old query optimization within the view does the Right Thing. For instance, givenCREATE VIEW id_ranges AS
SELECT 'url_strings'::text AS tbl,
min(url_strings.id)::bigint AS lo,
max(url_strings.id)::bigint AS hi
FROM url_strings
UNION
SELECT 'captured_pages'::text AS tbl,
min(captured_pages.url)::bigint AS lo,
max(captured_pages.url)::bigint AS hi
FROM captured_pages
UNION
-- ... several more like that ...on the remote, and a
FOREIGN TABLE of the same name on the local server,SELECT lo, hi FROM id_ranges WHERE tbl = 'url_strings';the existing pushdown optimization will send the WHERE constraint to the remote, and the remote will scan only one table (making use of indexes if possible) and send back a single-row result.
Code Snippets
CREATE VIEW id_ranges AS
SELECT 'url_strings'::text AS tbl,
min(url_strings.id)::bigint AS lo,
max(url_strings.id)::bigint AS hi
FROM url_strings
UNION
SELECT 'captured_pages'::text AS tbl,
min(captured_pages.url)::bigint AS lo,
max(captured_pages.url)::bigint AS hi
FROM captured_pages
UNION
-- ... several more like that ...SELECT lo, hi FROM id_ranges WHERE tbl = 'url_strings';Context
StackExchange Database Administrators Q#77306, answer score: 2
Revisions (0)
No revisions yet.