patternsqlMinor
Will Postgres perform the computation of calculated columns that aren't selected in a view?
Viewed 0 times
thecomputationpostgrescolumnsviewarenwillperformthatselected
Problem
I'm trying to understand the performance impact of selecting data from a view, where one of the columns in a view is function of other data in the original table.
Does the computation get performed irrespective of whether or not the computed column is in the list of selected columns?
If I had a table and the view declared like so
Would that multiplication be performed in a query like the one below?
Is there a reference that guarantees this one way or the other? I was reading the documentation on the rule system in Postgres, but I think the answer really lies with the optimiser, since nothing in the rule system documentation indicated that it wouldn't be selected.
I suspect in the above case the computation isn't performed. I changed the view to use division instead of multiplication, and inserted a
Is there any way to understand what computations are being done when a
Does the computation get performed irrespective of whether or not the computed column is in the list of selected columns?
If I had a table and the view declared like so
CREATE TABLE price_data (
ticker text, -- Ticker of the stock
ddate date, -- Date for this price
price float8, -- Closing price on this date
factor float8 -- Factor to convert this price to USD
);
CREATE VIEW prices AS
SELECT ticker,
ddate,
price,
factor,
price * factor as price_usd
FROM price_data
Would that multiplication be performed in a query like the one below?
select ticker, ddate, price, factor from pricesIs there a reference that guarantees this one way or the other? I was reading the documentation on the rule system in Postgres, but I think the answer really lies with the optimiser, since nothing in the rule system documentation indicated that it wouldn't be selected.
I suspect in the above case the computation isn't performed. I changed the view to use division instead of multiplication, and inserted a
0 for factor into price_data. The query above didn't fail, but if the query was modified to select the computed column the modified query failed.Is there any way to understand what computations are being done when a
select is carried out? I guess I'm looking for something like EXPLAIN but which also tells me about the computation(s) that are being performed.Solution
As @Laurenz said, your analysis is correct: the optimiser will avoid evaluating column expressions which do not affect the outcome of the query (and your attempt to force a divide-by-zero error is proof of this).
This depends on which columns you're selecting, but it also depends on the volatility category of the column expressions. The optimiser is free to omit
For example:
If only the
...then as you can see,
However, if the
...then we see both column expressions appearing in the plan, and the
There doesn't seem to be any explicit mention of this behaviour in the docs, so there are no hard guarantees as to whether or not an expression will be evaluated, and you should not rely on any side-effects your function calls might have.
But if your only concern is performance, then as long as you mark your functions as
(And while you're there auditing your volatility declarations, you might want to set the parallel safety flags as well.)
This depends on which columns you're selecting, but it also depends on the volatility category of the column expressions. The optimiser is free to omit
immutable and stable function calls if their output is never used, as they cannot affect the outcome, but volatile functions might have side-effects, so they are not so readily optimised away.For example:
create function stable_function() returns int as $
begin
raise notice 'stable_function() called';
return 1;
end
$
language plpgsql stable;
create function volatile_function() returns int as $
begin
raise notice 'volatile_function() called';
return 1;
end
$
language plpgsql volatile;
create view v as
select stable_function(), volatile_function();If only the
volatile column is selected:test=# explain (analyse, verbose) select volatile_function from v;
NOTICE: volatile_function() called
QUERY PLAN
------------------------------------------------------------------------------------------------
Subquery Scan on v (cost=0.00..0.27 rows=1 width=4) (actual time=0.057..0.057 rows=1 loops=1)
Output: v.volatile_function
-> Result (cost=0.00..0.26 rows=1 width=8) (actual time=0.056..0.056 rows=1 loops=1)
Output: NULL::integer, volatile_function()...then as you can see,
stable_function() is absent from the explain output, and the lack of a NOTICE confirms that this call has been optimised away.However, if the
stable column is selected instead:test=# explain (analyse, verbose) select stable_function from v;
NOTICE: stable_function() called
NOTICE: volatile_function() called
QUERY PLAN
------------------------------------------------------------------------------------------------
Subquery Scan on v (cost=0.00..0.52 rows=1 width=4) (actual time=0.139..0.139 rows=1 loops=1)
Output: v.stable_function
-> Result (cost=0.00..0.51 rows=1 width=8) (actual time=0.138..0.138 rows=1 loops=1)
Output: stable_function(), volatile_function()...then we see both column expressions appearing in the plan, and the
NOTICEs show that both functions have been executed.There doesn't seem to be any explicit mention of this behaviour in the docs, so there are no hard guarantees as to whether or not an expression will be evaluated, and you should not rely on any side-effects your function calls might have.
But if your only concern is performance, then as long as you mark your functions as
stable or immutable where appropriate, you can be reasonably sure (especially in simple cases like this) that they will not be evaluated unless they're needed.(And while you're there auditing your volatility declarations, you might want to set the parallel safety flags as well.)
Code Snippets
create function stable_function() returns int as $$
begin
raise notice 'stable_function() called';
return 1;
end
$$
language plpgsql stable;
create function volatile_function() returns int as $$
begin
raise notice 'volatile_function() called';
return 1;
end
$$
language plpgsql volatile;
create view v as
select stable_function(), volatile_function();test=# explain (analyse, verbose) select volatile_function from v;
NOTICE: volatile_function() called
QUERY PLAN
------------------------------------------------------------------------------------------------
Subquery Scan on v (cost=0.00..0.27 rows=1 width=4) (actual time=0.057..0.057 rows=1 loops=1)
Output: v.volatile_function
-> Result (cost=0.00..0.26 rows=1 width=8) (actual time=0.056..0.056 rows=1 loops=1)
Output: NULL::integer, volatile_function()test=# explain (analyse, verbose) select stable_function from v;
NOTICE: stable_function() called
NOTICE: volatile_function() called
QUERY PLAN
------------------------------------------------------------------------------------------------
Subquery Scan on v (cost=0.00..0.52 rows=1 width=4) (actual time=0.139..0.139 rows=1 loops=1)
Output: v.stable_function
-> Result (cost=0.00..0.51 rows=1 width=8) (actual time=0.138..0.138 rows=1 loops=1)
Output: stable_function(), volatile_function()Context
StackExchange Database Administrators Q#240503, answer score: 6
Revisions (0)
No revisions yet.