patternsqlMajor
Does SQL Server Only Perform Calculations In A SELECT List Once?
Viewed 0 times
oncesqlcalculationsperformdoesserverselectlistonly
Problem
Take the following example:
Would CalculationA and CalculationB, each be calculated twice?
Or would the optimizer be clever enough to calculate them once and use the result twice?
I would like to perform a test to see the result for myself, however, I am not sure how I could check something like this.
My assumption is that it would perform the calculation twice.
In which case, depending upon the calculations involved, might it be better to use a derived table, or nested view? Consider the following:
In this case, I would hope that the calculations would only be performed once?
Please can someone confirm or refute my assumptions?
Or instruct me on how to test something like this for myself?
Thanks.
SELECT As ColA,
As ColB,
+ As ColC
FROM TableAWould CalculationA and CalculationB, each be calculated twice?
Or would the optimizer be clever enough to calculate them once and use the result twice?
I would like to perform a test to see the result for myself, however, I am not sure how I could check something like this.
My assumption is that it would perform the calculation twice.
In which case, depending upon the calculations involved, might it be better to use a derived table, or nested view? Consider the following:
SELECT TableB.ColA,
TableB.ColB,
TableB.ColA + TableB.ColB AS ColC,
FROM(
SELECT As ColA,
As ColB
FROM TableA
) As TableBIn this case, I would hope that the calculations would only be performed once?
Please can someone confirm or refute my assumptions?
Or instruct me on how to test something like this for myself?
Thanks.
Solution
Most of the information you will need is going to be in the execution plan ( and the plan XML).
Take this query:
The execution plan (opened with sentryone plan explorer) shows what steps it went through:
With the stream aggregate aggregating the values for EXPR1005 & EXPR1006
If we want to know what these are, we could get the exact info on these expressions from the query plan XML:
With the first compute scalar computing
And the last compute scalar being a simple addition:
This is reading it as the data flows, in theory you should be reading it from left to right if going over the logical execution.
In that case,
Would CalculationA and CalculationB, each be calculated twice? Or
would the optimizer be clever enough to calculate them once and use
the result twice?
Previous tests show that in this case
As a result,
Grouping by 200 distinct values
If we are grouping by 200 distinct values (val3) the same is shown:
Aggregating down to these 200 distinct values in
performing the sums on val & val2 and then adding them together for ColC:
Even if we are grouping on all but one non-unique value, the same addition should be seen for the compute scalar.
Adding a function to ColA & ColB
Even if we change the query to this:
The aggregations are still not going to be calculated twice, we are simply adding the
Ofcourse, running
If you want to go deeper in when this happens I would nudge you towards Query Optimizer Deep Dive - Part 1 (until Part 4) by Paul White.
Anther way to dive deeper into query execution phases is by adding these hints:
This will expose the input tree as created by the optimizer.
The adding of the two previous calculated values to get
This information is already present in the Input Tree, even before the simplification phase has taken place, showing that the optimizer immediately knows that it does not have to perform the same calculation twice.
Take this query:
SELECT COUNT(val) As ColA,
COUNT(val2) As ColB,
COUNT(val) + COUNT(val2) As ColC
FROM dbo.TableA;The execution plan (opened with sentryone plan explorer) shows what steps it went through:
With the stream aggregate aggregating the values for EXPR1005 & EXPR1006
If we want to know what these are, we could get the exact info on these expressions from the query plan XML:
With the first compute scalar computing
ColA & ColB:And the last compute scalar being a simple addition:
This is reading it as the data flows, in theory you should be reading it from left to right if going over the logical execution.
In that case,
EXPR1004 is calling the other expressions, EXPR1002 & EXPR1003. In turn these are calling EXPR1005 & EXPR1006.Would CalculationA and CalculationB, each be calculated twice? Or
would the optimizer be clever enough to calculate them once and use
the result twice?
Previous tests show that in this case
ColC is simplified as an addition of the calculations that are defined as ColA & ColB. As a result,
ColA & ColB are only calculated once.Grouping by 200 distinct values
If we are grouping by 200 distinct values (val3) the same is shown:
SET STATISTICS IO, TIME ON;
SELECT SUM(val) As ColA,
SUM(val2) As ColB,
SUM(val) + SUM(val2) As ColC
FROM dbo.TableA
GROUP BY val3;Aggregating down to these 200 distinct values in
val3performing the sums on val & val2 and then adding them together for ColC:
Even if we are grouping on all but one non-unique value, the same addition should be seen for the compute scalar.
Adding a function to ColA & ColB
Even if we change the query to this:
SET STATISTICS IO, TIME ON;
SELECT ABS(SUM(val)) As ColA,
ABS(SUM(val2)) As ColB,
SUM(val) + SUM(val2) As ColC
FROM dbo.TableAThe aggregations are still not going to be calculated twice, we are simply adding the
ABS() function to the resultset of the aggregation, wich is one row:Ofcourse, running
SUM(ABS(ColA) & SUM(ABS(ColB)) will make the optimizer unable to use the same expression for calculating ColC.If you want to go deeper in when this happens I would nudge you towards Query Optimizer Deep Dive - Part 1 (until Part 4) by Paul White.
Anther way to dive deeper into query execution phases is by adding these hints:
OPTION
(
RECOMPILE,
QUERYTRACEON 3604,
QUERYTRACEON 8605
);This will expose the input tree as created by the optimizer.
The adding of the two previous calculated values to get
ColC is then translated to:AncOp_PrjEl COL: Expr1004
ScaOp_Arithmetic x_aopAdd
ScaOp_Identifier COL: Expr1002
ScaOp_Identifier COL: Expr1003This information is already present in the Input Tree, even before the simplification phase has taken place, showing that the optimizer immediately knows that it does not have to perform the same calculation twice.
Code Snippets
SELECT COUNT(val) As ColA,
COUNT(val2) As ColB,
COUNT(val) + COUNT(val2) As ColC
FROM dbo.TableA;<ColumnReference Column="Expr1005" />
<ScalarOperator ScalarString="COUNT([Database].[dbo].[TableA].[val])">
<Aggregate AggType="COUNT_BIG" Distinct="false">SET STATISTICS IO, TIME ON;
SELECT SUM(val) As ColA,
SUM(val2) As ColB,
SUM(val) + SUM(val2) As ColC
FROM dbo.TableA
GROUP BY val3;SET STATISTICS IO, TIME ON;
SELECT ABS(SUM(val)) As ColA,
ABS(SUM(val2)) As ColB,
SUM(val) + SUM(val2) As ColC
FROM dbo.TableAOPTION
(
RECOMPILE,
QUERYTRACEON 3604,
QUERYTRACEON 8605
);Context
StackExchange Database Administrators Q#252661, answer score: 41
Revisions (0)
No revisions yet.