HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

SQL Calculated Field in both SELECT and GROUP BY clause

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
fieldgroupsqlbothandselectclausecalculated

Problem

Often in querying my MS SQL Server databases, I need to create a calculated field, such as this

(CASE WHEN A.type = 'Workover' THEN 'Workover' 
      ELSE (CASE WHEN substring(C.category, 2, 1) = 'D' THEN 'Drilling' 
                 WHEN substring(C.category, 2, 1) = 'C' THEN 'Completion' 
                 WHEN substring(C.category, 2, 1) = 'W' THEN 'Workover' 
                 ELSE 'Other' 
            END)
END)


and then I need to group my results by this calculated field (among others). Hence, I have the same calculation in both the SELECT and GROUP BY clauses. Is SQL server actually performing these calculations twice, or is it smart enough to only do it once?

Solution

I have the same calculation in both the SELECT and GROUP BY clauses. Is SQL server actually performing these calculations twice, or is it smart enough to only do it once?

The simple answer is that SQL Server makes no general guarantees about when, and how many times, a scalar expression will be evaluated at execution time.

There are all sorts of complicated (and undocumented) behaviours within the optimizer and execution engine regarding the placement, execution, and caching of scalar expressions. Books Online doesn't have much to say about this, but what it does say is this:

This describes one of the behaviours I alluded to before, deferred execution of expressions. I wrote about some of the other current behaviours (which could change any time) in this blog post.

Another consideration is that the cost model used by the query optimizer doesn't currently do much in the way of cost estimation for scalar expressions. Without a robust costing framework, current results are based on broad heuristics or pure chance.

For very simple expressions, it probably doesn't make much difference whether the expression is evaluated once or many times in most cases. That said, I have encountered large queries where performance has been adversely effected when the expression is redundantly evaluated a very large number of times, or the evaluation occurs on a single thread where it would have been advantageous to evaluate in a parallel branch of the execution plan.

In summary, the current behaviour is undefined, and there is nothing much in execution plans to help you figure out what happened (and it will not always be convenient to attach a debugger to examine the detailed engine behaviours, as in the blog post).

If you encounter cases where scalar evaluation issues matter to performance, raise the issue with Microsoft Support. This is the best way to provide feedback to improve future versions of the product.

Context

StackExchange Database Administrators Q#58442, answer score: 15

Revisions (0)

No revisions yet.