HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Effect on execution plans when a table variable has a primary key

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
effectprimarytableexecutionhaswhenvariableplanskey

Problem

Having read a great deal about the differences between temporary tables and table variables in SQL Server, I am experimenting with switching from mostly using temporary tables to mostly using table variables. (They seem to be a better fit for the types of queries I usually work with.)

In these queries, the tables hold unique identifiers that drive the lookup process. It's been my habit, when working with temporary tables, to include a PRIMARY KEY constraint so that the query optimizer is aware that it won't see any duplicates. However, given that the optimizer (in most circumstances, and for my queries) assumes that a table variable only holds a single row*, which is unique by definition, is the query optimizer going to make choices any differently if there's a PRIMARY KEY constraint?

* Technically, it assumes there are no rows, but replaces the zero with a one. (Because the zero interacts very poorly with the rest of the estimation process.) But it also depends on whether the table variable is populated or not when the query is compiled. There is some background information here: What's the difference between a temp table and table variable in SQL Server?.

I'm currently using SQL Server 2014, but I would be curious if the behavior changes in newer versions.

As has been pointed out, a PRIMARY KEY constraint comes with a clustered index that gives the query optimizer more choices on how to get data out of the table variable. I was aware of this and thinking about the rest of the query plan. But after attempting to clarify my question, I've decided that the question I was attempting to ask was too broad and probably particular to my extreme situation. (Nothing but navigational-type queries into half-a-trillion-row tables with an expectation of sub-second performance.) So I am going to leave my question as-is.

Solution

...is the query optimizer going to make choices any differently if there's a PRIMARY KEY constraint?

Yes it might do. Estimating one row (with the understanding that estimates can be incorrect) is different from knowing that the table contains only unique values. Certain plan space explorations require a key, for example.

A good general rule of thumb is to provide as much information about the data and query task as you can to the optimizer. If there is a key, say so explicitly. It's not as if declaring the key will add much cost (beyond a little keyboard work) in most cases.

I personally rarely use table variables. The lack of statistics (including distribution and density) and cardinality information (all separate considerations) provides less information to the optimizer than an equivalent temporary table. My experience has very much been that table variable plans do not adapt as well to changing circumstances over time.

I only use a table variable when there are special reasons to be sure that it will always be adequate from a query optimization point of view. Only you have enough information about your databases and queries to say whether that is true in your case or not.

The question is rather broad (without a specific example), and so is this answer.

Context

StackExchange Database Administrators Q#164463, answer score: 8

Revisions (0)

No revisions yet.