HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

OPTION FORCE ORDER improves performance until rows are deleted

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
rowsorderforceuntilareoptiondeletedperformanceimproves

Problem

I have a somewhat complex SQL Server 2008 query (about 200 lines of fairly dense SQL) that wasn't performing as I needed it. Over time, performance dropped from about .5 seconds to about 2 seconds.

Taking a look at the execution plan, it was pretty obvious that by reordering the joins, performance could be improved. I did, and it did... down to about .3 seconds. Now the query has the "OPTION FORCE ORDER" hint, and life is good.

Along comes me today, cleaning up the database. I archive about 20% of the rows, taking no action in the relevant database except deleting rows... the execution plan gets TOTALLY hosed. It completely misjudges how many rows certain subtrees will return, and (for example) replaces a:



with



Now the query time spikes from about .3s to about 18s. (!) Just because I deleted rows. If I remove the query hint I'm back to about 2s query time. Better, but worse.

I've reproduced the issue after restoring the database to multiple locations and servers. Simply deleting about 20% of rows from each table always causes this issue.

  • Is this normal for a forced join order to make the query estimates to be


completely inaccurate (and thus query times unpredictable)?

  • Should I just expect that I'll have to


either accept sub-optimal query performance, or watch it like a hawk
and frequently manually edit query hints? Or maybe hint every join as well?
.3s to 2s is a big hit to take.

  • Is it obvious why the


optimizer blew up after deleting rows? For example, "yes, it took a sample
scan, and because I archived most of the rows earlier in the data
history the sample yielded sparse results, so it underestimated the need for
a sorted hash operation"?

If you'd like to see execution plans, please suggest a location I can post them. Otherwise, I've sampled the most stunning bit. Here's the fundamental mis-estimate, numbers in parens are (estimated:actual) rows.

```
/ Clustered Index Scan (908:7229)
Nested Loops (Inner Join) --<

Solution

Is this normal for a forced join order to make the query estimates to
be completely inaccurate (and thus query times unpredictable)?

The use of FORCE ORDER isn't making estimates inaccurate, the deletion of rows did. Forcing an update of statistics on the table may improve the estimation accuracy.

Should I just expect that I'll have to either accept sub-optimal query
performance, or watch it like a hawk and frequently manually edit
query hints? Or maybe hint every join as well? .3s to 2s is a big hit
to take.

Preferable would be to ensure the optimiser is given the information it needs to generate the best plan, without using the FORCE ORDER hint. By doing so, it should cope better with changes to the underlying data distribution without requiring manual intervention. That said, if the nature of the data is such that cardinality could vary significantly hour by hour or day by hour, consider using a plan guide to ensure the plan is fixed.

Is it obvious why the optimizer blew up after deleting rows? For
example, "yes, it took a sample scan, and because I archived most of
the rows earlier in the data history the sample yielded sparse
results, so it underestimated the need for a sorted hash operation"?

You didn't mention the row counts in the problem tables but it's likely that the deletions either:

  • didn't remove enough rows to to trigger a statistics update. This should occur when 20% of rows have been modified but there is the option to use trace flag 2371 to enable a dynamic threshold.



  • did trigger a statistics update but the sample gathered wasn't representative. Correct this by running a manual update WITH FULLSCAN.



You could also be running into good old fashioned parameter sniffing problems, for which there are myriad options to work around. WITH RECOMPILE might be an expensive option to specify with a query this large but it's worth investigating at both procedure and statement level.

Context

StackExchange Database Administrators Q#22337, answer score: 7

Revisions (0)

No revisions yet.