HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlMinor

How to teach the optimizer to use index instead of fts with join on data logging master/detail tables?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
loggingtablesthedetailhowftswithjoininsteadmaster

Problem

SQL Server 2012 optimizer does not get it right.

Test case, summary:

This is a simplified test scenario. DDL statements at the bottom.

I have two tables for data logging, A and B. There is a 1:n relationship - A has header records with a datetime called a_timeand B has detail records, with a field B.akeyare referencing A.id, and fields name and (data).

A has approx. 25,000,000 records, B has aprox. 500,000,000 records. B has roughly 200 records referencing
each record in A. One A and approx. 200 B records are inserted at a time together every
five minutes, reflected by A.a_time.

Clustered indexes are the primary keys, id, type int identity.

B has one non-clustered index, named IX_B_akey, on B.akey.

A.a_time is (non-clustered) indexed, too.

Now this query:

SELECT A.a_time, B.*
 FROM B
   join A on B.akey = A.id
 where  
    A.a_time > '2017-01-13T01:30:00' and A.a_time < '2017-01-14T07:30:00'
       and B.name in ('name33', 'name55', 'name66')


takes about 3 minutes on my database server. Execution plan: here (see below for more exact execution plans)

When I add a simple hint to use
IX_B_akey:

SELECT A.a_time, B.*
 FROM B
   with (index(IX_B_akey))
   join A on B.akey = A.id
 where  
    A.a_time > '2017-01-13T01:30:00' and A.a_time < '2017-01-14T07:30:00'
       and B.name in ('name33', 'name55', 'name66')


it runs in less than one second. Execution plan:here (see below for more exact execution plans)

This does not change when I manually update statistics on both tables.

The query plan for the query without the hint shows the server will do a table
scan on B, looking for matching names. It is no surprise that this will take a while. With the hint, it uses the index and does a lookup via index for the
B records referencing matching A records. This is much faster.

I do not want to put query optimizer code into my software. Also, I use NHibernate.
Although it is possible, it would be

Solution

Bad plans come from hard choices. Instead of making the optimizer choose between a plan with two nested loops joins and a plan with a big parallel hash join, you could reorganize B to optimize for the access path from A to B.

The best index here is probably to make the clustered PK of B (akey,id). Then there would be only one nested loops join in the already-faster plan, making it obviously better than the parallel hash join plan.

Context

StackExchange Database Administrators Q#214364, answer score: 2

Revisions (0)

No revisions yet.