HiveBrain v1.2.0
Get Started
← Back to all entries
debugsqlMinor

Why does my query end up with two seeks instead of one and how do I fix that?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
fixwhyseekshowwithqueryinsteadtwoonethat

Problem

I have these two tables:

CREATE TABLE [TaskItems] (
    [ItemId]             UNIQUEIDENTIFIER NOT NULL UNIQUE PRIMARY KEY,
    [LastOperationTime]  DATETIME NULL DEFAULT GETUTCDATE(),
    [OwnerId]            UNIQUEIDENTIFIER NOT NULL
    -- more columns which are irrelevant
)

CREATE INDEX [TaskItemsByOwnerIdAndLastOpTimeIndex]
     ON [TaskItems](OwnerId, LastOperationTime)

CREATE TABLE [TaskOwners] (
    [OwnerId]         UNIQUEIDENTIFIER UNIQUE NOT NULL DEFAULT NEWID() UNIQUE PRIMARY KEY
    -- more columns which are irrelevant
)


and this query:

SELECT
  COUNT( OwnerId ),
  SUM(CASE WHEN LastOperationTime > DATEADD(day, -3, GETUTCDATE()) THEN 1 ELSE 0 END), 
  SUM(CASE WHEN LastOperationTime > DATEADD(day, -5, GETUTCDATE()) THEN 1 ELSE 0 END)
FROM 
(SELECT
  O.OwnerId,
  MAX( LastOperationTime) as LastOperationTime
    FROM
      TaskOwners AS O 
      LEFT OUTER JOIN TaskItems AS TI ON O.OwnerId = TI.OwnerId
  GROUP BY O.OwnerId) AS Whatever


Note that LastOperationTime is declared NULL which is a design error but fixing it would require recreating the index which is kind of problematic in production database. Anyway at any given moment there're no items which have LastOperationTime set to null.

TaskOwner table contains about 25 thousand rows. TaskItems contains about ten million rows and some rare "owners" have a million or so "items" owned while most of the "owners" own no "items" or some of them own hundreds or thousand "items".

So I run the query and request the actual execution plan and it has two index seeks (the two tree items in the lower right).

One of those has "object" TaskItemsByOwnerIdAndLastOpTimeIndex, "predicate" TaskObjects.LastOperationTime IS NULL and takes about 49% time. The other one has "object" TaskItemsByOwnerIdAndLastOpTimeIndex, "predicate" TaskObjects.LastOperationTime IS NOT NULL and takes 47% time. Both have "seek predicates" TaskObjects.OwnerId==TaskOwners.OwnerId and "backwards" directio

Solution

Try:

LEFT OUTER JOIN TaskItems AS TI ON O.OwnerId = TI.OwnerId
AND LastOperationTime IS NOT NULL


The engine could be grabbing the Null row separately, as it's a slightly different scenario to not finding one. But as you essentially want the same behaviour, just out the explicit filter in there, within the ON clause of your LEFT JOIN. Then the only NULLs will be as a result of the outer join.

Code Snippets

LEFT OUTER JOIN TaskItems AS TI ON O.OwnerId = TI.OwnerId
AND LastOperationTime IS NOT NULL

Context

StackExchange Database Administrators Q#75072, answer score: 3

Revisions (0)

No revisions yet.