HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Performance issue with TOP (1) on a nullable column

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
topcolumnwithissuenullableperformance

Problem

I need help to identify the reason for a performance issue.

Actually following query sits in a scalar-valued-function but that is not the reason for the issue since it needs the same time as query:

SELECT TOP (1) dd.gsp_part_price 
FROM  tabdata 
       INNER JOIN locsl 
               ON tabdata.fisl = locsl.idsl 
       INNER JOIN locgsp 
               ON locsl.figsp = locgsp.idgsp 
       INNER JOIN tabdatadetail dd
               ON dd.fidata = tabdata.iddata 
       INNER JOIN tabsparepart sp 
               ON dd.fisparepart = sp.idsparepart 
WHERE  sp.sparepartname = '1270-3132' 
       AND dd.gsp_part_price IS NOT NULL 
ORDER BY Claim_Submitted_Date DESC


This query returns the price immediately for most spareparts but some parts need several minutes to return NULL. So this query returns 940 rows immediately for the same part as above where every gsp_part_price is NULL which is the reason for the NULL above:

SELECT dd.*, sp.* 
FROM tabdatadetail dd 
INNER JOIN tabsparepart sp 
    on dd.fisparepart=sp.idsparepart
WHERE sp.sparepartname='1270-3132'


Why does the first query needs 3 minutes to return NULL whereas the query below returns immediately the 940 rows? I must admit that i've still problems to interpret execution plans.

Note that the first query also returns immediately if i remove the IS NOT NULL CHECK:

dd.gsp_part_price IS NOT NULL


All involved tables are linked via foreign-keys, i have tried to add an index on gsp_part_price which had no effect.

Here's the query execution plan:

Edit: Here are the stats on the PK and FK (to parent table tabData) indexes:

```
DBCC SHOW_STATISTICS('dbo.tabDataDetail','PK_tabDataDetail') WITH STAT_HEADER;
PK_tabDataDetail Nov 6 2013 10:04AM 64327191 387089 183 1 8 NO

DBCC SHOW_STATISTICS('dbo.tabDataDetail','IX_fiData') WITH STAT_HEADER;
IX_fiData Nov 6 2013 10:04AM 64327191 378150 198 0,2441889 4 NO

DBCC SHOW_STATISTICS('dbo.tabDataDetai

Solution

The query takes a lot of time because it finds no rows.

The plan first is setup top pick one row from tabSparePart where sparepartname = '1270-3132'. It then moves along to the index scan of tabData. For each row fetched from tabData there is a clustered index seek in locSL followed by a clustered index seek in tabDataDetail which also checks for not null on gsp_part_price and finally returns the values of fiSparePart and GSP_Part_Price. fiSparePart is then compared against the idSparePart returned from the Index Seek in tabSparePart. If they are the same the query immediately terminates and return the gsp_part_price. If they are not the same the index scan in tabData moves to the next record and all the checks are executed again. That keeps happening until there is a row returned or all rows in tabData have been processed.

As it turns out you get a totally different query plan when you do top(1000).

You can use the optimize for hint to make SQL Server think you actually want 1000 rows when you only need one.

Your scalar valued function would then look something like this:

CREATE FUNCTION [dbo].[FunctionName]()
RETURNS INT
AS
BEGIN

  DECLARE @TOP INT
  DECLARE @RET INT

  SET @TOP = 1

  SELECT TOP (@TOP) @RET = dd.gsp_part_price 
  FROM  tabdata 
          INNER JOIN locsl 
                  ON tabdata.fisl = locsl.idsl 
          INNER JOIN locgsp 
                  ON locsl.figsp = locgsp.idgsp 
          INNER JOIN tabdatadetail dd
                  ON dd.fidata = tabdata.iddata 
          INNER JOIN tabsparepart sp 
                  ON dd.fisparepart = sp.idsparepart 
  WHERE  sp.sparepartname = '1270-3132' 
          AND dd.gsp_part_price IS NOT NULL 
  ORDER BY Claim_Submitted_Date DESC 
  OPTION (OPTIMIZE FOR(@TOP = 1000))

  RETURN @RET

END

Code Snippets

CREATE FUNCTION [dbo].[FunctionName]()
RETURNS INT
AS
BEGIN

  DECLARE @TOP INT
  DECLARE @RET INT

  SET @TOP = 1

  SELECT TOP (@TOP) @RET = dd.gsp_part_price 
  FROM  tabdata 
          INNER JOIN locsl 
                  ON tabdata.fisl = locsl.idsl 
          INNER JOIN locgsp 
                  ON locsl.figsp = locgsp.idgsp 
          INNER JOIN tabdatadetail dd
                  ON dd.fidata = tabdata.iddata 
          INNER JOIN tabsparepart sp 
                  ON dd.fisparepart = sp.idsparepart 
  WHERE  sp.sparepartname = '1270-3132' 
          AND dd.gsp_part_price IS NOT NULL 
  ORDER BY Claim_Submitted_Date DESC 
  OPTION (OPTIMIZE FOR(@TOP = 1000))

  RETURN @RET

END

Context

StackExchange Database Administrators Q#54327, answer score: 4

Revisions (0)

No revisions yet.