HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

T-SQL Optimization

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
sqloptimizationstackoverflow

Problem

If the answer is 'No', I'm OK with that...

I'm looking to see if this can be optimized any...it's part of a much larger stored proc. CGCode is varchar(50), Year and Month are smallint, FEIN is char(9)

select max(id)
from Table
where 1=1
    and cgcode = 123
    and datefromparts(cast(year as char(4)),cast(month as char(2)),'01')  0 
group by cgcode, year, month, fein


Logical reads from actual execution plan: 1,566,473

Source table raw data 32 million+ records

Estimated rows: 640K, actual 55K, before the Group By kicks in

Implicit conversion warnings on Year/Month/CGCode (comes in as bigint)

Execution time roughly 7.5 seconds, doing a Non-Clustered Index Seek:

End result set is 114 rows (for this CGCode we test with...others vary)

Performance is roughly the same in Prod on significantly better hardware than the Dev box. This is only going to get worse over time, as its pulling everything older than the current month, to populate a history chart in a UI.

What other info can I provide?

Paste the plan

The index being used currently:

CREATE NONCLUSTERED INDEX [COIX_Table_TotalCount] ON [dbo].[Table]
(
    [TotalCount] ASC
)
INCLUDE (   [ID],
    [CGCode],
    [Year],
    [Month],
    [FEIN]) 
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, 
        ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO


CGCode limits the data most...281 values. year is only 3-4 years worth of data, month only 12 options there of course. TotalCount is a bigint, 17K distinct values across 32 million records. No idea what the purpose of the TotalCount column is.

Solution

First of all, I would rewrite this condition. As it is, it can't use any index effectively.:

datefromparts(cast(year as char(4)),cast(month as char(2)),'01') < getdate()


Rewrite it so there is no functions or calculations applied to the columns:

( year < year(getdate()) 
or year = year(getdate()) and month <= month(getdate())
 )


The rewrite would allow an index on (cgcode, year, month) to be used - or even better a (cgcode, year, month, fein) one, considering the GROUP BY parts.

Next, if totalcount is always part of the WHERE clause or thee specific totalcount > 0 condition is always used, you might squeeze even more efficiency by adding it to the INCLUDE part or with a more targeted, filtered index:

-- option A
index (cgcode, year, month, fein, id) 
      include (totalcount)

-- option B
index (cgcode, year, month, fein, id)
      where (totalcount > 0)

Code Snippets

datefromparts(cast(year as char(4)),cast(month as char(2)),'01') < getdate()
( year < year(getdate()) 
or year = year(getdate()) and month <= month(getdate())
 )
-- option A
index (cgcode, year, month, fein, id) 
      include (totalcount)

-- option B
index (cgcode, year, month, fein, id)
      where (totalcount > 0)

Context

StackExchange Database Administrators Q#176296, answer score: 8

Revisions (0)

No revisions yet.