HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Inline Select Causing Incredibly slow query performance

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
queryslowcausingperformanceinlineselectincredibly

Problem

I have a query that is used to populate an aggregate table for reporting purposes. The query comes from another developer here at the company I work for but is my job to make it run fast. So far all of my early attempts failed. I've tried several things with this query and this is where I am at the moment. I have shaved about a half an hour of it's load time already but am stuck and thinking I will probably just have to re-do the whole thing. I'm hoping someone on here can see what if I'm missing anything and give me some pointers on how to fix this query.

```
SELECT P.CompanyID,
P.CompanyName,
P.StoreID,
P.StoreName,
P.ReportDate,
Isnull((SELECT Sum(FT.GrossSales - Isnull(FP.PaymentAmount, 0)) AS PullNet
FROM FactSalesTransaction AS FT
LEFT JOIN (SELECT TransactionID,
DimStoreID,
DimBusinessDateID,
Sum(PaymentAmount) AS PaymentAmount
FROM FactSalesPayment
WHERE DimPaymentTypeID <> 2
AND ModStatusFlg <> 'D'
GROUP BY TransactionID,
DimStoreID,
DimBusinessDateID) AS FP
ON FP.TransactionID = FT.TransactionID
AND FP.DimStoreID = FT.DimStoreID
INNER JOIN DimCalendar AS C
ON FT.DimBusinessDateID = C.DimCalendarID
AND FP.DimBusinessDateID = C.DimCalendarID AND C.CalendarDate >= '12/4/2012'
WHERE FT.DimStoreID = P.DimStoreID
AND FT.DimBusinessDateID = P.DimBusinessDateID

AND FT.ModStatusFlg <> 'D'), 0) AS StoreCash,
SR.CashDeposit AS StoreResp,
SN.StoreNet,

Solution

Okay so this is what I actually did to make this query that used to run in about an hour and a half run in under a minute. First I did a little more digging to see exactly what this was doing. There are a few major sub selects in the outer most portion of the query. They look like this.

Isnull((SELECT Sum(FT.GrossSales - Isnull(FP.PaymentAmount, 0)) AS PullNet
               FROM   FactSalesTransaction AS FT
                      LEFT JOIN (SELECT TransactionID,
                                        DimStoreID,
                                        DimBusinessDateID,
                                        Sum(PaymentAmount) AS PaymentAmount
                                 FROM   FactSalesPayment
                                 WHERE  DimPaymentTypeID <> 2
                                        AND ModStatusFlg <> 'D'
                                 GROUP  BY TransactionID,
                                           DimStoreID,
                                           DimBusinessDateID) AS FP
                        ON FP.TransactionID = FT.TransactionID
                           AND FP.DimStoreID = FT.DimStoreID
                      INNER JOIN DimCalendar AS C
                        ON FT.DimBusinessDateID = C.DimCalendarID
                           AND FP.DimBusinessDateID = C.DimCalendarID  AND C.CalendarDate >= '12/4/2012'
               WHERE  FT.DimStoreID = P.DimStoreID
                      AND FT.DimBusinessDateID = P.DimBusinessDateID

                      AND FT.ModStatusFlg <> 'D'), 0) AS StoreCash


These select statements are joining two of my largest tables. FactSalesTransaction(105 Million Records) to FactSalesPayment(102 Million records) and it is doing so in a select within a select. Which essentially means that for every row returned it is executing this query. Well this query is usually run for about seven days worth of data and therefore returns about 19,000 Records. That Mean's the 3 sub selects to these massive tables need to be executed 19,000 times. Bingo I think I've found where my performance loss was. So I switched those queries to a left join. Nothing complicated just made it so they only had to join once. The left join to replace looks like this.

LEFT JOIN (SELECT Sum(FT.GrossSales - Isnull(FP.PaymentAmount, 0)) AS StoreCash, FT.DimStoreID, FT.DimBusinessDateID
               FROM   FactSalesTransaction AS FT
                      LEFT JOIN (SELECT TransactionID,
                                        DimStoreID,
                                        Sum(PaymentAmount) AS PaymentAmount
                                 FROM   FactSalesPayment
                                 WHERE  DimPaymentTypeID <> 2
                                        AND ModStatusFlg <> 'D'
                                 GROUP  BY TransactionID,
                                           DimStoreID) AS FP
                        ON FP.TransactionID = FT.TransactionID
                           AND FP.DimStoreID = FT.DimStoreID
               WHERE   FT.ModStatusFlg <> 'D'
               GROUP BY FT.DimStoreID, FT.DimBusinessDateID)  AS FP
                       ON FP.DimStoreID = P.DimStoreID
                      AND FP.DimBusinessDateID = P.DimBusinessDateID


As you can see I really didn't change much to the select itself just altered it to not run 19,000 times. The next thing I did was altered the query to a stored procedure where a take the date range the user or in this case the ETL process gives(usually 7 days back) Select from DimCalendar the DimCalendarID for that day so the query is using integers instead of date times and overall has less records to join. Making the final query look like this.

```
CREATE PROCEDURE [dbo].[NoneOfYourBusinessWhatINamedIt]
@ETLLoadDate DATETIME
AS
BEGIN
DECLARE @DimCal Int = (SELECT DimCalendarID FROM DimCalendar WHERE CalendarDate = @ETLLoadDate)
END
SELECT P.CompanyID,
P.CompanyName,
P.StoreID,
P.StoreName,
P.ReportDate,
FP.StoreCash,
SR.CashDeposit AS StoreResp,
SN.StoreNet,
P.DimEmployeeID AS EmpID,
P.EmpName,
P.RegisterID,
P.PullNumber,
ISNULL(SUM(PC.PullCash), 0) AS PullCash,
P.PullResp AS PullResp,
ISNULL(SUM(PN.PullNet),0) AS PullNet
FROM (SELECT C.CompanyID,
C.CompanyName,
S.StoreID,
S.StoreName,
F.DimEmployeeID,
E.FirstName + ' ' + E.LastName AS EmpName,
CASE
WHEN F.PullDrawerStartTime <> '1900-01-01' THEN F.PullDrawerStartTime
ELSE Isnull(Cast((SELECT TOP 1 Dateadd(SECOND, 1, PullDrawerEndTime)
FROM FactPullDrawer
WHERE PullDrawerEndTime 'D' AND DimBusinessDateID = @DimCal
GROUP BY DimStoreID,
DimBusinessDateID) AS SN
ON SN.DimStoreID =

Code Snippets

Isnull((SELECT Sum(FT.GrossSales - Isnull(FP.PaymentAmount, 0)) AS PullNet
               FROM   FactSalesTransaction AS FT
                      LEFT JOIN (SELECT TransactionID,
                                        DimStoreID,
                                        DimBusinessDateID,
                                        Sum(PaymentAmount) AS PaymentAmount
                                 FROM   FactSalesPayment
                                 WHERE  DimPaymentTypeID <> 2
                                        AND ModStatusFlg <> 'D'
                                 GROUP  BY TransactionID,
                                           DimStoreID,
                                           DimBusinessDateID) AS FP
                        ON FP.TransactionID = FT.TransactionID
                           AND FP.DimStoreID = FT.DimStoreID
                      INNER JOIN DimCalendar AS C
                        ON FT.DimBusinessDateID = C.DimCalendarID
                           AND FP.DimBusinessDateID = C.DimCalendarID  AND C.CalendarDate >= '12/4/2012'
               WHERE  FT.DimStoreID = P.DimStoreID
                      AND FT.DimBusinessDateID = P.DimBusinessDateID

                      AND FT.ModStatusFlg <> 'D'), 0) AS StoreCash
LEFT JOIN (SELECT Sum(FT.GrossSales - Isnull(FP.PaymentAmount, 0)) AS StoreCash, FT.DimStoreID, FT.DimBusinessDateID
               FROM   FactSalesTransaction AS FT
                      LEFT JOIN (SELECT TransactionID,
                                        DimStoreID,
                                        Sum(PaymentAmount) AS PaymentAmount
                                 FROM   FactSalesPayment
                                 WHERE  DimPaymentTypeID <> 2
                                        AND ModStatusFlg <> 'D'
                                 GROUP  BY TransactionID,
                                           DimStoreID) AS FP
                        ON FP.TransactionID = FT.TransactionID
                           AND FP.DimStoreID = FT.DimStoreID
               WHERE   FT.ModStatusFlg <> 'D'
               GROUP BY FT.DimStoreID, FT.DimBusinessDateID)  AS FP
                       ON FP.DimStoreID = P.DimStoreID
                      AND FP.DimBusinessDateID = P.DimBusinessDateID
CREATE PROCEDURE [dbo].[NoneOfYourBusinessWhatINamedIt] 
@ETLLoadDate DATETIME 
AS
BEGIN
DECLARE @DimCal Int = (SELECT DimCalendarID  FROM DimCalendar WHERE CalendarDate = @ETLLoadDate)
END
SELECT P.CompanyID,
       P.CompanyName,
       P.StoreID,
       P.StoreName,
       P.ReportDate,
       FP.StoreCash,
       SR.CashDeposit AS StoreResp,
       SN.StoreNet,
       P.DimEmployeeID AS EmpID,
       P.EmpName,
       P.RegisterID,
       P.PullNumber,
       ISNULL(SUM(PC.PullCash), 0) AS PullCash,
       P.PullResp  AS PullResp,
       ISNULL(SUM(PN.PullNet),0) AS PullNet
FROM   (SELECT C.CompanyID,
               C.CompanyName,
               S.StoreID,
               S.StoreName,
               F.DimEmployeeID,
               E.FirstName + ' ' + E.LastName                           AS EmpName,
               CASE
                 WHEN F.PullDrawerStartTime <> '1900-01-01' THEN F.PullDrawerStartTime
                 ELSE Isnull(Cast((SELECT TOP 1 Dateadd(SECOND, 1, PullDrawerEndTime)
                                   FROM   FactPullDrawer
                                   WHERE  PullDrawerEndTime < F.PullDrawerEndTime
                                          AND DimStoreID = F.DimStoreID
                                          AND DimRegisterID = F.DimRegisterID
                                          AND DimBusinessDateID = F.DimBusinessDateID
                                          --AND DimBusinessDateID = @DimCal
                                   ORDER  BY PullDrawerEndTime DESC) AS DATETIME), BD.CalendarDate + Isnull(Cast(Cast(ST.SiteSettingValue AS TIME) AS DATETIME), Cast('4:00:00 AM' AS DATETIME)))
               END                                                      AS PullDrawerStartTime,
               F.PullDrawerEndTime,
               BD.CalendarDate                                          AS ReportDate,
               R.RegisterID,
               R.DimRegisterID,
               (SELECT Count(PullDrawerEndTime)
                FROM   FactPullDrawer
                WHERE  PullDrawerEndTime < F.PullDrawerEndTime
                       AND DimStoreID = F.DimStoreID
                       AND DimRegisterID = F.DimRegisterID
                       AND DimBusinessDateID = F.DimBusinessDateID) + 1 AS PullNumber,
               Isnull(F.Amount, 0)                                      AS PullResp,
               F.DimStoreID,
               F.DimBusinessDateID
        FROM   FactPullDrawer AS F
               INNER JOIN DimCompany AS C
                 ON C.DimCompanyID = F.DimCompanyID
               INNER JOIN DimStore AS S
                 ON S.DimStoreID = F.DimStoreID
               INNER JOIN DimCalendar AS BD
                 ON BD.DimCalendarID = F.DimBusinessDateID
               INNER JOIN DimEmployee AS E
                 ON F.DimEmployeeID = E.DimEmployeeID
               INNER JOIN DimRegister AS R
                 ON R.DimRegisterID = F.DimRegisterID
               LEFT JOIN DimSiteSettings AS ST
           

Context

StackExchange Database Administrators Q#29891, answer score: 7

Revisions (0)

No revisions yet.