principlesqlMinor
Aggregation in Outer Apply vs Left Join vs Derived table
Viewed 0 times
leftapplyjoinderivedaggregationtableouter
Problem
Consider the following setup. There are three tables involved
Index details:
I have a requirement for which I wrote three different ways to achieve the result. Now I want to know which one is better.
All three queries are running more in more or less in same time. I need some experts advice on which one will perform better. Is there any disadvantage in any of the approach
Approach 1:
Time taken:
Query statistics:
Table 'PERIOD'. Scan count 1, logical reads 2, physical reads 0,
read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob
read-ahead reads 0.
Table '#CCP_DETAILS_TEMP'. Scan count 16, logical reads 688, physical
reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads
0, lob read-ahead reads 0.
Table 'Wo
#CCP_DETAILS_TEMP,Period and ACTUALS_DETAILS#CCP_DETAILS_TEMP will have 50000 records, ACTUALS_DETAILS can have 5000000 records and period table will have 2000 recordsIndex details:
CREATE UNIQUE CLUSTERED INDEX IX_CCP_DETAILS_TEMP
ON #CCP_DETAILS_TEMP (CCP_DETAILS_SID)
CREATE NONCLUSTERED INDEX IXN_ACTUALS_DETAILS_PERIOD_SID_RS_MODEL_SID_CCP_DETAILS_SID_QUANTITY_INCLUSION
ON ACTUALS_DETAILS (PERIOD_SID, CCP_DETAILS_SID, RS_MODEL_SID, QUANTITY_INCLUSION)
INCLUDE( SALES, QUANTITY, DISCOUNT)
CREATE UNIQUE CLUSTERED INDEX IX_PERIOD
ON PERIOD (PERIOD_SID)I have a requirement for which I wrote three different ways to achieve the result. Now I want to know which one is better.
All three queries are running more in more or less in same time. I need some experts advice on which one will perform better. Is there any disadvantage in any of the approach
Approach 1:
Outer ApplyTime taken:
4615 Milli SecondsSELECT c.CCP_DETAILS_SID,
A.PERIOD_SID,
SALES,
QUANTITY
FROM #CCP_DETAILS_TEMP c
CROSS JOIN (SELECT PERIOD_SID
FROM BPIGTN_GAL_APP_DEV_ARM..PERIOD
WHERE PERIOD_SID BETWEEN 577 AND 624)A
OUTER apply (SELECT Sum(SALES),
Sum(QUANTITY)
FROM [DBO].[ACTUALS_DETAILS] ad
WHERE a.PERIOD_SID = ad.PERIOD_SID
AND ad.CCP_DETAILS_SID = c.CCP_DETAILS_SID
AND QUANTITY_INCLUSION = 'Y') oa (sales, quantity)Query statistics:
Table 'PERIOD'. Scan count 1, logical reads 2, physical reads 0,
read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob
read-ahead reads 0.
Table '#CCP_DETAILS_TEMP'. Scan count 16, logical reads 688, physical
reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads
0, lob read-ahead reads 0.
Table 'Wo
Solution
For future questions please post the actual execution plans using
Paste The Plan. I think I was able to reverse engineer all of the relevant details using the screenshots and your
In all three query plans the join between
What is relevant is the table access pattern on
Table 'ACTUALS_DETAILS'. Scan count 1200000, logical reads 3859053, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'ACTUALS_DETAILS'. Scan count 17, logical reads 37134, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
There's a big benefit in not reading the same pages over and over again. While it's true that the pseudo-scan approach may technically read pages that aren't needed you do a lot less IO overall. I also believe that IO difference is directly responsible for the large difference in CPU time between the first query and the other two queries: 36796 ms vs 7731 ms. While the first query ran, it on average kept 9 CPUs fully busy compared to less than 2 busy CPUs for the second and third queries. That's a big disadvantage for the first query and you'd notice it on a busy system or if your queries were forced to run with lower DOP. In my limited experience with
Queries 2 and 3 implement the join to
Personally I would not consider the difference between 4293 and 4200 ms of elapsed time or 7983 and 7731 ms time of CPU time to be statistically significant. It's possible that if you ran the queries a few more times the second query might be faster than the third query. I would use whichever style of query feels more natural to you. Personally, I would use the third query because it better represents what I want the optimizer to do, which is to perform the aggregation as early as possible.
Paste The Plan. I think I was able to reverse engineer all of the relevant details using the screenshots and your
STATISTICS output but I may have gotten a few things wrong. It looks like your plans are running with a DOP of 16, about 50000 rows are returned from #CCP_DETAILS_TEMP, and 24 rows are returned from PERIOD.In all three query plans the join between
#CCP_DETAILS_TEMP and PERIOD is performed in the same way, has the same STATISTICS output, and serves as the outer table in the join to ACTUALS_DETAILS. It looks like SQL Server is doing the right thing for that join and it's not that interesting so I'll skip that part. It's irrelevant for your comparison.What is relevant is the table access pattern on
ACTUALS_DETAILS. All three queries use index seeks on your covering index but the index seeks are performed differently. In the first query, 1200000 seeks are performed using the PERIOD_SID and CCP_DETAILS_SID columns. In the second and third queries, 17 seeks are performed using just PERIOD_SID. I believe that all of the rows are fetched with PERIOD_SID BETWEEN 577 AND 624, so that index seek can effectively be thought of as an parallel index scan that starts with PERIOD_SID = 577 and ends with PERIOD_SID = 624. That results in a big difference in IO between the queries:Table 'ACTUALS_DETAILS'. Scan count 1200000, logical reads 3859053, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'ACTUALS_DETAILS'. Scan count 17, logical reads 37134, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
There's a big benefit in not reading the same pages over and over again. While it's true that the pseudo-scan approach may technically read pages that aren't needed you do a lot less IO overall. I also believe that IO difference is directly responsible for the large difference in CPU time between the first query and the other two queries: 36796 ms vs 7731 ms. While the first query ran, it on average kept 9 CPUs fully busy compared to less than 2 busy CPUs for the second and third queries. That's a big disadvantage for the first query and you'd notice it on a busy system or if your queries were forced to run with lower DOP. In my limited experience with
APPLY I've noticed that the SQL Server query optimizer tends to implement it as a nested loop join with index seeks. This should be considered anecdotal evidence and I'm sure there are exceptions but it explains what you're seeing here.Queries 2 and 3 implement the join to
ACTUALS_DETAILS as a hash join. I assume the idea behind pushing the GROUP BY into the ad derived table was so that SQL Server would perform the aggregation early and you would join to fewer rows and aggregate fewer rows. However, SQL Server rewrote your second query to perform the aggregation early anyway. You can tell because the stream aggregate and hash match operators are to the right of the hash match (right outer join) operator in the second plan. As far as I can tell the second and third query plans are effectively the same, although the third plan does have a few extra 0% cost operators.Personally I would not consider the difference between 4293 and 4200 ms of elapsed time or 7983 and 7731 ms time of CPU time to be statistically significant. It's possible that if you ran the queries a few more times the second query might be faster than the third query. I would use whichever style of query feels more natural to you. Personally, I would use the third query because it better represents what I want the optimizer to do, which is to perform the aggregation as early as possible.
Context
StackExchange Database Administrators Q#162174, answer score: 5
Revisions (0)
No revisions yet.