patternsqlMinor
Join is causing other column data to change?
Viewed 0 times
columnjoincausingotherdatachange
Problem
I'll lead off with saying that I am relatively new to databases and SQL and I am trying to make an addition to a
I am trying to join the
I managed to get it to work the way I would like it to by using a subquery but it is DREADFULLY slow. The query takes over a minute to run and in return, the application that is making the call for that view times out.
Here is what I currently have that works (I know, it's messy):
```
SELECT
cl.CustomerID,
cl.FirstName + ' ' + c.LastName as CustomerDisplay,
cl.InvoiceID,
cl.MealPlanID,
mp.DeliveryDate1 as DeliveryDate,
SUM(CASE WHEN cll.Name = 'Meal' THEN cll.Quantity ELSE 0 END) as RegularQuantity,
SUM(CASE WHEN cll.Name = 'Meal' THEN cll.Quantity * Amount ELSE 0 END) as RegularTotal,
SUM(CASE WHEN cll.Name = 'EXTRAPROTEINMEAL' THEN cll.Quantity ELSE 0 END) as ProteinQuantity,
SUM(CASE WHEN cll.Name = 'EXTRAPROTEINMEAL' THEN cll.Quantity * Amount ELSE 0 END) as ProteinTotal,
SUM(CASE WHEN cll.Name = 'TAX' THEN Amount ELSE 0 END) as TotalTax,
SUM(CASE WHEN cll.Name = 'SNACKS' THEN Amount ELSE 0 END) AS SnackTotal,
-- Here is my subquery that I would like to integrate into this view with a join so that it's much faster
(SELECT SUM(cs.Quantity) FROM Fustomer_Snacks as cs WHERE cs.CustomerID = cl.CustomerID and cs.DeliveryDate = mp.DeliveryDate1) as SnackQuantity,
------------------------------------------------------------------
CAST(SUM((CASE WHEN cll.Name = 'home' or cll.Name = 'work' THEN 1 ELSE 0 END)) as bit) as IsDelivery,
SUM(CASE WHEN (cll.Name = 'home' or cll.Name = 'work' or cll.Name = 'pickup') and cll.Quantity > 0 THEN cll.Quantity * Amount else 0 END) as Delive
View in SQL Server that a previous co-worker created.I am trying to join the
Customer_Snacks table so that I can get the Snack Quantity for each customer's order. The issue I'm having is when I do a LEFT OUTER JOIN which, as far as I can tell, should work, some of the other columns data is drastically changed.I managed to get it to work the way I would like it to by using a subquery but it is DREADFULLY slow. The query takes over a minute to run and in return, the application that is making the call for that view times out.
Here is what I currently have that works (I know, it's messy):
```
SELECT
cl.CustomerID,
cl.FirstName + ' ' + c.LastName as CustomerDisplay,
cl.InvoiceID,
cl.MealPlanID,
mp.DeliveryDate1 as DeliveryDate,
SUM(CASE WHEN cll.Name = 'Meal' THEN cll.Quantity ELSE 0 END) as RegularQuantity,
SUM(CASE WHEN cll.Name = 'Meal' THEN cll.Quantity * Amount ELSE 0 END) as RegularTotal,
SUM(CASE WHEN cll.Name = 'EXTRAPROTEINMEAL' THEN cll.Quantity ELSE 0 END) as ProteinQuantity,
SUM(CASE WHEN cll.Name = 'EXTRAPROTEINMEAL' THEN cll.Quantity * Amount ELSE 0 END) as ProteinTotal,
SUM(CASE WHEN cll.Name = 'TAX' THEN Amount ELSE 0 END) as TotalTax,
SUM(CASE WHEN cll.Name = 'SNACKS' THEN Amount ELSE 0 END) AS SnackTotal,
-- Here is my subquery that I would like to integrate into this view with a join so that it's much faster
(SELECT SUM(cs.Quantity) FROM Fustomer_Snacks as cs WHERE cs.CustomerID = cl.CustomerID and cs.DeliveryDate = mp.DeliveryDate1) as SnackQuantity,
------------------------------------------------------------------
CAST(SUM((CASE WHEN cll.Name = 'home' or cll.Name = 'work' THEN 1 ELSE 0 END)) as bit) as IsDelivery,
SUM(CASE WHEN (cll.Name = 'home' or cll.Name = 'work' or cll.Name = 'pickup') and cll.Quantity > 0 THEN cll.Quantity * Amount else 0 END) as Delive
Solution
The odds are that your problem is that, when you add in the snacks, you have more than one table that returns more than one row for the columns in your
When the DB engine performs a join, it matches however many rows in the first table with value ABC with however many rows in the second table have the same value, ABC (outer joins working a little differently, but let's ignore that for the moment).
I'll guess that for a given
Now, I make no promises that this will be faster than the subquery. However, it should be accurate - and it doesn't matter is something is blazing fast, if it isn't accurate.
GROUP BY statement - probably Customer_Snacks and Customer_Ledger_LineItems.When the DB engine performs a join, it matches however many rows in the first table with value ABC with however many rows in the second table have the same value, ABC (outer joins working a little differently, but let's ignore that for the moment).
I'll guess that for a given
Customer_Ledger row, you tie to one Customer row, one MealPlan row, one (or zero) PromoCode row, and multipleCustomer_Ledger_LineItems rows.
I'm also guessing for one Customer_Ledger row and its associated MealPlan row, you can bring back multiple Customer_Snacks rows.
So, when we put all the rows together, let's say we have:
- 1
Customer row
- 1
Customer_Ledger row
- 1
MealPlan row
- 1
PromoCode row
- 3
Customer_Ledger_LineItems rows
Match all those together, you get 1 1 1 1 3 = 3 rows.
Now, add in 2 Customer_Snacks rows. those are matched against all the rows we already have, so we get 1 1 1 1 3 * 2 = 6 rows.
When we go to take our sums, we'll get double the amount for sums from Customer_Ledger_LineItems rows, because each row now shows up twice. And, we'll get triple the amount for the sum from Customer_Snacks, because those rows each now show up three times. Basically, you're getting a CROSS JOIN between the relevant Customer_Ledger_LineItems rows, and the Customer_Snacks rows.
There are two ways to avoid this. You already employing the first option - use a subquery to get the sums from one of the tables returning multiple rows to our grouped values.
The other option is very similar: build a temporary table (or a CTE) with all relevant values that you need, so that you'll just have one row with the sums you need when you go to join the data in. In your case, this would look like:
;WITH SnackSums AS
(SELECT cs.CustomerID
,cs.DeliveryDate
,SUM(cs.Quantity) as SnackQuantity
FROM Customer_Snacks cs
GROUP BY cs.CustomerID, cs.DeliveryDate
)
and would go right before the SELECT for your query. Then, you just put ss.SnackQuantity where the subquery was, and add:
LEFT OUTER JOIN
SnackSums ss
ON (cl.CustomerId = ss.CustomerId AND ss.DeliveryDate = mp.DeliveryDate1)
right before your GROUP BY. Finally, you'll also need to add ss.SnackQuantity to the GROUP BY`, since in the context of your main query this is a stand-alone, non-aggregated value.Now, I make no promises that this will be faster than the subquery. However, it should be accurate - and it doesn't matter is something is blazing fast, if it isn't accurate.
Code Snippets
;WITH SnackSums AS
(SELECT cs.CustomerID
,cs.DeliveryDate
,SUM(cs.Quantity) as SnackQuantity
FROM Customer_Snacks cs
GROUP BY cs.CustomerID, cs.DeliveryDate
)LEFT OUTER JOIN
SnackSums ss
ON (cl.CustomerId = ss.CustomerId AND ss.DeliveryDate = mp.DeliveryDate1)Context
StackExchange Database Administrators Q#177050, answer score: 8
Revisions (0)
No revisions yet.