gotchaModerate
What is the difference between `ORDER BY` and `PARTITION BY` arguments in the `OVER` clause?
Viewed 0 times
theorderpartitionwhatargumentsdifferencebetweenandclauseover
Problem
These queries below both give me exactly the same results, which I assume is because of my dataset rather than how the arguments work.
When using an
On a slightly different note, why not use the term
Query 1:
Query 2:
When using an
OVER clause, what is the difference between ORDER BY and PARTITION BY.On a slightly different note, why not use the term
GROUP BY instead of the more complicated sounding PARTITION BY, since it seems that using partitioning in this case seems to achieve the same thing as grouping. I know you can alter these inner partitions and that those changes then reflect in the table. Is that the reason?Query 1:
SELECT
CP.iYear,
AVG(AVG(CP.mUpgradeCost)) OVER(ORDER BY iYear) AS [avg]
FROM ProForma.dbo.CapitalProject CP
GROUP BY CP.iYearQuery 2:
SELECT
CP.iYear,
AVG(AVG(CP.mUpgradeCost)) OVER(PARTITION BY iYear) AS [avg]
FROM ProForma.dbo.CapitalProject CP
GROUP BY CP.iYearSolution
The
As for query 2, are you trying to create a running average or something? Not even sure what you would expect that query to return.
If you remove
Here's an example that will hopefully explain the use of
Random table:
Now,
will return
So you can see that there are 3 rows with a=X and 2 rows with a=Y. The
... gives you:
PARTITION BY works as a "windowed group" and the ORDER BY does the ordering within the group. However, because you're using GROUP BY CP.iYear, you're effectively reducing your window to just a single row (GROUP BY is performed before the windowed function). The average of a single row will be the value of that row, in your case AVG(CP.mUpgradeCost).As for query 2, are you trying to create a running average or something? Not even sure what you would expect that query to return.
If you remove
GROUP BY CP.iYear and change AVG(AVG(...)) to just AVG(), you'll see the difference.Here's an example that will hopefully explain the use of
PARTITION BY and/or ORDER BY:Random table:
a b
----- ------
X 1001
X 1002
X 1003
Y 1001
Y 1002Now,
SELECT a, b, COUNT(*) OVER (PARTITION BY a) FROM r;will return
a b
----- ------ ------
X 1001 3
X 1002 3
X 1003 3
Y 1001 2
Y 1002 2So you can see that there are 3 rows with a=X and 2 rows with a=Y. The
ORDER BY clause comes into play when you want an ordered window function, like a row number or a running total.SELECT a, b, ROW_NUMBER() OVER (PARTITION BY a ORDER BY b),
SUM(b) OVER (PARTITION BY a ORDER BY b ROWS UNBOUNDED PRECEDING)
FROM r;... gives you:
a b
----- ------ ------ ------
X 1001 1 1001
X 1002 2 2003
X 1003 3 3006
Y 1001 1 1001 <- row number and running total resets here; new partition
Y 1002 2 2003Code Snippets
a b
----- ------
X 1001
X 1002
X 1003
Y 1001
Y 1002SELECT a, b, COUNT(*) OVER (PARTITION BY a) FROM r;a b
----- ------ ------
X 1001 3
X 1002 3
X 1003 3
Y 1001 2
Y 1002 2SELECT a, b, ROW_NUMBER() OVER (PARTITION BY a ORDER BY b),
SUM(b) OVER (PARTITION BY a ORDER BY b ROWS UNBOUNDED PRECEDING)
FROM r;a b
----- ------ ------ ------
X 1001 1 1001
X 1002 2 2003
X 1003 3 3006
Y 1001 1 1001 <- row number and running total resets here; new partition
Y 1002 2 2003Context
StackExchange Database Administrators Q#168037, answer score: 16
Revisions (0)
No revisions yet.