snippetsqlModerate
how to calculate difference of first row and last row fields value in each group
Viewed 0 times
lasteachgroupfieldsvaluedifferencefirstcalculatehowand
Problem
I have table with structure like this:
currently i'm using group by to get min,max,avg to take hourly reports like this:
How i can calculate differences of last and first rows value in each group to generate something like:
Thanks.
+-------+------------------+
| Value | Date |
+-------+------------------+
| 10 | 10/10/2010 10:00 |
| 11 | 10/10/2010 10:15 |
| 15 | 10/10/2010 10:30 |
| 15 | 10/10/2010 10:45 |
| 17 | 10/10/2010 11:00 |
| 18 | 10/10/2010 11:15 |
| 22 | 10/10/2010 11:30 |
| 30 | 10/10/2010 11:45 |
+-------+------------------+currently i'm using group by to get min,max,avg to take hourly reports like this:
+-----+-----+-------+------------------+
| min | max | avg | Date |
+-----+-----+-------+------------------+
| 10 | 15 | 12.75 | 10/10/2010 10:00 |
| 17 | 30 | 21.75 | 10/10/2010 11:00 |
+-----+-----+-------+------------------+How i can calculate differences of last and first rows value in each group to generate something like:
+-----+-----+-------+------+------------------+
| min | max | avg | diff | Date |
+-----+-----+-------+------+------------------+
| 10 | 15 | 12.75 | 5 | 10/10/2010 10:00 |
| 17 | 30 | 21.75 | 13 | 10/10/2010 11:00 |
+-----+-----+-------+------+------------------+Thanks.
Solution
You are not showing the query you are using to obtain the results without
Also, you are not explaining what first and last mean. In this answer, it is assumed that first stands for earliest in the group (according to the
One way to throw in
First, add two more aggregated columns,
Next, join the aggregated result set back to the original table on
Note that the above assumes the
An alternative, if you are on SQL Server 2005 or later version, could be to use window aggregate functions
As you can see, the first common table expression (CTE) merely returns all rows and adds a calculated column
The second CTE adds two more columns to the above result. It uses window aggregate functions
At this point,
diff. I'm assuming it is something like this:SELECT
min = MIN(Value),
max = MAX(Value),
avg = AVG(Value), -- or, if Value is an int, like this, perhaps:
-- AVG(CAST(Value AS decimal(10,2))
Date = DATEADD(HOUR, DATEDIFF(HOUR, 0, Date), 0)
FROM atable
GROUP BY
DATEADD(HOUR, DATEDIFF(HOUR, 0, Date), 0)
;Also, you are not explaining what first and last mean. In this answer, it is assumed that first stands for earliest in the group (according to the
Date value) and, similarly, last means latest in the group.One way to throw in
diff could be like this:First, add two more aggregated columns,
minDate and maxDate, to the original query:SELECT
min = MIN(Value),
max = MAX(Value),
avg = AVG(Value),
minDate = MIN(Date),
maxDate = MAX(Date),
Date = DATEADD(HOUR, DATEDIFF(HOUR, 0, Date), 0)
FROM atable
GROUP BY
DATEADD(HOUR, DATEDIFF(HOUR, 0, Date), 0)
;Next, join the aggregated result set back to the original table on
minDate and on maxDate (separately) to access the corresponding Values:SELECT
g.min,
g.max,
g.avg,
diff = last.Value - first.Value,
g.Date
FROM (
SELECT
min = MIN(Value),
max = MAX(Value),
avg = AVG(Value),
minDate = MIN(Date),
maxDate = MAX(Date),
Date = DATEADD(HOUR, DATEDIFF(HOUR, 0, Date), 0)
FROM atable
GROUP BY
DATEADD(HOUR, DATEDIFF(HOUR, 0, Date), 0)
) g
INNER JOIN atable first ON first.Date = g.minDate
INNER JOIN atable last ON last .Date = g.maxDate
;Note that the above assumes the
Date values (at least those that happen to be first or last in their corresponding hours) to be unique, or you would get more than one row for some of the hours in the output.An alternative, if you are on SQL Server 2005 or later version, could be to use window aggregate functions
MIN() OVER (...) and MAX() OVER (...) to calculate Values corresponding to either minDate or maxDate, before aggregating all the results similarly to how you are probably doing it now. Here's what specifically I'm talking about:WITH partitioned AS (
SELECT
Value,
Date,
GroupDate = DATEADD(HOUR, DATEDIFF(HOUR, 0, Date), 0)
FROM atable
)
, firstlast AS (
SELECT
Value,
Date,
GroupDate,
FirstValue = CASE Date WHEN MIN(Date) OVER (PARTITION BY GroupDate) THEN Value END,
LastValue = CASE Date WHEN MAX(Date) OVER (PARTITION BY GroupDate) THEN Value END
FROM partitioned
)
SELECT
min = MIN(Value),
max = MAX(Value),
avg = AVG(Value), -- or, again, if Value is an int, cast it as a decimal or float
diff = MAX(LastValue) - MIN(FirstValue),
Date = GroupDate
FROM firstlast
GROUP BY
GroupDate
;As you can see, the first common table expression (CTE) merely returns all rows and adds a calculated column
GroupDate, the one subsequently used for grouping/partitioning. So it essentially just assigns a name to the grouping expression, and that is done for better readability/maintainability of the entire query, as the column is later referenced more than once. This is what the first CTE produces:+-------+------------------+------------------+
| Value | Date | GroupDate |
+-------+------------------+------------------+
| 10 | 10/10/2010 10:00 | 10/10/2010 10:00 |
| 11 | 10/10/2010 10:15 | 10/10/2010 10:00 |
| 15 | 10/10/2010 10:30 | 10/10/2010 10:00 |
| 15 | 10/10/2010 10:45 | 10/10/2010 10:00 |
| 17 | 10/10/2010 11:00 | 10/10/2010 11:00 |
| 18 | 10/10/2010 11:15 | 10/10/2010 11:00 |
| 22 | 10/10/2010 11:30 | 10/10/2010 11:00 |
| 30 | 10/10/2010 11:45 | 10/10/2010 11:00 |
+-------+------------------+------------------+The second CTE adds two more columns to the above result. It uses window aggregate functions
MIN() OVER ... and MAX() OVER ... to match against Date, and where the match takes place, the corresponding Value is returned in a separate column, either FirstValue or LastValue:+-------+------------------+------------------+------------+-----------+
| Value | Date | GroupDate | FirstValue | LastValue |
+-------+------------------+------------------+------------+-----------+
| 10 | 10/10/2010 10:00 | 10/10/2010 10:00 | 10 | NULL |
| 11 | 10/10/2010 10:15 | 10/10/2010 10:00 | NULL | NULL |
| 15 | 10/10/2010 10:30 | 10/10/2010 10:00 | NULL | NULL |
| 15 | 10/10/2010 10:45 | 10/10/2010 10:00 | NULL | 15 |
| 17 | 10/10/2010 11:00 | 10/10/2010 11:00 | 17 | NULL |
| 18 | 10/10/2010 11:15 | 10/10/2010 11:00 | NULL | NULL |
| 22 | 10/10/2010 11:30 | 10/10/2010 11:00 | NULL | NULL |
| 30 | 10/10/2010 11:45 | 10/10/2010 11:00 | NULL | 30 |
+-------+------------------+------------------+------------+-----------+At this point,
Code Snippets
SELECT
min = MIN(Value),
max = MAX(Value),
avg = AVG(Value), -- or, if Value is an int, like this, perhaps:
-- AVG(CAST(Value AS decimal(10,2))
Date = DATEADD(HOUR, DATEDIFF(HOUR, 0, Date), 0)
FROM atable
GROUP BY
DATEADD(HOUR, DATEDIFF(HOUR, 0, Date), 0)
;SELECT
min = MIN(Value),
max = MAX(Value),
avg = AVG(Value),
minDate = MIN(Date),
maxDate = MAX(Date),
Date = DATEADD(HOUR, DATEDIFF(HOUR, 0, Date), 0)
FROM atable
GROUP BY
DATEADD(HOUR, DATEDIFF(HOUR, 0, Date), 0)
;SELECT
g.min,
g.max,
g.avg,
diff = last.Value - first.Value,
g.Date
FROM (
SELECT
min = MIN(Value),
max = MAX(Value),
avg = AVG(Value),
minDate = MIN(Date),
maxDate = MAX(Date),
Date = DATEADD(HOUR, DATEDIFF(HOUR, 0, Date), 0)
FROM atable
GROUP BY
DATEADD(HOUR, DATEDIFF(HOUR, 0, Date), 0)
) g
INNER JOIN atable first ON first.Date = g.minDate
INNER JOIN atable last ON last .Date = g.maxDate
;WITH partitioned AS (
SELECT
Value,
Date,
GroupDate = DATEADD(HOUR, DATEDIFF(HOUR, 0, Date), 0)
FROM atable
)
, firstlast AS (
SELECT
Value,
Date,
GroupDate,
FirstValue = CASE Date WHEN MIN(Date) OVER (PARTITION BY GroupDate) THEN Value END,
LastValue = CASE Date WHEN MAX(Date) OVER (PARTITION BY GroupDate) THEN Value END
FROM partitioned
)
SELECT
min = MIN(Value),
max = MAX(Value),
avg = AVG(Value), -- or, again, if Value is an int, cast it as a decimal or float
diff = MAX(LastValue) - MIN(FirstValue),
Date = GroupDate
FROM firstlast
GROUP BY
GroupDate
;+-------+------------------+------------------+
| Value | Date | GroupDate |
+-------+------------------+------------------+
| 10 | 10/10/2010 10:00 | 10/10/2010 10:00 |
| 11 | 10/10/2010 10:15 | 10/10/2010 10:00 |
| 15 | 10/10/2010 10:30 | 10/10/2010 10:00 |
| 15 | 10/10/2010 10:45 | 10/10/2010 10:00 |
| 17 | 10/10/2010 11:00 | 10/10/2010 11:00 |
| 18 | 10/10/2010 11:15 | 10/10/2010 11:00 |
| 22 | 10/10/2010 11:30 | 10/10/2010 11:00 |
| 30 | 10/10/2010 11:45 | 10/10/2010 11:00 |
+-------+------------------+------------------+Context
StackExchange Database Administrators Q#28482, answer score: 13
Revisions (0)
No revisions yet.