patternsqlModerate
Calculate rolling sum over 7 consecutive day period on PostgreSQL
Viewed 0 times
postgresqlperiodrollingdaycalculatesumoverconsecutive
Problem
I need to get the rolling sum over a period of 7 days for each row (1 row per day).
For example:
I need this in one query that returns the rows with 7-day rolling sum and the date of the last day of the range of the sum. For example, day=2016-02-10, sum 17.
So far I have this but it's not fully working:
I am using PostgreSQL 9.4.5. There could be multiple rows with the same date. If there is a gap (a day is missing), the 7-consecutive-day range will still be followed.
For example:
| Date | Count | 7-Day Rolling Sum |
------------------------------------------
| 2016-02-01 | 1 | 1
| 2016-02-02 | 1 | 2
| 2016-02-03 | 2 | 4
| 2016-02-04 | 2 | 6
| 2016-02-05 | 2 | 8
| 2016-02-06 | 2 | 10
| 2016-02-07 | 2 | 12
| 2016-02-08 | 2 | 13 --> here we start summing from 02-02
| 2016-02-09 | 2 | 14 --> here we start summing from 02-03
| 2016-02-10 | 5 | 17 --> here we start summing from 02-04
I need this in one query that returns the rows with 7-day rolling sum and the date of the last day of the range of the sum. For example, day=2016-02-10, sum 17.
So far I have this but it's not fully working:
DO
$do$
DECLARE
curr_date date;
num bigint;
BEGIN
FOR curr_date IN (SELECT date_trunc('day', d)::date FROM generate_series(CURRENT_DATE-31, CURRENT_DATE-1, '1 day'::interval) d)
LOOP
SELECT curr_date, SUM(count)
FROM generate_series (curr_date-8, curr_date-1, '1 day'::interval) d
LEFT JOIN m.ping AS p ON p.date = d
LEFT JOIN m.ping_type AS pt ON pt.id = p.ping_type_id
LEFT JOIN m.ping_frequency AS pf ON pf.id = p.ping_frequency_id
WHERE
pt.url_slug = 'active' AND
pf.url_slug = 'weekly';
END LOOP;
END
$do$;I am using PostgreSQL 9.4.5. There could be multiple rows with the same date. If there is a gap (a day is missing), the 7-consecutive-day range will still be followed.
Solution
By far the cleanest solution is to use window function
The important part is to generate the timeframe in
Example
For example, if I create some test data with 20 records in last 14 days:
And also add a value before that:
Then use the query above:
And get the results for whole month:
sum with rows between:with days as (
SELECT date_trunc('day', d)::date as day
FROM generate_series(CURRENT_DATE-31, CURRENT_DATE-1, '1 day'::interval) d ),
counts as (
select
days.day,
sum((random()*5)::integer) num
FROM days
-- left join other tables here to get counts, I'm using random
group by days.day
)
select
day,
num,
sum(num) over (order by day ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
from counts
order by day;The important part is to generate the timeframe in
days CTE and join onto it in order not to miss any days for which there is no data.Example
For example, if I create some test data with 20 records in last 14 days:
SELECT (current_date - ((random()*14)::integer::text || 'days')::interval)::date as day, (random()*7)::integer as num
into test_data from generate_series(1, 20);;And also add a value before that:
insert into test_data values ((current_date - '25 days'::interval), 5);Then use the query above:
with days as (
SELECT date_trunc('day', d)::date as day
FROM generate_series(CURRENT_DATE-31, CURRENT_DATE-1, '1 day'::interval) d ),
counts as (
select
days.day,
sum(t.num) num
FROM days
left join test_data t on t.day = days.day
group by days.day
)
select
day,
num,
sum(num) over (order by day rows between 6 preceding and current row)
from counts
order by day;And get the results for whole month:
day | num | sum
------------+-----+-----
2016-01-31 | |
2016-02-01 | |
2016-02-02 | |
2016-02-03 | |
2016-02-04 | |
2016-02-05 | |
2016-02-06 | 5 | 5
2016-02-07 | | 5
2016-02-08 | | 5
2016-02-09 | | 5
2016-02-10 | | 5
2016-02-11 | | 5
2016-02-12 | | 5
2016-02-13 | |
2016-02-14 | |
2016-02-15 | |
2016-02-16 | |
2016-02-17 | |
2016-02-18 | 2 | 2
2016-02-19 | 5 | 7
2016-02-20 | | 7
2016-02-21 | 4 | 11
2016-02-22 | 15 | 26
2016-02-23 | 1 | 27
2016-02-24 | 1 | 28
2016-02-25 | 2 | 28
2016-02-26 | 4 | 27
2016-02-27 | 9 | 36
2016-02-28 | 5 | 37
2016-02-29 | 11 | 33
2016-03-01 | 5 | 37
(31 rows)Code Snippets
with days as (
SELECT date_trunc('day', d)::date as day
FROM generate_series(CURRENT_DATE-31, CURRENT_DATE-1, '1 day'::interval) d ),
counts as (
select
days.day,
sum((random()*5)::integer) num
FROM days
-- left join other tables here to get counts, I'm using random
group by days.day
)
select
day,
num,
sum(num) over (order by day ROWS BETWEEN 6 PRECEDING AND CURRENT ROW)
from counts
order by day;SELECT (current_date - ((random()*14)::integer::text || 'days')::interval)::date as day, (random()*7)::integer as num
into test_data from generate_series(1, 20);;insert into test_data values ((current_date - '25 days'::interval), 5);with days as (
SELECT date_trunc('day', d)::date as day
FROM generate_series(CURRENT_DATE-31, CURRENT_DATE-1, '1 day'::interval) d ),
counts as (
select
days.day,
sum(t.num) num
FROM days
left join test_data t on t.day = days.day
group by days.day
)
select
day,
num,
sum(num) over (order by day rows between 6 preceding and current row)
from counts
order by day;day | num | sum
------------+-----+-----
2016-01-31 | |
2016-02-01 | |
2016-02-02 | |
2016-02-03 | |
2016-02-04 | |
2016-02-05 | |
2016-02-06 | 5 | 5
2016-02-07 | | 5
2016-02-08 | | 5
2016-02-09 | | 5
2016-02-10 | | 5
2016-02-11 | | 5
2016-02-12 | | 5
2016-02-13 | |
2016-02-14 | |
2016-02-15 | |
2016-02-16 | |
2016-02-17 | |
2016-02-18 | 2 | 2
2016-02-19 | 5 | 7
2016-02-20 | | 7
2016-02-21 | 4 | 11
2016-02-22 | 15 | 26
2016-02-23 | 1 | 27
2016-02-24 | 1 | 28
2016-02-25 | 2 | 28
2016-02-26 | 4 | 27
2016-02-27 | 9 | 36
2016-02-28 | 5 | 37
2016-02-29 | 11 | 33
2016-03-01 | 5 | 37
(31 rows)Context
StackExchange Database Administrators Q#130949, answer score: 16
Revisions (0)
No revisions yet.