gotchasqlMinor
How to calculate the cumulative difference between rows in Postgres?
Viewed 0 times
rowsthepostgrescumulativedifferencebetweencalculatehow
Problem
I am in need of calculating the difference from the cumulative values for the previous day in Postgres based on session_id. Example as mentioned below:
Expected Results:
```
Date session_id upload_usage download_usage total_usage_on_a_day Extected_difference
10/21/2014 0007994b 37578561 6800209 44378770 44378770
10/22/2014 0007994b 218113296 85272007 303385303 259006533
10/23/2014 0007994b 552228616 252390680 804619296 501233993
10/24/2014 0007994b 799772020 391196041 1190968061 386348765
10/25/2014 0007994b 1047233978 529908804 1577142782 386174721
10/26/2014 0007994b 1294608258 668515778 1963124036 385981254
10/27/2014 0007994b 1066656794 557318645 2573613674 610489638
10/27/2014 00079e4e 12949219 7265243 20214462 20214462
10/28/2014 00079e4e 203871297 114308
CREATE TEMP TABLE foo AS
SELECT date::date, session_id, upload_usage, download_usage, total_usage_on_a_day
FROM ( VALUES
( '10/21/2014', '0007994b', 37578561 , 6800209 , 44378770 ),
( '10/22/2014', '0007994b', 218113296 , 85272007 , 303385303 ),
( '10/23/2014', '0007994b', 552228616 , 252390680 , 804619296 ) ,
( '10/24/2014', '0007994b', 799772020 , 391196041 , 1190968061 ),
( '10/25/2014', '0007994b', 1047233978 , 529908804 , 1577142782 ),
( '10/26/2014', '0007994b', 1294608258 , 668515778 , 1963124036 ),
( '10/27/2014', '0007994b', 1066656794 , 557318645 , 2573613674 ),
( '10/27/2014', '00079e4e', 12949219 , 7265243 , 20214462 ),
( '10/28/2014', '00079e4e', 203871297 , 114308478 , 318179775 ),
( '10/29/2014', '00079e4e', 445466682 , 251486943 , 696953625 ),
( '10/30/2014', '00079e4e', 183499477 , 109643736 , 893143213 )
) AS t( date, session_id, upload_usage, download_usage, total_usage_on_a_day );Expected Results:
```
Date session_id upload_usage download_usage total_usage_on_a_day Extected_difference
10/21/2014 0007994b 37578561 6800209 44378770 44378770
10/22/2014 0007994b 218113296 85272007 303385303 259006533
10/23/2014 0007994b 552228616 252390680 804619296 501233993
10/24/2014 0007994b 799772020 391196041 1190968061 386348765
10/25/2014 0007994b 1047233978 529908804 1577142782 386174721
10/26/2014 0007994b 1294608258 668515778 1963124036 385981254
10/27/2014 0007994b 1066656794 557318645 2573613674 610489638
10/27/2014 00079e4e 12949219 7265243 20214462 20214462
10/28/2014 00079e4e 203871297 114308
Solution
You can use
LAG() function in a query with GROUP BY just like any other query. The only difference is that the columns allowed in the window (OVER) and in the LAG are the ones allowed in SELECT after a GROUP BY:select
date,
session_id,
sum(upload_usage) as upload,
sum(download_usage) as download,
sum(total_usage_on_a_day) as total_as_cumm,
sum(total_usage_on_a_day)
- coalesce(lag(sum(total_usage_on_a_day)) over (partition by session_id order by date), 0)
as expected_difference
from jiodba.s_crc_zda_mon_conn_usage
where gpart = '1100043958'
and zzaccess_ntwk_id = 'FTTH'
group by session_id, date
order by session_id, date
limit 100 ;- I'm not sure why you used
max(usage)while you name the columntotal_usage. I changed it to usesum()instead. If you need to usemax(usage)there, pick a more appropriate name for that column.
Code Snippets
select
date,
session_id,
sum(upload_usage) as upload,
sum(download_usage) as download,
sum(total_usage_on_a_day) as total_as_cumm,
sum(total_usage_on_a_day)
- coalesce(lag(sum(total_usage_on_a_day)) over (partition by session_id order by date), 0)
as expected_difference
from jiodba.s_crc_zda_mon_conn_usage
where gpart = '1100043958'
and zzaccess_ntwk_id = 'FTTH'
group by session_id, date
order by session_id, date
limit 100 ;Context
StackExchange Database Administrators Q#84505, answer score: 6
Revisions (0)
No revisions yet.