Get rows with most recent date for each different item
Problem
Let's say this is the sample data coming from a join of 2 tables. The database is Postgres 9.6.
id  product_id  invoice_id  amount  date
1   PROD1       INV01       2       01-01-2018
2   PROD2       INV02       3       01-01-2018
3   PROD1       INV01       2       05-01-2018
4   PROD1       INV03       1       05-01-2018
5   PROD2       INV02       3       08-01-2018
6   PROD2       INV04       4       08-01-2018
I want to know if it's possible, in an optimized way, to:
- Get all the PRODx rows, with their respective INVx, that have the latest date per product_id. Please note that records unused on one day may be carried over to a later day.
This means:
id  product_id  invoice_id  amount  date
3   PROD1       INV01       2       05-01-2018
4   PROD1       INV03       1       05-01-2018
5   PROD2       INV02       3       08-01-2018
6   PROD2       INV04       4       08-01-2018
- Get daily summed amounts for each PRODx, filling the gaps with the previous day's amount when a day has no rows.
This means:
product_id  amount  date
PROD1       2       01-01-2018
PROD2       3       01-01-2018
PROD1       2       02-01-2018
PROD2       3       02-01-2018
PROD1       2       03-01-2018
PROD2       3       03-01-2018
PROD1       2       04-01-2018
PROD2       3       04-01-2018
PROD1       3       05-01-2018
PROD2       3       05-01-2018
PROD1       3       06-01-2018
PROD2       3       06-01-2018
PROD1       3       07-01-2018
PROD2       3       07-01-2018
PROD1       3       08-01-2018
PROD2       7       08-01-2018
A few thoughts:
- For the first question, I could obtain the max(date) for each PRODx and then pick, for each PRODx, the rows whose date equals that max(date), but I was wondering if there's a faster way to obtain this given a large number of records in the database.
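For reference, the max(date) idea described above could be sketched roughly as follows. This is only an illustration under assumptions: the joined rows are inlined as a CTE named product with a date column named dt (names borrowed from the answer's queries, not from the question), and the sample dates are read as DD-MM-YYYY.

```sql
-- Sample rows from the question, inlined so the statement runs on its own.
-- The names "product" and "dt" are assumptions, not the asker's real schema.
with product (id, product_id, invoice_id, amount, dt) as (
    values (1, 'PROD1', 'INV01', 2, date '2018-01-01'),
           (2, 'PROD2', 'INV02', 3, date '2018-01-01'),
           (3, 'PROD1', 'INV01', 2, date '2018-01-05'),
           (4, 'PROD1', 'INV03', 1, date '2018-01-05'),
           (5, 'PROD2', 'INV02', 3, date '2018-01-08'),
           (6, 'PROD2', 'INV04', 4, date '2018-01-08')
)
select p.id, p.product_id, p.invoice_id, p.amount, p.dt
from product p
join (select product_id, max(dt) as max_dt   -- latest date per product
      from product
      group by product_id) m
  on m.product_id = p.product_id
 and m.max_dt     = p.dt                     -- keep only rows on that date
order by p.id;
```

On the sample data this keeps the rows with id 3, 4, 5 and 6. Whether it is fast enough on a large table would depend on indexing, for example an index on (product_id, dt), which is also an assumption here.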
Solution
Skinning Q#1 independently and slightly differently than @ypercube
with cte as (
    select row_number() over (partition by product_id, invoice_id
                              order by dt desc) as rn,
           product_id,
           invoice_id,
           amount,
           dt
    from product
)
select product_id, invoice_id, amount, dt
from cte
where rn = 1
order by product_id, invoice_id;
 product_id | invoice_id | amount |     dt
------------+------------+--------+------------
 PROD1      | INV01      |      2 | 2018-01-05
 PROD1      | INV03      |      1 | 2018-01-05
 PROD2      | INV02      |      3 | 2018-01-08
 PROD2      | INV04      |      4 | 2018-01-08
(4 rows)
For Q#2, you are on the right track, but the SQL will have a cross join (gasp!).
I think a function with a loop/cursor would perform better (I'll try that in my next free block of time).
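To make the cross join less scary, here is a minimal sketch of just the scaffold it builds: one row per calendar day per product, before any amounts are filled in. The data is inlined as a CTE named product with a dt column (both names are assumptions for illustration), and the explicit timestamp casts are my own choice to keep the generate_series call unambiguous.

```sql
-- Only the columns needed for the scaffold, inlined for a self-contained run.
with product (product_id, dt) as (
    values ('PROD1', date '2018-01-01'),
           ('PROD2', date '2018-01-01'),
           ('PROD1', date '2018-01-05'),
           ('PROD2', date '2018-01-08')
)
select g.ts::date as gdt,      -- every day between the first and last date
       p.product_id            -- paired with every known product
from generate_series((select min(dt) from product)::timestamp,
                     (select max(dt) from product)::timestamp,
                     interval '1 day') as g(ts)
cross join (select distinct product_id from product) p
order by gdt, p.product_id;
```

With the sample dates this yields 8 days times 2 products, so 16 rows, which is the shape of the expected Q#2 output. Filling each row with the latest known amount is a separate step.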
-- the cte will give us the real values
with cte as (
    select product_id,
           sum(amount) as amount,
           dt
    from product
    group by product_id, dt
)
select p.product_id,
       (select cte.amount              -- choose the amount
        from cte
        where cte.product_id = p.product_id
          and cte.dt <= d.gdt          -- for same day or earlier
        order by cte.dt desc
        limit 1) as finamt,
       d.gdt
from (select generate_series(
                 -- add a where clause here if some products don't have an amount
                 (select min(dt) from product),
                 (select max(dt) from product),
                 '1 day'
             )::date as gdt) d
cross join  -- assuming each listed product has an amount on the min date
     (select distinct product_id from product) p
left join   -- since we need to fill the gaps
     cte on (d.gdt = cte.dt
             and p.product_id = cte.product_id)
order by d.gdt, p.product_id
;
Context
StackExchange Database Administrators Q#190815, answer score: 16