Removing duplicate entries for each hour
Problem
I recently asked the same question for MySQL; the database has since been migrated to Postgres.
Old question.
To sum it up:
I have a table with an id (type UUID), a timestamp, and some text (row3). How do I retain only one entry per hour for each distinct value of row3?
What I tried is this:
INSERT INTO log_table
SELECT * FROM table1
WHERE id IN (
SELECT DISTINCT ON(id) id
FROM table1
GROUP BY row3, EXTRACT(HOUR FROM "time"), id
);
TRUNCATE table1;
But this, unfortunately, inserts every row from table1.
Solution
Generic case
Say you want to retain the rows that have the max(id) for every hour and for every distinct row3 value (but see the UUID section below). You can get the IDs of these rows with a query like
SELECT max(id)
FROM table1
GROUP BY row3, EXTRACT(hour FROM "time")
;
(EXTRACT(hour FROM ...) returns only the hour of the day, so this omits the date and puts the same hour on different days into one group, which may or may not be what you want. If it is not, use date_trunc('hour', "time") instead.)
Then you can use the above query to populate a table just as you did in your question, or delete everything else from table1 like
DELETE FROM table1
WHERE id NOT IN (
[the query above comes here]
)
;
UUID
There is no max() for UUID. Therefore another solution is needed in your case, like this one:
SELECT DISTINCT first_value(id) OVER w
FROM table1
WINDOW w AS (PARTITION BY row3, EXTRACT(hour FROM "time")
             RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
;
This one applies a window function, and the grouping is achieved by the PARTITION BY clause and DISTINCT.
(Acknowledgment: thanks to AndriyM for the idea!)
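As a sketch of how the pieces above might fit together for a UUID id, the following uses PostgreSQL's DISTINCT ON in place of the window function. This is a swapped-in technique, not the query from the answer. It assumes that id is never NULL (for example, a primary key), that you want to keep the earliest row in each group, and that grouping should respect the date, hence date_trunc. The table definition and sample rows are hypothetical and only there to make the example runnable.

```sql
-- Hypothetical sample table mirroring the columns described in the question.
CREATE TEMP TABLE table1 (
    id     uuid PRIMARY KEY,
    "time" timestamp NOT NULL,
    row3   text NOT NULL
);

INSERT INTO table1 (id, "time", row3) VALUES
    ('00000000-0000-0000-0000-000000000001', '2012-12-01 10:05', 'a'),
    ('00000000-0000-0000-0000-000000000002', '2012-12-01 10:40', 'a'),  -- same hour and row3 as the row above
    ('00000000-0000-0000-0000-000000000003', '2012-12-01 11:10', 'a'),
    ('00000000-0000-0000-0000-000000000004', '2012-12-01 10:20', 'b'),
    ('00000000-0000-0000-0000-000000000005', '2012-12-02 10:30', 'b');  -- same hour of day, different date

-- Keep one row per (row3, calendar hour): the earliest by "time".
-- DISTINCT ON keeps the first row of each group in ORDER BY order,
-- so the leading ORDER BY expressions must match the DISTINCT ON list.
DELETE FROM table1
WHERE id NOT IN (
    SELECT DISTINCT ON (row3, date_trunc('hour', "time")) id
    FROM table1
    ORDER BY row3, date_trunc('hour', "time"), "time"
);

-- Inspect what is left.
SELECT id, "time", row3
FROM table1
ORDER BY row3, "time";
```

With this sample data, only the row ending in ...0002 shares both row3 and calendar hour with an earlier row, so it is the one deleted. Replacing date_trunc('hour', "time") with EXTRACT(hour FROM "time") in both places would group by hour of day instead, as in the queries above.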
Code Snippets
SELECT max(id)
FROM table1
GROUP BY row3, EXTRACT(hour FROM "time")
;

DELETE FROM table1
WHERE id NOT IN (
[the query above comes here]
)
;

SELECT DISTINCT first_value(id) OVER w
FROM table1
WINDOW w AS (PARTITION BY row3, EXTRACT(hour FROM "time")
             RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
;
Context
StackExchange Database Administrators Q#29854, answer score: 5