patternsqlMinor
Group by range of years
Viewed 0 times
rangegroupyears
Problem
I've got a big table (~9M rows) and want to group the rows on a field containing the year. So far that's pretty easy:
We defined some irregular time periods spanning multiple years:
I've got no clue on how to group these results in the group by clause. I could make subquery's for every time period.
But that feels not right and I'm wondering if it was possible to let Postgresql do it. Especially because the query is a strong simplification of the real query: it has multiple conditions, amongst them a ST_within clause spanning four tables. So choosing the subquery-approach results in a bloated query.
Is there a better way to create this result?
// greatly simplified:
SELECT count(*), year FROM dataset GROUP BY year ORDER BY 2;We defined some irregular time periods spanning multiple years:
2005I've got no clue on how to group these results in the group by clause. I could make subquery's for every time period.
SELECT
( SELECT count(*) FROM dataset WHERE year <= 1945 AND ...... ) AS pre1945,
( ....) AS period2,
....
FROM datasetBut that feels not right and I'm wondering if it was possible to let Postgresql do it. Especially because the query is a strong simplification of the real query: it has multiple conditions, amongst them a ST_within clause spanning four tables. So choosing the subquery-approach results in a bloated query.
Is there a better way to create this result?
Solution
Use conditional counting:
This works because
With the upcoming 9.4 version you can re-write this as
select count(case when year <= 1945 then 1 end) as pre1945,
count(case when year between 1946 and 1964 then 1 end) as period2,
count(case when year between 1965 and 1974 then 1 end) as period3,
...
from ...
where ...;This works because
count() ignores null values and the case statement returns a null for values outside of the range it tests for (an else null is implicit).With the upcoming 9.4 version you can re-write this as
select count(*) filter (where year <= 1945) as pre1945,
count(*) filter (where year between 1946 and 1964) as period2,
count(*) filter (where year between 1965 and 1974) as period3,
...
from ...
where ...;Code Snippets
select count(case when year <= 1945 then 1 end) as pre1945,
count(case when year between 1946 and 1964 then 1 end) as period2,
count(case when year between 1965 and 1974 then 1 end) as period3,
...
from ...
where ...;select count(*) filter (where year <= 1945) as pre1945,
count(*) filter (where year between 1946 and 1964) as period2,
count(*) filter (where year between 1965 and 1974) as period3,
...
from ...
where ...;Context
StackExchange Database Administrators Q#86274, answer score: 8
Revisions (0)
No revisions yet.