patternsqlMinor
Grouping categorical values in a sequence
Viewed 0 times
categoricalvaluessequencegrouping
Problem
I have a table with categorical values in a sequence, as such:
And I want to group them by the name column, getting the minimum value_1 and maximum value_2, however group them as the names appear in unbroken sequence, as such:
However, doing normal GROUP BY selection returns all the entries for a particular name together, regardless of whether they appear together or not:
Now, I could probably do this with plpgsql, using traditional programming logic with variables, loops and such, but I was wondering if there is a more SQL-flavoured approach to this.
Software: PostgreSQL 9.6.3
CREATE TABLE cat (name, v1, v2)
AS VALUES
('John', 1::int, 3::int),
('John', 3, 4),
('John', 4, 9),
('Mike', 9, 11),
('Mike', 11, 17),
('John', 17, 24),
('John', 24, 25),
('Dave', 25, 29);And I want to group them by the name column, getting the minimum value_1 and maximum value_2, however group them as the names appear in unbroken sequence, as such:
+-----------+---------+---------+
| name | value_1 | value_2 |
+-----------+---------+---------+
| John | 1 | 9 |
| Mike | 9 | 17 |
| John | 17 | 25 |
| Dave | 25 | 29 |
+-----------+---------+---------+However, doing normal GROUP BY selection returns all the entries for a particular name together, regardless of whether they appear together or not:
SELECT name, MIN(value_1), MAX(value_2)
FROM table
GROUP BY name;
+-----------+---------+---------+
| name | value_1 | value_2 |
+-----------+---------+---------+
| John | 1 | 25 |
| Mike | 9 | 17 |
| Dave | 25 | 29 |
+-----------+---------+---------+Now, I could probably do this with plpgsql, using traditional programming logic with variables, loops and such, but I was wondering if there is a more SQL-flavoured approach to this.
Software: PostgreSQL 9.6.3
Solution
Using
Then use this groups to get
name | v1 | v2
:--- | -: | -:
John | 1 | 9
Mike | 9 | 17
John | 17 | 25
Dave | 25 | 29
dbfiddle here
value_1 to order the row, you can make groups every time name changes.Then use this groups to get
max and min values.select name, min(v1) v1, max(v2) v2
from (select name, v1, v2,
sum(rst) over (order by v1) grp
from (select name, v1, v2,
case when coalesce(lag(name) over (order by v1), '') <> name then 1 end rst
from cat
) y
) z
group by name, grp
order by v1;name | v1 | v2
:--- | -: | -:
John | 1 | 9
Mike | 9 | 17
John | 17 | 25
Dave | 25 | 29
dbfiddle here
Code Snippets
select name, min(v1) v1, max(v2) v2
from (select name, v1, v2,
sum(rst) over (order by v1) grp
from (select name, v1, v2,
case when coalesce(lag(name) over (order by v1), '') <> name then 1 end rst
from cat
) y
) z
group by name, grp
order by v1;Context
StackExchange Database Administrators Q#197111, answer score: 4
Revisions (0)
No revisions yet.