patternsqlMinor
Grouping by sequential relationships with Postgres
Viewed 0 times
postgresgroupingwithsequentialrelationships
Problem
This is quite similar to this question about continuous ranges, and this question about grouping by sequential numbers, but differs in that the sequences are not numeric. Given the following relationships as key pairs
This is the table with example data (also on SQLFiddle):
What is the most efficient way to produce an output that looks like this:
or this:
I have thought about doing this operation outside of Postgres, but it seems like there must be a way to achieve this result with either a window function or PL/pgSQL.
a -- b -- c e -- f -- g
| /
| /
dThis is the table with example data (also on SQLFiddle):
CREATE TABLE relationships (
name varchar(1),
related varchar(1)
);
INSERT INTO relationships (name, related) VALUES
('a', 'a'),
('a', 'b'),
('b', 'b'),
('b', 'a'),
('b', 'c'),
('b', 'd'),
('c', 'c'),
('c', 'b'),
('c', 'd'),
('d', 'd'),
('d', 'c'),
('d', 'b'),
('e', 'e'),
('e', 'f'),
('f', 'f'),
('f', 'e'),
('f', 'g'),
('g', 'g');What is the most efficient way to produce an output that looks like this:
| group | members |
------------------------|
| 1 | {a, b, c, d} |
| 2 | {e, f, g} |or this:
| name | group |
-----------------
| a | 1 |
| b | 1 |
| c | 1 |
| d | 1 |
| e | 2 |
| f | 2 |
| g | 2 |I have thought about doing this operation outside of Postgres, but it seems like there must be a way to achieve this result with either a window function or PL/pgSQL.
Solution
Ugly, but it works!
First, define
and then run the query
Which produces the following:
SQLFiddle here - http://sqlfiddle.com/#!15/c8a5b/20. I'm a novice with this kind of query, so please let me know if there is a more efficient way to do this!
First, define
array_agg_mult from this questionCREATE AGGREGATE array_agg_mult (anyarray) (
SFUNC = array_cat
,STYPE = anyarray
,INITCOND = '{}'
);and then run the query
WITH summary AS (
SELECT name, array_agg(related) AS touches
FROM relationships
GROUP BY name
),
grouped AS (
SELECT name, (
SELECT array_agg(uniques) FROM (
select distinct unnest(array_agg_mult(sub.touches)) AS uniques
ORDER BY uniques
) x
) my_group
FROM summary LEFT JOIN LATERAL (
SELECT touches
FROM summary r
WHERE summary.touches && r.touches
GROUP BY name, touches
) sub ON true
GROUP BY summary.name
ORDER BY summary.name
)
SELECT DISTINCT my_group, row_number() over() as group_id
FROM grouped
GROUP BY my_group;Which produces the following:
| my_group | group_id |
| {a,b,c,d} | 2 |
| {e,f,g} | 1 |SQLFiddle here - http://sqlfiddle.com/#!15/c8a5b/20. I'm a novice with this kind of query, so please let me know if there is a more efficient way to do this!
Code Snippets
CREATE AGGREGATE array_agg_mult (anyarray) (
SFUNC = array_cat
,STYPE = anyarray
,INITCOND = '{}'
);WITH summary AS (
SELECT name, array_agg(related) AS touches
FROM relationships
GROUP BY name
),
grouped AS (
SELECT name, (
SELECT array_agg(uniques) FROM (
select distinct unnest(array_agg_mult(sub.touches)) AS uniques
ORDER BY uniques
) x
) my_group
FROM summary LEFT JOIN LATERAL (
SELECT touches
FROM summary r
WHERE summary.touches && r.touches
GROUP BY name, touches
) sub ON true
GROUP BY summary.name
ORDER BY summary.name
)
SELECT DISTINCT my_group, row_number() over() as group_id
FROM grouped
GROUP BY my_group;| my_group | group_id |
| {a,b,c,d} | 2 |
| {e,f,g} | 1 |Context
StackExchange Database Administrators Q#107190, answer score: 2
Revisions (0)
No revisions yet.