snippetsqlMinor
How to turn a set of flat trees into a single tree with multiple leaves?
Viewed 0 times
turnwithintoleavestreesflatsinglemultiplehowset
Problem
We have this beautiful Postgres tree generator. Yet it kind of produces cuts of a tree not a whole tree all at once:
I want to get a single tree object out from it with an array like this:
Here is the generator:
```
CREATE TABLE items (
item_id serial PRIMARY KEY,
title text
);
CREATE TABLE joins (
id serial PRIMARY KEY,
item_id int,
child_id int
);
INSERT INTO items (item_id,title) VALUES
(1,'PARENT'),
(2,'LEVEL 2'),
(3,'LEVEL 3.1'),
(4,'LEVEL 4.1'),
(5,'LEVEL 4.2'),
(6,'LEVEL 3.2');
INSERT INTO joins (item_id, child_id) VALUES
(1,2),
(2,3),
(3,4),
(
item_id jsonb_pretty
1 {
"title": "PARENT",
"item_id": 1,
"children": {
"title": "LEVEL 2",
"item_id": 2,
"children": {
"title": "LEVEL 3.2",
"item_id": 6
}
}
}
1 {
"title": "PARENT",
"item_id": 1,
"children": {
"title": "LEVEL 2",
"item_id": 2,
"children": {
"title": "LEVEL 3.1",
"item_id": 3,
"children": {
"title": "LEVEL 4.1",
"item_id": 4
}
}
}
}
1 {
"title": "PARENT",
"item_id": 1,
"children": {
"title": "LEVEL 2",
"item_id": 2,
"children": {
"title": "LEVEL 3.1",
"item_id": 3,
"children": {
"title": "LEVEL 4.2",
"item_id": 5
}
}
}
}
I want to get a single tree object out from it with an array like this:
1 {
"title": "PARENT",
"item_id": 1,
"children": [{
"title": "LEVEL 2",
"item_id": 2,
"children": [{
"title": "LEVEL 3.2",
"item_id": 6
},
{
"title": "LEVEL 3.1",
"item_id": 3,
"children": [{
"title": "LEVEL 4.1",
"item_id": 4
},
{
"title": "LEVEL 4.2",
"item_id": 5
}]
}]
}]
}
Here is the generator:
```
CREATE TABLE items (
item_id serial PRIMARY KEY,
title text
);
CREATE TABLE joins (
id serial PRIMARY KEY,
item_id int,
child_id int
);
INSERT INTO items (item_id,title) VALUES
(1,'PARENT'),
(2,'LEVEL 2'),
(3,'LEVEL 3.1'),
(4,'LEVEL 4.1'),
(5,'LEVEL 4.2'),
(6,'LEVEL 3.2');
INSERT INTO joins (item_id, child_id) VALUES
(1,2),
(2,3),
(3,4),
(
Solution
A recursive CTE (rCTE) is the right tool to follow the path and identify all nodes of a tree.
But it does not allow aggregation in the recursive term. There is an elegant alternative:
Recursive function
db<>fiddle here
Call with default max levels (7):
Produces your desired result:
Call with just 3 levels:
Note the special output when the recursion is stopped prematurely:
The outer
You can wrap outer call into another function or cram it into the same, possibly with PL/pgSQL and switch with
This is actual recursion, while rCTE really just iterate. You need to understand the concept to understand the function: it calls itself. Decrementing the maximum
I built in special behavior for the last recursion level: in case the max level cuts off more children, the function inserts a count and a hint.
The (optional!) call with
Deletes all object fields that have null values from the given JSON value, recursively.
Removes empty children fields (with
Ideally, you have an index on
Related:
Plain SQL
While there is only a hand full of levels, you could also spell out a cascade of subqueries or non-recursive CTEs with aggregation.
The query gets rather verbose, but performance should still be good when compared to your original. Due in no small part to the fact that I reverted the initial approach to the query. Since you are interested in a given root ID, it should be much more efficient to start at the root instead of the leaves - sticking to the tree as metaphor. This way, we only ever select rows that are part of the tree.
The way you had it, all rows of the whole table(s) are being processed, only to filter the one tree stemming from your given root in the end. That's a lot of wasted effort.
Also, I join to
After selecting all nodes of the tree, the aggregation needs to be top-down again, from leaves to the root, since each lower level integrates the aggregated array from the next higher level.
Query for a maximum of 4 levels to cover your test data:
```
WITH RECURSIVE path AS (
SELECT ARRAY[item_id, child_id] AS path, child
But it does not allow aggregation in the recursive term. There is an elegant alternative:
Recursive function
CREATE OR REPLACE FUNCTION f_item_tree(_item_id int, _level int = 7)
RETURNS jsonb
LANGUAGE sql STABLE PARALLEL SAFE AS
$func$
SELECT CASE WHEN _level > 1
THEN jsonb_agg(sub)
ELSE CASE WHEN count(*) > 0 THEN to_jsonb(count(*) || ' - stopped recursion at max level') END
END
FROM (
SELECT i.*, CASE WHEN _level > 1 THEN f_item_tree(i.item_id, _level - 1) END AS children
FROM joins j
JOIN items i ON i.item_id = j.child_id
WHERE j.item_id = _item_id
ORDER BY i.item_id
) sub
$func$;db<>fiddle here
Call with default max levels (7):
SELECT jsonb_pretty(jsonb_strip_nulls(to_jsonb(sub))) AS tree
FROM (
SELECT *, f_item_tree(item_id) AS children
FROM items
WHERE item_id = 1 -- root item_id HERE
) sub;Produces your desired result:
{
"title": "PARENT",
"item_id": 1,
"children": [
{
"title": "LEVEL 2",
"item_id": 2,
"children": [
{
"title": "LEVEL 3.1",
"item_id": 3,
"children": [
{
"title": "LEVEL 4.1",
"item_id": 4
},
{
"title": "LEVEL 4.2",
"item_id": 5
}
]
},
{
"title": "LEVEL 3.2",
"item_id": 6
}
]
}
]
}
Call with just 3 levels:
SELECT jsonb_pretty(jsonb_strip_nulls(to_jsonb(sub))) AS tree
FROM (
SELECT *, f_item_tree(item_id, 3) AS children -- max level HERE
FROM items
WHERE item_id = 1 -- root item_id HERE
) sub;Note the special output when the recursion is stopped prematurely:
{
"title": "PARENT",
"item_id": 1,
"children": [
{
"title": "LEVEL 2",
"item_id": 2,
"children": [
{
"title": "LEVEL 3.1",
"item_id": 3,
"children": "2 - stopped recursion at max level"
},
{
"title": "LEVEL 3.2",
"item_id": 6
}
]
}
]
}
The outer
jsonb_pretty() is optional, of course.You can wrap outer call into another function or cram it into the same, possibly with PL/pgSQL and switch with
IF. That's a matter of taste and style. I kept it simple here.This is actual recursion, while rCTE really just iterate. You need to understand the concept to understand the function: it calls itself. Decrementing the maximum
_level with every call is what eventually terminates recursion.I built in special behavior for the last recursion level: in case the max level cuts off more children, the function inserts a count and a hint.
The (optional!) call with
jsonb_strip_nulls() ...Deletes all object fields that have null values from the given JSON value, recursively.
Removes empty children fields (with
NULL value). If there can be additional fields with NULL values you want to keep, consider jsonb_set_lax() in the function instead, like used below with the pure SQL solution.Ideally, you have an index on
joins(item_id, child_id) - or a UNIQUE constraint on (item_id, child_id) which comes with that index implicitly.Related:
- For each category, find the count of foreign-key items in all child categories using a PostgreSQL Recursive CTE
Plain SQL
While there is only a hand full of levels, you could also spell out a cascade of subqueries or non-recursive CTEs with aggregation.
The query gets rather verbose, but performance should still be good when compared to your original. Due in no small part to the fact that I reverted the initial approach to the query. Since you are interested in a given root ID, it should be much more efficient to start at the root instead of the leaves - sticking to the tree as metaphor. This way, we only ever select rows that are part of the tree.
The way you had it, all rows of the whole table(s) are being processed, only to filter the one tree stemming from your given root in the end. That's a lot of wasted effort.
Also, I join to
items once after having selected all relevant nodes, not every time in the recursive term like you had it.After selecting all nodes of the tree, the aggregation needs to be top-down again, from leaves to the root, since each lower level integrates the aggregated array from the next higher level.
Query for a maximum of 4 levels to cover your test data:
```
WITH RECURSIVE path AS (
SELECT ARRAY[item_id, child_id] AS path, child
Code Snippets
CREATE OR REPLACE FUNCTION f_item_tree(_item_id int, _level int = 7)
RETURNS jsonb
LANGUAGE sql STABLE PARALLEL SAFE AS
$func$
SELECT CASE WHEN _level > 1
THEN jsonb_agg(sub)
ELSE CASE WHEN count(*) > 0 THEN to_jsonb(count(*) || ' - stopped recursion at max level') END
END
FROM (
SELECT i.*, CASE WHEN _level > 1 THEN f_item_tree(i.item_id, _level - 1) END AS children
FROM joins j
JOIN items i ON i.item_id = j.child_id
WHERE j.item_id = _item_id
ORDER BY i.item_id
) sub
$func$;SELECT jsonb_pretty(jsonb_strip_nulls(to_jsonb(sub))) AS tree
FROM (
SELECT *, f_item_tree(item_id) AS children
FROM items
WHERE item_id = 1 -- root item_id HERE
) sub;SELECT jsonb_pretty(jsonb_strip_nulls(to_jsonb(sub))) AS tree
FROM (
SELECT *, f_item_tree(item_id, 3) AS children -- max level HERE
FROM items
WHERE item_id = 1 -- root item_id HERE
) sub;WITH RECURSIVE path AS (
SELECT ARRAY[item_id, child_id] AS path, child_id AS item_id, 2 AS lvl
FROM joins
WHERE item_id = 1 -- root item_id HERE
UNION ALL
SELECT p.path || j.child_id, j.child_id, p.lvl + 1
FROM path p
JOIN joins j ON j.item_id = p.item_id
WHERE p.lvl < 7 -- HARD limit
)
, tree AS (
SELECT p.*, to_jsonb(i) AS item
FROM path p
JOIN items i USING (item_id)
ORDER BY lvl, item_id
)
, lvl4 AS ( -- leaf!
SELECT path[3] AS item_id, jsonb_agg(item) AS children
FROM tree
WHERE lvl = 4
GROUP BY 1
)
, lvl3 AS (
SELECT path[2] AS item_id
, jsonb_agg(jsonb_set_lax(item, '{children}', children, true, 'return_target')) AS children
FROM tree t
LEFT JOIN lvl4 USING (item_id)
WHERE lvl = 3
GROUP BY 1
)
, lvl2 AS ( -- stem!
SELECT jsonb_agg(jsonb_set_lax(item, '{children}', children, true, 'return_target')) AS children
FROM tree t
LEFT JOIN lvl3 USING (item_id)
WHERE lvl = 2
)
SELECT i.item_id
, jsonb_pretty(jsonb_set_lax(to_jsonb(i), '{children}', children, true, 'return_target')) AS tree
FROM items i
LEFT JOIN lvl2 ON true
WHERE item_id = 1; -- root item_id HEREContext
StackExchange Database Administrators Q#292572, answer score: 6
Revisions (0)
No revisions yet.