HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

Build JSON object from one-to-many relationship data in a single query?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
fromqueryonesinglemanyjsonobjectdatabuildrelationship

Problem

I have a PostgreSQL 9.5.3 DB with tables like this:

container
    id: uuid (pk)
    ... other data

thing
    id: uuid (pk)
    ... other data

container_thing
    container_id: uuid (fk)
    thing_id: uuid (fk)
    primary key (container_id, thing_id)


A container can point to any number of things (without duplicates), and a thing can be pointed to by any number of containers.

There may be a large number of containers and things (depends on how many customers we have). Each container would likely only have 1 to 10 things in it. We'll only query for around 20 containers at once, max. A container can be empty, and I need to get an empty array back.

I need to build json objects that represent containers, something like this:

{
    "id": "d7e1bc6b-b659-432d-b346-29f3a530bfa9",
    ... other data
    "thingIds": [
        "4e3ad81b-f2b5-4220-8e0e-e9d53c80a214",
        "f26f49e5-76b4-4363-9ffe-9654ba0b0f0d"
    ]
}


This is working fine, but I'm doing it by using two queries:

select * from "container" where "id" in ()
select * from "container_thing" where "container_id" in ()


I then procedurally build the "thingIds" array for each container.

Later I found a solution with correlated subqueries that works nicely for me at the moment.

select *, array(select thing_id from container_thing where container_id = c.id) as "thingIds"
from container c;


Is there a better way to do this, perhaps using a join somehow?

It seems like they always generate a single set of rows, which would mean duplicating the container data for every thing that's being pointed at.

Solution

If containers can be empty, the currently accepted solution does not work for you. It has to be an outer join to preserve rows without match - to get equivalent results to the correlated subqueries you are using in your fiddle:

select *, array(select thing_id from container_thing where container_id = container.id) as "thingIds"
from container


  1. Simple query



SELECT to_json(sub.*) AS container_with_things
FROM  (
   SELECT c.*, json_agg(thing_id) AS "thingIds"
   FROM   container c
   LEFT   JOIN container_thing ct ON  ct.container_id = c.id
   WHERE  c.id IN ()
   GROUP  BY c.id
   ) sub;


  1. Aggregate before join



With more than a few rows per container (you mentioned 20) it's typically faster to aggregate before you join:

SELECT to_json(sub.*) AS container_with_things
FROM  (
   SELECT c.*, ct."thingIds"
   FROM   container c
   LEFT   JOIN (
      SELECT container_id AS id, json_agg(thing_id) AS "thingIds"
      FROM   container_thing
      WHERE  container_id IN () -- repeat condition
      GROUP  BY 1
      ) ct USING (id)
   WHERE  c.id IN ()
   ) sub;


  1. LEFT JOIN LATERAL with ARRAY constructor



Combine an ARRAY constructor with LEFT JOIN LATERAL:

SELECT to_json(sub.*) AS container_with_things
FROM  (
   SELECT c.*, ct."thingIds"
   FROM   container c
   LEFT   JOIN LATERAL (
      SELECT ARRAY (
         SELECT thing_id
         FROM   container_thing
         WHERE  container_id = c.id
         -- ORDER  BY thing_id  -- optional order for deterministic results
         ) AS "thingIds"
      ) ct ON true
   WHERE  c.id IN ()
   ) sub;


Might be faster, yet. See:

  • What is the difference between LATERAL and a subquery in PostgreSQL?



The result for empty containers is subtly different in the three queries above:

  • "thingIds":[null]



  • "thingIds":null



  • "thingIds":[]



  1. jsonb_set() in Postgres 9.5



In Postgres 9.5 or later you can also work with jsonb and jsonb_set() - and one less subquery:

SELECT jsonb_set(to_jsonb(c.*), '{thingIds}', COALESCE(ct."thingIds", '[]')) AS container_with_things
FROM   container c
LEFT   JOIN (
   SELECT container_id AS id, jsonb_agg(thing_id) AS "thingIds"
   FROM   container_thing
   WHERE  container_id IN () -- repeat condition
   GROUP  BY 1
   ) ct USING (id)
WHERE  c.id IN ();


Alternatively:

SELECT to_jsonb(c) || jsonb_build_object('thingIds', "thingIds") AS container_with_things
FROM   ...


Either way, the whole container_with_things is set to NULL if "thingIds" is NULL, so throw in COALESCE to prevent that.
  1. jsonb_set_lax() in Postgres 13



jsonb_set_lax() has ways to deal with NULL:

SELECT jsonb_set_lax(to_jsonb(c.*), '{thingIds}', ct."thingIds") AS container_with_things
FROM   container c
LEFT   JOIN (
   SELECT ct.container_id AS id, jsonb_agg(ct.thing_id) AS "thingIds"
   FROM   container_thing ct
   WHERE  ct.container_id IN ('1','3','4')  -- repeat condition
   GROUP  BY 1
   ) ct USING (id)
WHERE  c.id IN ('1','3','4');


By default a JSON NULL is used as value, but there are other options. Read the manual.

db<>fiddle here

Old sqlfiddle

Code Snippets

select *, array(select thing_id from container_thing where container_id = container.id) as "thingIds"
from container
SELECT to_json(sub.*) AS container_with_things
FROM  (
   SELECT c.*, json_agg(thing_id) AS "thingIds"
   FROM   container c
   LEFT   JOIN container_thing ct ON  ct.container_id = c.id
   WHERE  c.id IN (<list of container ids>)
   GROUP  BY c.id
   ) sub;
SELECT to_json(sub.*) AS container_with_things
FROM  (
   SELECT c.*, ct."thingIds"
   FROM   container c
   LEFT   JOIN (
      SELECT container_id AS id, json_agg(thing_id) AS "thingIds"
      FROM   container_thing
      WHERE  container_id IN (<list of container ids>) -- repeat condition
      GROUP  BY 1
      ) ct USING (id)
   WHERE  c.id IN (<list of container ids>)
   ) sub;
SELECT to_json(sub.*) AS container_with_things
FROM  (
   SELECT c.*, ct."thingIds"
   FROM   container c
   LEFT   JOIN LATERAL (
      SELECT ARRAY (
         SELECT thing_id
         FROM   container_thing
         WHERE  container_id = c.id
         -- ORDER  BY thing_id  -- optional order for deterministic results
         ) AS "thingIds"
      ) ct ON true
   WHERE  c.id IN (<list of container ids>)
   ) sub;
SELECT jsonb_set(to_jsonb(c.*), '{thingIds}', COALESCE(ct."thingIds", '[]')) AS container_with_things
FROM   container c
LEFT   JOIN (
   SELECT container_id AS id, jsonb_agg(thing_id) AS "thingIds"
   FROM   container_thing
   WHERE  container_id IN (<list of container ids>) -- repeat condition
   GROUP  BY 1
   ) ct USING (id)
WHERE  c.id IN (<list of container ids>);

Context

StackExchange Database Administrators Q#144102, answer score: 19

Revisions (0)

No revisions yet.