patternsqlMinor
Conditional INSERT with a nested CTE?
Viewed 0 times
insertwithcteconditionalnested
Problem
I am trying to figure out if there is a way I can make nested CTEs work for this particular case.
Consider the following (highly contrived) scenario which is based on the actual application: there's a single column table of employee id's. Then there's an employee property table with all the details. (The main reason behind the single col table is often as a matter of course new employee id's need to be created in lots and assigned before any details of the actual staff is known.)
Now to the task at hand, we are inserting details (i.e. the name) for a new employee, but first we need to check if an employee with that name already exists. If it does, we'll simply return the id, and if not, we'll create a new employee record and then insert the details, finally returning the newly created id.
To recreate this test scenario:
The query I am trying to hammer into shape looks like shown below.
If I replace the nested CTE with an already created id (executing the nested CTE separately), it works (but can result in the creation of a superfluous id). It is also possible to simply move the nested CTE to the top level (so the whole thing will look like
Consider the following (highly contrived) scenario which is based on the actual application: there's a single column table of employee id's. Then there's an employee property table with all the details. (The main reason behind the single col table is often as a matter of course new employee id's need to be created in lots and assigned before any details of the actual staff is known.)
Now to the task at hand, we are inserting details (i.e. the name) for a new employee, but first we need to check if an employee with that name already exists. If it does, we'll simply return the id, and if not, we'll create a new employee record and then insert the details, finally returning the newly created id.
To recreate this test scenario:
CREATE TABLE public.employee (
id text DEFAULT gen_random_uuid(),
PRIMARY KEY (id)
);
CREATE TABLE public.employee_details (
employee_id text,
name text,
PRIMARY KEY (employee_id),
FOREIGN KEY (employee_id) REFERENCES public.employee(id)
);The query I am trying to hammer into shape looks like shown below.
with
e as
(select name, employee_id from employee_details where name = 'jack bauer'),
i as (insert into employee_details (name, employee_id)
select 'jack bauer',
(with a as (insert into employee values(default) RETURNING id) select a.id from a)
where not exists (select 1 from e) returning name, employee_id)
select employee_id, name from e
union all
select employee_id, name from i;If I replace the nested CTE with an already created id (executing the nested CTE separately), it works (but can result in the creation of a superfluous id). It is also possible to simply move the nested CTE to the top level (so the whole thing will look like
with e as (..), i as (..), a as (..) select .. where not exists..., but this would also mean that a superfluous employee id is createSolution
For the purpose of this question, I'll assume
You cannot nest a data-modifying CTE like you tried (as you already found out the hard way) - and you don't need to. This query would achieve your objective:
The core feature is the
The manual:
If no list of [target] column names is given at all, the default is all the
columns of the table in their declared order; [...]
Each column not present in the explicit or implicit column list will
be filled with a default value, either its declared default value or
null if there is none.
This is a Postgres specific extension of the standard:
Also, the case in which a column name list is omitted, but not all the
columns are filled from the
The final CTE
This is subject to race conditions under concurrent write load to the same tables. If you need to rule that out, you need to do more. Related:
Without the complications from the conditional INSERT in the 2nd table, this would boil down to a common case of SELECT or INSERT:
Aside
I'd strongly advise to use the data type
employee_details.name to be defined UNIQUE. Else, the whole operation wouldn't make sense.You cannot nest a data-modifying CTE like you tried (as you already found out the hard way) - and you don't need to. This query would achieve your objective:
WITH e AS (
SELECT name, employee_id
FROM employee_details
WHERE name = 'jack bauer'
)
, i1 AS (
INSERT INTO employee -- no target columns!
SELECT -- empty SELECT list!
WHERE NOT EXISTS (SELECT FROM e)
RETURNING id
)
, i2 AS (
INSERT INTO employee_details (name, employee_id)
SELECT 'jack bauer', id
FROM i1
RETURNING name, employee_id
)
SELECT employee_id, name FROM e
UNION ALL
SELECT employee_id, name FROM i2;The core feature is the
INSERT with no target columns and an empty SELECT. Postgres fills all columns not listed in the SELECT with default values. This way we can replace the unconditional VALUES (default) with a conditional INSERT. The CTE i1 only inserts a row if the given name was not found.The manual:
If no list of [target] column names is given at all, the default is all the
columns of the table in their declared order; [...]
Each column not present in the explicit or implicit column list will
be filled with a default value, either its declared default value or
null if there is none.
This is a Postgres specific extension of the standard:
Also, the case in which a column name list is omitted, but not all the
columns are filled from the
VALUES clause or query, is disallowed by the standard.The final CTE
i2 only inserts a row if i1 returned a row. Voilá.This is subject to race conditions under concurrent write load to the same tables. If you need to rule that out, you need to do more. Related:
- How to use RETURNING with ON CONFLICT in PostgreSQL?
Without the complications from the conditional INSERT in the 2nd table, this would boil down to a common case of SELECT or INSERT:
- Is SELECT or INSERT in a function prone to race conditions?
Aside
"id" text DEFAULT gen_random_uuid()I'd strongly advise to use the data type
uuid to store UUIDs.- Would index lookup be noticeably faster with char vs varchar when all values are 36 chars
- What is the optimal data type for an MD5 field?
Code Snippets
WITH e AS (
SELECT name, employee_id
FROM employee_details
WHERE name = 'jack bauer'
)
, i1 AS (
INSERT INTO employee -- no target columns!
SELECT -- empty SELECT list!
WHERE NOT EXISTS (SELECT FROM e)
RETURNING id
)
, i2 AS (
INSERT INTO employee_details (name, employee_id)
SELECT 'jack bauer', id
FROM i1
RETURNING name, employee_id
)
SELECT employee_id, name FROM e
UNION ALL
SELECT employee_id, name FROM i2;"id" text DEFAULT gen_random_uuid()Context
StackExchange Database Administrators Q#203258, answer score: 9
Revisions (0)
No revisions yet.