patternsqlMinor
Postgres : Using WITH .... AS insert into to return non-inserted column
Viewed 0 times
insertpostgresnonwithintoreturncolumninsertedusing
Problem
I'm creating an import job which uploads to a temporary table and then splits the data into two separate tables, joined by a key.
What I'm doing is more complicated than the following but this simple example demonstrates what I want to do.
So let's say I have a temporary table temp_import defined by :
I have two target tables defined by :
Currently I have some plpython which inserts a row at a time into users returning userid, followed by a second insert which uses the returned userid to populate emails.
This works OK but I want to extract better performance by doing set based operations.
I want to create a temporary mapping table :
and try to do something like :
The issue here is that I cannot return row_id as I am not selecting it for insert into the target table
If I could do this I could populate the emails table in one go:
Anyone know how I can return the row_id in the with rows as statement above?
Thanks in advance.
What I'm doing is more complicated than the following but this simple example demonstrates what I want to do.
So let's say I have a temporary table temp_import defined by :
create temp_import(
row_id int,
name varchar(32),
email varchar(255)
)I have two target tables defined by :
create users (
userid int not null default nextval('tmp_uid_seq'::regclass),
username varchar(32)
)
create emails (
userid int,
email varchar(255)
)Currently I have some plpython which inserts a row at a time into users returning userid, followed by a second insert which uses the returned userid to populate emails.
loop:
insert into users (username) values 'Blah' returning userid
insert into emails (userid, email) values (userid, 'Blah2')
goto loopThis works OK but I want to extract better performance by doing set based operations.
I want to create a temporary mapping table :
create table mapping (
row_id int,
userid int
)and try to do something like :
with rows as (insert into users (username)
select username from temp_import returning userid, row_id)
insert into mapping (userid, row_id)The issue here is that I cannot return row_id as I am not selecting it for insert into the target table
If I could do this I could populate the emails table in one go:
insert into emails (userid,email)
select userid,email from temp_import t, mapping m
where t.row_id = m.row_idAnyone know how I can return the row_id in the with rows as statement above?
Thanks in advance.
Solution
One possible solution is adding
Use the same column definition above as you do in
When doing the
Notes:
userid to the staging table you already have:CREATE TABLE temp_import (
row_id int,
name varchar(32),
email varchar(255),
userid int not null default nextval('tmp_uid_seq'::regclass)
);Use the same column definition above as you do in
users - this way you use the same sequence, and even the nextval() will be correct after the import.When doing the
COPY, the userid will be filled. This way you have the mapping and the staging all in the same place. Now you have to copy the userid into the final table, too:INSERT INTO users (userid, name)
SELECT userid, name
FROM temp_import;
INSERT INTO mapping (userid, row_id)
SELECT userid, row_id
FROM temp_import;Notes:
- in PostgreSQL, unlimited
textandvarcharhave some performance advantage over the limited types. Use the latter if you really have a limit. Withvarchar(255)the case is usually 'We don't know how long it can be, but definitely not that long'. Email addresses are usually considered as limited to 254 characters, not 255: https://www.rfc-editor.org/errata_search.php?eid=1690 Also, are you sure that nobody can have a name longer than 32 characters, including spaces, punctuation and so on?
- it usually makes sense to use some consistent naming scheme in any software system, including databases. I'm already confused a bit about the different styles used in
useridandrow_id.
- if the import will happen repeatedly or in parallel, just use the same sequence all the time. Sequences are transaction-safe and won't reuse any values until they wrap over (which can be mostly prevented by using
bigint/bigserial).
Code Snippets
CREATE TABLE temp_import (
row_id int,
name varchar(32),
email varchar(255),
userid int not null default nextval('tmp_uid_seq'::regclass)
);INSERT INTO users (userid, name)
SELECT userid, name
FROM temp_import;
INSERT INTO mapping (userid, row_id)
SELECT userid, row_id
FROM temp_import;Context
StackExchange Database Administrators Q#115764, answer score: 2
Revisions (0)
No revisions yet.