patternsqlMinor
Postgres: Insert distinct values from one table into another table with constrains in place
Viewed 0 times
distinctinsertpostgreswithintoplaceoneconstrainsanothervalues
Problem
I'm trying to insert data from one PostgreSQL 9.3 table to another while removing duplicate entries.
I have two tables containing email addresses.
The main table contains emails and a tag for each email address. The combination
The second table get created on the fly from text files which contain only the email address. There are many duplicates there.
I need to import the data from the temp table into the main one without breaking the constraint above. For a certain file containing email addresses the tag is constant.
Table structures:
and
Here is my query:
Note: 655 is just a tag of a certain group of email addresses.
This is the error I get:
ERROR: duplicate key value violates unique constraint
"unique-tag-email" SQL state: 23505 Detail: Key (email,
tag)=(user@hotmail.com, 655) already exists.
There are indeed, two email addresses user@hotmail.com in the file.
Useless to say, because of this error, there is nothing added in the main table (email).
What am I doing wrong?
I have two tables containing email addresses.
The main table contains emails and a tag for each email address. The combination
(email, tag) must be unique, and for this there is a constraint Unique(email, tag) on it.The second table get created on the fly from text files which contain only the email address. There are many duplicates there.
I need to import the data from the temp table into the main one without breaking the constraint above. For a certain file containing email addresses the tag is constant.
Table structures:
CREATE TABLE emails (
email character varying(128),
tag bigint,
CONSTRAINT "unique-tag-email" UNIQUE (email, tag) )and
CREATE TABLE emails_temp (email character varying(128)Here is my query:
insert into emails(tag,email)
select
655,t.email
from
emails_temp as t
where
not exists ( select email from emails where email = t.email )Note: 655 is just a tag of a certain group of email addresses.
This is the error I get:
ERROR: duplicate key value violates unique constraint
"unique-tag-email" SQL state: 23505 Detail: Key (email,
tag)=(user@hotmail.com, 655) already exists.
There are indeed, two email addresses user@hotmail.com in the file.
Useless to say, because of this error, there is nothing added in the main table (email).
What am I doing wrong?
Solution
There are 3 possible kinds of duplicates:
-
Duplicates within the rows of the bulk insert.
-
Duplicates between inserted rows and existing rows.
-
Duplicates between inserted rows and concurrently inserted / updated rows from other transactions.
Just like I explained in this closely related answer:
But things have become easier for 2. and 3. since Postgres 9.5 introduced UPSERT (
If your duplicates only stem from duplicate entries in the source (1.), like you indicated, then all you need is
-
Duplicates within the rows of the bulk insert.
-
Duplicates between inserted rows and existing rows.
-
Duplicates between inserted rows and concurrently inserted / updated rows from other transactions.
Just like I explained in this closely related answer:
- Using EXCEPTION to ignore duplicates during bulk inserts
But things have become easier for 2. and 3. since Postgres 9.5 introduced UPSERT (
INSERT .. ON CONFLICT DO NOTHING).INSERT INTO emails(tag,email)
SELECT DISTINCT 655, email
FROM emails_temp
ON CONFLICT (email) DO NOTHING;If your duplicates only stem from duplicate entries in the source (1.), like you indicated, then all you need is
DISTINCT. Works in any version of Postgres:INSERT INTO emails(tag,email)
SELECT DISTINCT 655, email
FROM emails_temp;Code Snippets
INSERT INTO emails(tag,email)
SELECT DISTINCT 655, email
FROM emails_temp
ON CONFLICT (email) DO NOTHING;INSERT INTO emails(tag,email)
SELECT DISTINCT 655, email
FROM emails_temp;Context
StackExchange Database Administrators Q#160469, answer score: 5
Revisions (0)
No revisions yet.