patternsqlMinor
Insert into Table ignoring duplicate values
Viewed 0 times
insertvaluesintoduplicateignoringtable
Problem
I am having a brain fart on figuring this out. I have the following two Tables:
I have properly cleaned the data in the table so there are no null values trying to be inserted into rows which require a value. My issue however is with my
These are the tables included in the joins not listed above
```
Table: man_temp
man_id INT NOT NULL,
man_name VARCHAR(100) NOT NULL
Table: manufacturers
manufacturer_id INT IDENTITY(1,1) NOT NULL,
manufacturer_name VARCHAR(100) NOT NULL
Table: cat_temp
category_id INT NOT NULL,
category_name VARCHAR(100) NOT NULL
Table: subcategories
Table: parts
part_id INT IDENTITY(1,1) NOT NULL,
part_number VARCHAR(50) UNIQUE NOT NULL,
part_description VARCHAR(MAX) NOT NULL,
information VARCHAR(MAX) NULL,
manufacturer_id INT NOT NULL,
subcategory_id INT NOT NULL
Table: part_temp
part_num VARCHAR(50) NOT NULL,
part_desc VARCHAR(MAX) NULL,
info VARCHAR(MAX) NULL,
man_id INT NULL,
sub_id INT NULLpart_temp is my temporary table that contains data from a CSV file, that is why only one column is set to be NOT NULL. I need to insert the data from part_temp into parts.I have properly cleaned the data in the table so there are no null values trying to be inserted into rows which require a value. My issue however is with my
UNIQUE constraint for my part_number columns in the parts table. There are duplicate values within the part_temp table so I need a way to be able to skip over them during my insert. This is what I have tried so far, but it does not work:INSERT INTO parts
SELECT DISTINCT pt.part_num, pt.part_desc, pt.info, m.manufacturer_id, s.subcategory_id
FROM part_temp AS pt
FULL OUTER JOIN man_temp AS mt ON pt.man_id = mt.man_id
INNER JOIN manufacturers AS m ON mt.man_name = m.manufacturer_name
FULL OUTER JOIN cat_temp AS ct ON pt.sub_id = ct.category_id
INNER JOIN subcategories AS s ON ct.category_name = s.subcategory_name
WHERE NOT EXISTS(SELECT part_number FROM parts WHERE part_number = pt.part_num)These are the tables included in the joins not listed above
```
Table: man_temp
man_id INT NOT NULL,
man_name VARCHAR(100) NOT NULL
Table: manufacturers
manufacturer_id INT IDENTITY(1,1) NOT NULL,
manufacturer_name VARCHAR(100) NOT NULL
Table: cat_temp
category_id INT NOT NULL,
category_name VARCHAR(100) NOT NULL
Table: subcategories
Solution
If I run into a
From answering the above you will have a good idea of where the duplicate value is sourced from and a reasonable idea of the cause. Example possibilities are:
Ultimately you need to strip this problem back to basics, as you have the duplicate value in question you need to start digging into the data and pay close attention to the source query. Luckily in your example it is a simple query and you can easily strip out joins to see if any of those introduce the duplicate, or if indeed the duplicate exists due to non-key attributes being different therefore causing your
I guess what I am trying to say is that there isn't necessarily a quick win for this kind of problem and you need to do some good ol' fashion detective work using the clues provided by SQL.
Moot point, but I like to explicitly list the columns on an insert so that nothing breaks if the order of columns on the target table is different to that of the source query.
Violation of UNIQUE KEY constraint the first thing I do is ask myself the following questions:- Does the duplicate key already exist in the target table?
- Is the key duplicated in any source table?
From answering the above you will have a good idea of where the duplicate value is sourced from and a reasonable idea of the cause. Example possibilities are:
- Duplicate exists in source table
- Duplicate does not exist at source, but a transform of the data causes duplicates to be returned
Ultimately you need to strip this problem back to basics, as you have the duplicate value in question you need to start digging into the data and pay close attention to the source query. Luckily in your example it is a simple query and you can easily strip out joins to see if any of those introduce the duplicate, or if indeed the duplicate exists due to non-key attributes being different therefore causing your
DISTINCT to not do what you expect it to do.I guess what I am trying to say is that there isn't necessarily a quick win for this kind of problem and you need to do some good ol' fashion detective work using the clues provided by SQL.
Moot point, but I like to explicitly list the columns on an insert so that nothing breaks if the order of columns on the target table is different to that of the source query.
Context
StackExchange Database Administrators Q#215115, answer score: 2
Revisions (0)
No revisions yet.