snippetsqlMinor
How can this unique index allow duplicate rows?
Viewed 0 times
thisuniquecanrowsduplicateallowhowindex
Problem
Is there a way that this unique index has allowed duplicate rows? I thought maybe there are some extra space characters, but I can't find them.
The index is
PostgreSQL 9.5.3
OK I was planning to remove the other two columns, but I thought I would test the
=> select *, length(keyword), length(country), length(language) from keyword where id in (4588076, 4951423);
id | keyword | seed_id | source | search_count | country | language | volume | cpc | competition | modified_on | violation | revenue | length | length | length
---------+---------------------+---------+--------+--------------+---------+----------+--------+------+-------------+-------------+-----------+---------+--------+--------+--------
4588076 | power wallet review | | SPYFU | 0 | | | 70 | 0.11 | 0.31 | | | | 19 | |
4951423 | power wallet review | | SPYFU | 2 | | | 70 | 0.11 | 0.31 | | | | 19 | |
(2 rows)The index is
"keyword_keyword_country_language" UNIQUE, btree (keyword, country, language)PostgreSQL 9.5.3
OK I was planning to remove the other two columns, but I thought I would test the
keyword column and found this:=> select k1.id, k1.keyword, k2.id, k2.keyword, k1.keyword=k2.keyword from keyword k1, keyword k2 where k1.id=4588076 and k2.id=4951423;
id | keyword | id | keyword | ?column?
---------+---------------------+---------+---------------------+----------
4588076 | power wallet review | 4951423 | power wallet review | fSolution
The answer can be found in the amazing documentation .. it LOOKS like you have NULL values in your tables.. When the DB checks for uniqueness, it says "does NULL equal NULL? NOPE!" and allows it.
The important bit below (emphasis mine):
are not allowed. Null values are not considered equal. A multicolumn unique
If you want to keep the uniqueness across ALL THREE COLUMNS and, at the same time, treat nulls as equal, then you have to get creative with your UNIQUE indexes by making them partial indexes..
As you can see .. it can easily get a bit crazy.
A different alternative would be to make col2wNull and col3wNull be defined as NOT NULL and provide some default value for when nothing is supplied. THIS MAY OR MAY NOT BE A GOOD IDEA depending on what you're doing. "Magic values" have a tendency to give you lots of problems later.
With regard to your edit and the two strings appearing to be equal, but the database reporting that they are not - I can only imagine that there are some "invisible" characters (UTF-8?) that are in the string. Or it could be something as simple as one string has an additional space on the end. It partially depends on how you are saving it into the database. (Are you performing a trim() on them, lower(), etc..)
You could try comparing the strings in various other ways (such as looking at md5 hash). I believe you can ask postgres to convert the column value into hex to view, as well, but how to go about doing that is escaping me at the moment (my apologies).
The important bit below (emphasis mine):
are not allowed. Null values are not considered equal. A multicolumn unique
If you want to keep the uniqueness across ALL THREE COLUMNS and, at the same time, treat nulls as equal, then you have to get creative with your UNIQUE indexes by making them partial indexes..
CREATE UNIQUE INDEX ix1 ON table (col1, col2wNull, col3wNull) WHERE col2wNull is not NULL and col3wNull is not Null;
CREATE UNIQUE INDEX ix2 ON table (col1, col2wNull) WHERE col2wNull is not Null and col3wNull is Null;
CREATE UNIQUE INDEX ix3 ON table (col1, col3wNull) WHERE col2wNull is Null and col3wNull is not Null;
CREATE UNIQUE INDEX ix4 ON table (col1) WHERE col2wNull is NULL and col3wNull is NULL;As you can see .. it can easily get a bit crazy.
A different alternative would be to make col2wNull and col3wNull be defined as NOT NULL and provide some default value for when nothing is supplied. THIS MAY OR MAY NOT BE A GOOD IDEA depending on what you're doing. "Magic values" have a tendency to give you lots of problems later.
With regard to your edit and the two strings appearing to be equal, but the database reporting that they are not - I can only imagine that there are some "invisible" characters (UTF-8?) that are in the string. Or it could be something as simple as one string has an additional space on the end. It partially depends on how you are saving it into the database. (Are you performing a trim() on them, lower(), etc..)
You could try comparing the strings in various other ways (such as looking at md5 hash). I believe you can ask postgres to convert the column value into hex to view, as well, but how to go about doing that is escaping me at the moment (my apologies).
Code Snippets
CREATE UNIQUE INDEX ix1 ON table (col1, col2wNull, col3wNull) WHERE col2wNull is not NULL and col3wNull is not Null;
CREATE UNIQUE INDEX ix2 ON table (col1, col2wNull) WHERE col2wNull is not Null and col3wNull is Null;
CREATE UNIQUE INDEX ix3 ON table (col1, col3wNull) WHERE col2wNull is Null and col3wNull is not Null;
CREATE UNIQUE INDEX ix4 ON table (col1) WHERE col2wNull is NULL and col3wNull is NULL;Context
StackExchange Database Administrators Q#147645, answer score: 9
Revisions (0)
No revisions yet.