snippetsqlMinor
How to create Postgres DB with case insensitive collation
Viewed 0 times
casepostgrescreatewithinsensitivehowcollation
Problem
I am using the postgres version 9.4.1 64-bit on windows 7 64-bit.
I need to create DB with the setting "case sensitive = OFF" but couldn't make this work. I refereed already many question on SO and other forums as well but even after trying out all those options my DB is still case sensitive and my search queries returns only partial results.
Does any one have success getting this work on windows environment?
Note: I am aware of the ILIKE operator but that option is not going to be considered by Architect team as we have a clear requirements that Database storage should be CASE INSENSITIVE by default. We do ot have any scenarios where we will need case sensitive search.
I need to create DB with the setting "case sensitive = OFF" but couldn't make this work. I refereed already many question on SO and other forums as well but even after trying out all those options my DB is still case sensitive and my search queries returns only partial results.
Does any one have success getting this work on windows environment?
Note: I am aware of the ILIKE operator but that option is not going to be considered by Architect team as we have a clear requirements that Database storage should be CASE INSENSITIVE by default. We do ot have any scenarios where we will need case sensitive search.
Solution
This may be too late for the original poster, but for completeness, the way to achieve case insensitive behaviour from PostgreSQL is to set a non-deterministic collation. This is only for Postgres 12.
Details are described in docs here. Reproducing relevant portion for completeness:
A collation is either deterministic or nondeterministic. A
deterministic collation uses deterministic comparisons, which means
that it considers strings to be equal only if they consist of the same
byte sequence. Nondeterministic comparison may determine strings to be
equal even if they consist of different bytes. Typical situations
include case-insensitive comparison, accent-insensitive comparison, as
well as comparison of strings in different Unicode normal forms. It is
up to the collation provider to actually implement such insensitive
comparisons; the deterministic flag only determines whether ties are
to be broken using bytewise comparison. See also Unicode Technical
Standard 10 for more information on the terminology.
To create a nondeterministic collation, specify the property
deterministic = false to CREATE COLLATION, for example:
This example would use the standard Unicode collation in a
nondeterministic way. In particular, this would allow strings in
different normal forms to be compared correctly. More interesting
examples make use of the ICU customization facilities explained above.
For example:
All standard and predefined collations are deterministic, all
user-defined collations are deterministic by default. While
nondeterministic collations give a more “correct” behavior, especially
when considering the full power of Unicode and its many special cases,
they also have some drawbacks. Foremost, their use leads to a
performance penalty. Also, certain operations are not possible with
nondeterministic collations, such as pattern matching operations.
Therefore, they should be used only in cases where they are
specifically wanted.
Details are described in docs here. Reproducing relevant portion for completeness:
A collation is either deterministic or nondeterministic. A
deterministic collation uses deterministic comparisons, which means
that it considers strings to be equal only if they consist of the same
byte sequence. Nondeterministic comparison may determine strings to be
equal even if they consist of different bytes. Typical situations
include case-insensitive comparison, accent-insensitive comparison, as
well as comparison of strings in different Unicode normal forms. It is
up to the collation provider to actually implement such insensitive
comparisons; the deterministic flag only determines whether ties are
to be broken using bytewise comparison. See also Unicode Technical
Standard 10 for more information on the terminology.
To create a nondeterministic collation, specify the property
deterministic = false to CREATE COLLATION, for example:
CREATE COLLATION ndcoll (provider = icu, locale = 'und', deterministic = false);This example would use the standard Unicode collation in a
nondeterministic way. In particular, this would allow strings in
different normal forms to be compared correctly. More interesting
examples make use of the ICU customization facilities explained above.
For example:
CREATE COLLATION case_insensitive (provider = icu, locale = 'und-u-ks-level2', deterministic = false);
CREATE COLLATION ignore_accents (provider = icu, locale = 'und-u-ks-level1-kc-true', deterministic = false);All standard and predefined collations are deterministic, all
user-defined collations are deterministic by default. While
nondeterministic collations give a more “correct” behavior, especially
when considering the full power of Unicode and its many special cases,
they also have some drawbacks. Foremost, their use leads to a
performance penalty. Also, certain operations are not possible with
nondeterministic collations, such as pattern matching operations.
Therefore, they should be used only in cases where they are
specifically wanted.
Code Snippets
CREATE COLLATION ndcoll (provider = icu, locale = 'und', deterministic = false);CREATE COLLATION case_insensitive (provider = icu, locale = 'und-u-ks-level2', deterministic = false);
CREATE COLLATION ignore_accents (provider = icu, locale = 'und-u-ks-level1-kc-true', deterministic = false);Context
StackExchange Database Administrators Q#101294, answer score: 6
Revisions (0)
No revisions yet.