What are the differences between the utf8_general_ci, utf8_unicode_ci, and utf8_binary collations in MySQL?
Problem
I can't find the MySQL documentation on this topic. Can anyone explain the differences, please?
Solution
According to the MySQL documentation:
Collations have these general characteristics:
Two different character sets cannot have the same collation.
Each character set has one collation that is the default collation.
For example, the default collation for latin1 is latin1_swedish_ci.
The output for SHOW CHARACTER SET indicates which collation is the
default for each displayed character set.
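As a quick check of the default-collation point above, the statements below are a minimal sketch that should work on a stock MySQL server. The choice of latin1 and utf8 as filters is only for illustration; on newer MySQL versions the utf8 character set may be listed as utf8mb3.

```sql
-- Show one character set; the "Default collation" column names its default
-- (for latin1 the quoted documentation says latin1_swedish_ci).
SHOW CHARACTER SET LIKE 'latin1';

-- List every collation that belongs to one character set.
-- The "Default" column is "Yes" for that character set's default collation.
-- Assumption: the character set is listed as utf8 (newer servers may say utf8mb3).
SHOW COLLATION WHERE Charset = 'utf8';
```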
There is a convention for collation names: They start with the name of
the character set with which they are associated, they usually include
a language name, and they end with _ci (case insensitive), _cs (case
sensitive), or _bin (binary).
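The naming convention can be seen directly in the collation list. This is a small sketch; latin1 is used here only because it ships with collations carrying all three suffixes, and the exact rows returned depend on the server version and build.

```sql
-- Every name starts with the character set (latin1_), may include a
-- language or variant (swedish, german1, general, ...), and ends with
-- _ci, _cs, or _bin.
SHOW COLLATION WHERE Charset = 'latin1';

-- Narrow the list to the binary collations only.
-- The backslash escapes "_" so that it matches a literal underscore.
SHOW COLLATION WHERE Collation LIKE '%\_bin';
```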
In cases where a character set has multiple collations, it might not
be clear which collation is most suitable for a given application. To
avoid choosing the wrong collation, it can be helpful to perform some
comparisons with representative data values to make sure that a given
collation sorts values the way you expect.
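The advice above about comparing representative values can be sketched as follows for the three collations in the question. This assumes the client sends UTF-8 (hence `SET NAMES utf8`), that the binary collation is named utf8_bin, and that the server still accepts the utf8_* collation names; the table `collation_demo` and its column `word` are hypothetical names made up for this example.

```sql
-- Make string literals arrive as utf8 so the utf8_* collations apply to them.
SET NAMES utf8;

-- Case: the two _ci collations treat 'a' and 'A' as equal, utf8_bin does not.
SELECT 'a' = 'A' COLLATE utf8_general_ci AS general_ci,   -- 1
       'a' = 'A' COLLATE utf8_unicode_ci AS unicode_ci,   -- 1
       'a' = 'A' COLLATE utf8_bin        AS bin;          -- 0

-- Expansions: the MySQL manual notes that utf8_unicode_ci equates 'ß'
-- with 'ss', while utf8_general_ci compares character by character.
SELECT 'ß' = 'ss' COLLATE utf8_general_ci AS general_ci,
       'ß' = 'ss' COLLATE utf8_unicode_ci AS unicode_ci;

-- Sorting: run the same ORDER BY under each collation and compare the order.
CREATE TEMPORARY TABLE collation_demo (word VARCHAR(20)) CHARACTER SET utf8;
INSERT INTO collation_demo (word) VALUES ('apple'), ('Banana'), ('cherry'), ('Äpfel');

SELECT word FROM collation_demo ORDER BY word COLLATE utf8_general_ci;
SELECT word FROM collation_demo ORDER BY word COLLATE utf8_unicode_ci;
SELECT word FROM collation_demo ORDER BY word COLLATE utf8_bin;
```

Swapping in values from your own data is the point of the exercise: the collation that orders and matches them the way your application expects is the one to pick.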
Stack Overflow has a list of questions tagged utf-8 and collation.
Server Fault has only one question tagged utf-8 and collation.
The website efreedom.com has links to Stack Overflow questions concerning utf8: http://efreedom.com/Question/1-4784168/Change-Collation-Utf8-Bin-One-Go
Here is another site about collations in MySQL: http://www.collation-charts.org/
Here is a page of the MySQL manual explaining binary collations: http://dev.mysql.com/doc/refman/5.0/en/charset-binary-collations.html
Context
StackExchange Database Administrators Q#8006, answer score: 4