Which collation should I choose for a multi-language website?
Problem
Does a collation have any influence on query speed? Does the size of a table change depending on the collation?
If I want to build a website that must support all possible languages (take Google, for example), which collation would be recommended?
I will need to store characters such as 日本語, and searches on the website will have to return something for the input sóméthíng. They must be case insensitive as well.
How do I know which is the best choice to make? Which collation best suits this case?
Solution
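Generally speaking, one of the Unicode variants is the best choice for broad language support. UTF-8 uses one byte per code point for ASCII, two for most accented Latin, Greek, and Cyrillic letters, three for most CJK characters such as 日本語, and four for code points outside the Basic Multilingual Plane. That gives it a slight advantage in time/space tradeoffs for mostly Latin text, while UTF-16 is more compact for mostly CJK text. UTF-8 can represent every Unicode code point, so no script is out of reach of the encoding itself; any limit you run into comes from a particular database's implementation (MySQL's legacy utf8 character set, for example, stores at most three bytes per character, and utf8mb4 is needed for the rest). Note that the encoding only decides what you can store: for sóméthíng to match something regardless of case, you also need a case-insensitive and accent-insensitive Unicode collation on top of it.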
This Wikipedia article might be enlightening on the dis/advantages of each.
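To make the space and matching points concrete, here is a minimal Python sketch. It is not how any database implements a collation; it only approximates, in application code, what a case-insensitive and accent-insensitive Unicode collation does at comparison time. The helper names `utf8_len` and `fold` are hypothetical and chosen for this illustration.

```python
import unicodedata


def utf8_len(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fold(text: str) -> str:
    """Rough case- and accent-insensitive key (hypothetical helper).

    Decomposes characters (NFKD), drops combining marks such as
    accents, then applies Unicode case folding.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.casefold()


if __name__ == "__main__":
    # Storage cost: ASCII is 1 byte per character, these CJK characters 3.
    print(utf8_len("something"))  # 9
    print(utf8_len("日本語"))      # 9
    # UTF-16 needs 2 bytes for each of the same CJK characters.
    print(len("日本語".encode("utf-16-le")))  # 6

    # Matching: accented, mixed-case input compares equal to the plain form.
    print(fold("S\u00f3M\u00e9Th\u00edNg") == fold("something"))  # True
```

In a real deployment the comparison would normally be left to the database's collation rather than done by hand like this, since a proper collation also handles language-specific ordering rules that this sketch ignores.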
Context
StackExchange Database Administrators Q#255, answer score: 16