patternMinor
Change default collation for character set utf8mb4 to utf8mb4_unicode_ci
Viewed 0 times
utf8mb4_unicode_cicharacterdefaultforcollationutf8mb4changeset
Problem
When creating a database without specifying a character set or collation the servers defaults are used (as expected).
However, when specifying the character set within the
I already found out that (mysql-manual)
If CHARACTER SET charset_name is specified without COLLATE, character
set charset_name and its default collation are used. To see the
default collation for each character set, use the SHOW CHARACTER SET
statement or query the INFORMATION_SCHEMA CHARACTER_SETS table.
And indeed it shows
```
MariaDB [(none)]> SHOW CHARACTER SET LIKE 'utf8mb4';
+---------+---------------+--------------------+--------+
| Charset | Description | Default collation | Maxlen |
+---------+---------------+--------------------+--------+
| utf8mb4 | UTF-8 Unicode | utf8mb4_general_ci | 4 |
+---------+---------------+------------------
MariaDB [(none)]> SHOW VARIABLES LIKE '%_server' ;
+----------------------+--------------------+
| Variable_name | Value |
+----------------------+--------------------+
| character_set_server | utf8mb4 |
| collation_server | utf8mb4_unicode_ci |
+----------------------+--------------------+
MariaDB [(none)]> CREATE DATABASE `test-without-charset` ;
MariaDB [(none)]> SELECT `DEFAULT_COLLATION_NAME` FROM `information_schema`.`SCHEMATA` WHERE `SCHEMA_NAME` LIKE 'test-without-charset';
+------------------------+
| DEFAULT_COLLATION_NAME |
+------------------------+
| utf8mb4_unicode_ci |
+------------------------+However, when specifying the character set within the
CREATE DATABASE-query, the default collation changes to utf8mb4_general_ci.MariaDB [(none)]> CREATE DATABASE `test-with-charset` CHARACTER SET utf8mb4 ;
MariaDB [(none)]> SELECT `DEFAULT_COLLATION_NAME` FROM `information_schema`.`SCHEMATA` WHERE `SCHEMA_NAME` LIKE 'test-with-charset';
+------------------------+
| DEFAULT_COLLATION_NAME |
+------------------------+
| utf8mb4_general_ci |
+------------------------+I already found out that (mysql-manual)
If CHARACTER SET charset_name is specified without COLLATE, character
set charset_name and its default collation are used. To see the
default collation for each character set, use the SHOW CHARACTER SET
statement or query the INFORMATION_SCHEMA CHARACTER_SETS table.
And indeed it shows
utf8mb4_general_ci, so it is following the rules```
MariaDB [(none)]> SHOW CHARACTER SET LIKE 'utf8mb4';
+---------+---------------+--------------------+--------+
| Charset | Description | Default collation | Maxlen |
+---------+---------------+--------------------+--------+
| utf8mb4 | UTF-8 Unicode | utf8mb4_general_ci | 4 |
+---------+---------------+------------------
Solution
I don't think there is a way to change that
Anyway, it would be better to use
Just get into the habit of specifying
By explicitly specifying the charset and collation, you maintain control and consistency, even if it is an out-dated pair.
A compromise...
But charset and collation on
DEFAULT.Anyway, it would be better to use
utf8mb4_unicode_520_ci, which is based on a later Unicode standard.Just get into the habit of specifying
CHARACTER SET and COLLATION on all connections and CREATE TABLEs. MySQL and MariaDB are gradually changing from latin1_swedish_ci to utf8mb4_0900_ai_ci. MariaDB is not there yet, but I expect them to move soon. And "900" is probably not the last Unicode standard.By explicitly specifying the charset and collation, you maintain control and consistency, even if it is an out-dated pair.
A compromise...
But charset and collation on
CREATE DATABASE. Then any tables built without specific settings will inherit those settings. And columns within that table will inherit from the table's settings.Context
StackExchange Database Administrators Q#239975, answer score: 6
Revisions (0)
No revisions yet.