HiveBrain v1.2.0
Get Started
← Back to all entries
principlesqlMinor

MariaDB 5.5 vs MySQL 5.7 query optimizers

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
mysqlquerymariadboptimizers

Problem

I have a query that optimizes differently in MariaDB 5.5.37 and
MySQL 5.0 and MySQL 5.7. When poorly optimized it takes about 400
seconds to run, but on MySQL 5.7 which optimizes it differently,
it runs in about 3 seconds.

I've tried a variety of different versions of MySQL and MariaDB, and
just discovered that MySQL 5.7 optimizes it differently from the others
I have tried.

The query is generated by the Webtrees geneology software.

I'm wondering if anyone here can tease out what causes the different
optimizations and whether it is possible to reorder the query to
force a better query plan.

Here is the table definition:

CREATE TABLE `wt_name` (
  `n_file` int(11) NOT NULL,
  `n_id` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
  `n_num` int(11) NOT NULL,
  `n_type` varchar(15) COLLATE utf8_unicode_ci NOT NULL,
  `n_sort` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `n_full` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
  `n_surname` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `n_surn` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `n_givn` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `n_soundex_givn_std` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `n_soundex_surn_std` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `n_soundex_givn_dm` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  `n_soundex_surn_dm` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
  PRIMARY KEY (`n_id`,`n_file`,`n_num`),
  KEY `ix1` (`n_full`,`n_id`,`n_file`),
  KEY `ix2` (`n_surn`,`n_file`,`n_type`,`n_id`),
  KEY `ix3` (`n_givn`,`n_file`,`n_type`,`n_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci


The table has 186,123 rows. Identical data loaded in both databases.

Here is the version info for tha MariaDB 5.5 and MySQL 5.7 servers:

```
MariaDB> SHOW VARIABLES LIKE "%version%";
+-------------------------+---------------------+
| Variable_name | Value |
+-------------------------+------------

Solution

First, I would try an index on (n_file, n_surn, n_type, n_id, n_surname) for this query. You may have to change the GROUP BY clause for the index to be used effectively by all versions of the optimizer:

GROUP BY n_file, n_surn COLLATE 'utf8_unicode_ci'


Second, why the derived table, the group by and the join? I think the query is equivalent to the much simpler:

SELECT SQL_CACHE 
    n2.n_surn, n1.n_surname, n1.n_id
FROM wt_name n1 
WHERE n_file = 4 
  AND n_surn NOT IN ('', '@N.N.')
  AND n_type <> '_MARNM' ;

Code Snippets

GROUP BY n_file, n_surn COLLATE 'utf8_unicode_ci'
SELECT SQL_CACHE 
    n2.n_surn, n1.n_surname, n1.n_id
FROM wt_name n1 
WHERE n_file = 4 
  AND n_surn NOT IN ('', '@N.N.')
  AND n_type <> '_MARNM' ;

Context

StackExchange Database Administrators Q#69404, answer score: 2

Revisions (0)

No revisions yet.