MySQL fulltext search my.cnf optimization

Submitted by: @import:stackexchange-dba

Problem

I opened a question at https://serverfault.com/questions/353888/mysql-full-text-search-cause-high-usage-cpu and a user there recommended asking here.

We built a news site. Every day we import tens of thousands of records from a web API.

To provide a precise search service, our table uses MyISAM with a fulltext index on (title, content, date). The site is in testing on a GoDaddy VDS with 2GB RAM and 30GB of disk space (no swap, because the VDS does not allow creating swap). The CPU is an Intel(R) Xeon(R) CPU L5609 @ 1.87GHz.

After running ./mysqltuner.pl, we get these results:

```
-------- General Statistics --------------------------------------------------
[--] Skipped version check for MySQLTuner script
[OK] Currently running supported MySQL version 5.5.20
[OK] Operating on 32-bit architecture with less than 2GB RAM

-------- Storage Engine Statistics -------------------------------------------
[--] Status: -Archive -BDB -Federated +InnoDB -ISAM -NDBCluster
[--] Data in MyISAM tables: 396M (Tables: 39)
[--] Data in InnoDB tables: 208K (Tables: 8)
[!!] Total fragmented tables: 9

-------- Security Recommendations -------------------------------------------
[!!] User '@ip-XX-XX-XX-XX.ip.secureserver.net'
[!!] User '@localhost'

-------- Performance Metrics -------------------------------------------------
[--] Up for: 17h 27m 58s (1M q [20.253 qps], 31K conn, TX: 513M, RX: 303M)
[--] Reads / Writes: 61% / 39%
[--] Total buffers: 168.0M global + 2.7M per thread (151 max threads)
[OK] Maximum possible memory usage: 573.8M (28% of installed RAM)
[OK] Slow queries: 0% (56/1M)
[!!] Highest connection usage: 100% (152/151)
[OK] Key buffer size / total MyISAM indexes: 8.0M/162.5M
[OK] Key buffer hit rate: 100.0% (2B cached / 882K reads)
[!!] Query cache is disabled
[OK] Sorts requiring temporary tables: 0% (0 temp sorts / 17K sorts)
[!!] Temporary tables created on disk: 49% (32K on disk / 64K total)
[!!] Thread cache is disabled
[!!] Table cache hit rate: 0% (400 open / 298K ope
```

Solution

I have an interesting surprise for you.

The only optimizing you can do for FULLTEXT indexing is not really my.cnf tuning in the usual sense. It is all about two things:

  • The Stopword List
  • The Query

STOPWORDS

There are 543 stopwords that you may or may not want filtered out of FULLTEXT indexes. The list is built in at compile time, but you can override it with your own list.

Let's create our stopword list. I usually set the English articles as the only stopwords:

echo "a"    > /var/lib/mysql/stopwords.txt
echo "an"  >> /var/lib/mysql/stopwords.txt
echo "the" >> /var/lib/mysql/stopwords.txt


Next, add the stopword file option to /etc/my.cnf, and lower ft_min_word_len (default 4) so that 1-letter, 2-letter, and 3-letter words are indexed as well:

[mysqld]
ft_min_word_len=1
ft_stopword_file=/var/lib/mysql/stopwords.txt


Finally, restart MySQL:

service mysql restart


If you have any tables with FULLTEXT indexes already in place, you must drop those FULLTEXT indexes and create them again.
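
The check-and-rebuild step can be sketched as follows. The table name articles, the index name ft_title_content, and the indexed columns are hypothetical placeholders, so substitute your own. For MyISAM tables, a quick repair is an alternative to dropping and re-creating the index by hand:

```sql
-- Confirm the server picked up the new settings after the restart.
SHOW VARIABLES LIKE 'ft_%';

-- Option 1: drop and re-create the FULLTEXT index.
-- (articles, ft_title_content, title and content are assumed names.)
ALTER TABLE articles DROP INDEX ft_title_content;
ALTER TABLE articles ADD FULLTEXT INDEX ft_title_content (title, content);

-- Option 2 (MyISAM only): rebuild the table's indexes in place.
REPAIR TABLE articles QUICK;
```

Either option should leave the index built with the new stopword list and minimum word length. Only one of the two is needed.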

QUERY

Here is a little-known fact about MySQL queries using a FULLTEXT index: there are occasions when the MySQL Query Optimizer stops using FULLTEXT indexes altogether and performs full table scans.

Here is an example:

use test
drop table if exists ft_test;
create table ft_test
(
    id int not null auto_increment,
    txt text,
    primary key (id),
    FULLTEXT (txt)
) ENGINE=MyISAM;
insert into ft_test (txt) values
('mount camaroon'),('mount camaron'),('mount camnaroon'),
('mount cameroon'),('mount cemeroon'),('mount camnaroon'),
('mount camraon'),('mount camaraon'),('mount camaran'),
('mount camnaraon'),('mount cameroan'),('mount cemeroan'),
('mount camnaraon'),('munt camraon'),('munt camaraon'),
('munt camaran'),('munt camnaraon'),('munt cameroan'),
('munt cemeroan'),('munt camnaraon'),('mount camraan');
select * from ft_test WHERE  MATCH(txt) AGAINST ("+mount +cameroon" IN BOOLEAN MODE);


Here is that sample data loaded:

mysql> use test
Database changed
mysql> drop table if exists ft_test;
Query OK, 0 rows affected (0.00 sec)

mysql> create table ft_test
    -> (
    ->     id int not null auto_increment,
    ->     txt text,
    ->     primary key (id),
    ->     FULLTEXT (txt)
    -> ) ENGINE=MyISAM;
Query OK, 0 rows affected (0.03 sec)

mysql> insert into ft_test (txt) values
    -> ('mount camaroon'),('mount camaron'),('mount camnaroon'),
    -> ('mount cameroon'),('mount cemeroon'),('mount camnaroon'),
    -> ('mount camraon'),('mount camaraon'),('mount camaran'),
    -> ('mount camnaraon'),('mount cameroan'),('mount cemeroan'),
    -> ('mount camnaraon'),('munt camraon'),('munt camaraon'),
    -> ('munt camaran'),('munt camnaraon'),('munt cameroan'),
    -> ('munt cemeroan'),('munt camnaraon'),('mount camraan');
Query OK, 21 rows affected (0.00 sec)
Records: 21  Duplicates: 0  Warnings: 0

mysql>


Here is a sample query and its EXPLAIN plan:

mysql> select * from ft_test WHERE  MATCH(txt) AGAINST ("cameroon" IN BOOLEAN MODE);
+----+----------------+
| id | txt            |
+----+----------------+
|  4 | mount cameroon |
+----+----------------+
1 row in set (0.00 sec)

mysql> explain select * from ft_test WHERE  MATCH(txt) AGAINST ("cameroon" IN BOOLEAN MODE)\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: ft_test
         type: fulltext
possible_keys: txt
          key: txt
      key_len: 0
          ref:
         rows: 1
        Extra: Using where
1 row in set (0.00 sec)

mysql>


OK, great: the FULLTEXT index is used (type: fulltext, key: txt).

Now, let's change the query slightly by comparing the MATCH result to 1:

mysql> select * from ft_test WHERE  MATCH(txt) AGAINST ("cameroon" IN BOOLEAN MODE) = 1;
+----+----------------+
| id | txt            |
+----+----------------+
|  4 | mount cameroon |
+----+----------------+
1 row in set (0.00 sec)

mysql> explain select * from ft_test WHERE  MATCH(txt) AGAINST ("cameroon" IN BOOLEAN MODE) = 1\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: ft_test
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 21
        Extra: Using where
1 row in set (0.00 sec)

mysql>


OMG, what happened to the FULLTEXT index? The MySQL Query Optimizer basically barfed at it: the plan went from type: fulltext to type: ALL, a full scan of all 21 rows. If you were performing a JOIN against the ft_test table and the optimizer did the same thing to the fulltext WHERE clause, who knows what on earth would happen to the rest of the query.

The solution is to refactor the query so that the FULLTEXT search is isolated in a subquery that gathers the keys only. Then LEFT JOIN those keys to the original table.
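
If the comparison was only there to get at the relevance value, one option is to leave MATCH() bare in the WHERE clause, which is the form the first EXPLAIN shows using the index, and repeat the expression in the SELECT list. This is a minimal sketch against the ft_test table above. The alias score is an assumption, and it is worth confirming with EXPLAIN on your own version that the plan stays on type: fulltext:

```sql
-- Bare MATCH() in WHERE: the form that used the FULLTEXT index above.
-- The same expression in the SELECT list exposes the relevance value.
SELECT id, txt,
       MATCH(txt) AGAINST ('cameroon' IN BOOLEAN MODE) AS score
FROM ft_test
WHERE MATCH(txt) AGAINST ('cameroon' IN BOOLEAN MODE)
ORDER BY score DESC;
```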

EXAMPLE

SELECT B.*
FROM (SELECT id from ft_test
WHERE MATCH(txt) AGAINST ("+cameroon" IN BOOLEAN MODE)) A
LEFT JOIN ft_test B USING (id);
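
The same keys-first pattern extends to larger joins. The sketch below adds a hypothetical companion table, ft_test_meta with an origin column, neither of which is part of the original example. It wraps the query in EXPLAIN so you can check that the derived table still reports type: fulltext before trusting the rest of the plan:

```sql
-- Hypothetical companion table, for illustration only.
CREATE TABLE IF NOT EXISTS ft_test_meta
(
    id INT NOT NULL,
    origin VARCHAR(32),
    PRIMARY KEY (id)
) ENGINE=MyISAM;

-- Derived table A runs the FULLTEXT search alone and returns only keys.
-- The outer query joins those keys back to ft_test and on to the other table.
EXPLAIN
SELECT B.*, M.origin
FROM (SELECT id FROM ft_test
      WHERE MATCH(txt) AGAINST ('+cameroon' IN BOOLEAN MODE)) A
LEFT JOIN ft_test B USING (id)
LEFT JOIN ft_test_meta M ON M.id = B.id;
```

Keeping the MATCH() predicate inside the derived table means later changes to the outer joins or filters do not touch the part of the query that depends on the FULLTEXT index.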


For this query, here is the result and its EXPLAIN:

mysql> SELECT B.*
    -> FROM (SELECT id from ft_test
    -> WHERE MATCH(txt) AGAINST ("+camero

Context

StackExchange Database Administrators Q#11716, answer score: 9
