HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Full text search results in a large amount of time spent in 'FULLTEXT initialization'

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
fullfulltextsearchamounttexttimelargespentresultsinitialization

Problem

I'm currently trying to run some queries against a data dump of Stack Overflow's comments. Here's what the schema looks like:

CREATE TABLE `socomments` (
  `Id` int(11) NOT NULL,
  `PostId` int(11) NOT NULL,
  `Score` int(11) DEFAULT NULL,
  `Text` varchar(600) NOT NULL,
  `CreationDate` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `UserId` int(11) NOT NULL,
  PRIMARY KEY (`Id`),
  KEY `idx_socomments_PostId` (`PostId`),
  KEY `CreationDate` (`CreationDate`),
  FULLTEXT KEY `Text` (`Text`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8


I ran this query against the table, and it ran incredibly slow (It does have 29 million rows, but it has a Full-Text index):

SELECT *
FROM socomments
WHERE MATCH (Text) AGAINST ('"fixed the post"' IN BOOLEAN MODE)


So I profiled it, the results of which are:

|| Status                     || Duration ||
|| starting                   || 0.000058 ||
|| checking permissions       || 0.000006 ||
|| Opening tables             || 0.000014 ||
|| init                       || 0.000019 ||
|| System lock                || 0.000006 ||
|| optimizing                 || 0.000007 ||
|| statistics                 || 0.000013 ||
|| preparing                  || 0.000005 ||
|| FULLTEXT initialization    || 207.1112 ||
|| executing                  || 0.000009 ||
|| Sending data               || 0.000856 ||
|| end                        || 0.000004 ||
|| query end                  || 0.000004 ||
|| closing tables             || 0.000006 ||
|| freeing items              || 0.000059 ||
|| logging slow query         || 0.000037 ||
|| cleaning up                || 0.000046 ||


As you can see, it spends a long time in FULLTEXT initialization. Is this normal? If not, how would I fix it?

Solution

Others have found this a troublesome situation

  • InnoDB Full-text Search in MySQL 5.6: Part 3, Performance



  • InnoDB Fulltext Search WITH QUERY EXPANSION is very slow



  • Sep 16, 2010



  • 5.0.51a, 5.0.75, 5.0 bzr



  • Status : No Feedback



  • SELECT query hang in state FULLTEXT initialization



  • Oct 17, 2014



  • MySQL 5.6.19



  • Status : Won't Fix



Since the MySQL Documentation is very terse on this thread state


FULLTEXT initialization


The server is preparing to perform a natural-language full-text search.

your only recourse would be to make preparation with less data. How ?

SUGGESTION #1

Look at your query again. It is selecting all columns. I would refactor the query to collect only the id columns from socomments. Then, join those retrieved ids back to the socomments table.

SELECT B.* FROM
(SELECT id FROM socomments
WHERE MATCH (Text) AGAINST ('"fixed the post"' IN BOOLEAN MODE)) A
LEFT JOIN socomments B USING (id);


This might produce an uglier EXPLAIN plan but I think profiling will change for the better. The basic idea is : If you have an aggressive FULLTEXT Search, make it gather the least amount of data during that FULLTEXT initialization phase, thus reducing the time.

I have recommended this many times before

  • May 14, 2012 : slow query with fulltext and left join



  • Mar 18, 2012 : Why is LIKE more than 4x faster than MATCH...AGAINST on a FULLTEXT index in MySQL?



  • Jan 26, 2012 : Mysql fulltext search my.cnf optimization :



  • Oct 25, 2011 : FULLTEXT index ignored in BOOLEAN MODE with 'number of words' conditional



SUGGESTION #2

Please make sure you are setting the InnoDB-based FULLTEXT options, not the ones for MyISAM. The two options you should be concerned are

  • innodb_ft_cache_size



  • Def Value 8000000 (7.629M)



  • Max Value 80000000 (76.29M)



  • innodb_ft_total_cache_size



  • Def Value 640000000 (610M)



  • Max Value 1600000000 (1525M = 1.49G)



Think about it for a moment. The text field is VARCHAR(600). Say the average is 300 bytes. You have 29,000,000 million of them. That would be that would be a little of 8GB. Perhaps increasing innodb_ft_cache_size and innodb_ft_total_cache_size may also help.

Make sure you have enough RAM for larger InnoDB FULLTEXT buffers.

GIVE IT A TRY !!!

Code Snippets

SELECT B.* FROM
(SELECT id FROM socomments
WHERE MATCH (Text) AGAINST ('"fixed the post"' IN BOOLEAN MODE)) A
LEFT JOIN socomments B USING (id);

Context

StackExchange Database Administrators Q#87089, answer score: 6

Revisions (0)

No revisions yet.