HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Phrases update query

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
queryphrasesupdate

Problem

My query is this:

UPDATE `phrases`
SET `phrases`.`count`=(SELECT COUNT(*) FROM `strings` WHERE `string` LIKE CONCAT('%', `phrases`.`phrase`, '%'))


My tables look like this:

CREATE TABLE `phrases` (
    `hash` varchar(32) NOT NULL,
    `count` int DEFAULT 0,
    `phrase` text NOT NULL,
     PRIMARY KEY (`hash`),
     KEY(`count`)
)


And

CREATE TABLE `strings` (
    `string` text NOT NULL,
)


phrases has 18,000 rows and strings has 1500 rows.

Solution

Since you're using a LIKE with wildcards, you're going to do a table-scan against both tables, running a total of 18000*1500 = 27000000 substring comparisons.

To optimize this, you need to use some fulltext index technology. I suggest Sphinx Search or Apache Solr. If you do this, you don't need to keep a count of how many matches there are, because the search index makes it a lot less expensive to get a count on demand.

MySQL also implements a FULLTEXT index type, but it is only supported in the MyISAM storage engine in current versions (up to 5.5). I don't recommend using MyISAM for important data.

MySQL 5.6 is developing a fulltext index for InnoDB.

Context

StackExchange Code Review Q#7060, answer score: 4

Revisions (0)

No revisions yet.