HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

MySQL, optimize and replication

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
andmysqloptimizereplication

Problem

I have two MySQL (MariaDB actually) servers, a master and a slave with standard replication.
Two times already in less than a month I've had my slave become unusable.

This is what I experienced:

  • Notice a drop in query per second on the slave with my monitoring tool



  • Connect to the server with the mysql client, run SHOW SLAVE STATUS \G



  • Notice everything looks normal (Waiting for master to send event, seconds behind master = 0, etc)



  • Try and stop the replication with STOP SLAVE



  • STOP SLAVE hangs



  • Impossible to connect to the server with a client



  • Execute SHOW SLAVE STATUS \G on another already connected client, and it hangs too.



After that, I saw the MySQL process using CPU, and I saw a lot of disk writes with iotop, so I let it do its stuff for a while.

After 15 minutes I killed -9 the MySQL process because I got frustrated. I then restart MySQL, and I can connect and run queries, good.

I execute SHOW SLAVE STATUS \G which now works, and it looks normal: seconds behind master = 0, waiting for events. Then I go on the master and execute CREATE DATABASE foobar to see if the replication works, and it doesn't.

Again, at that point I see a lot of disk writes so I assume the replication is catching up.
Two hours later, I finally decide to check SHOW PROCESSLIST and all I see being executed is a OPTIMIZE TABLE table1 ... and then I remember that I had that exact same problem last time.

I kill the process with that query, then I reexecute SHOW PROCESSLIST and sure enough it's catching up with the master. SHOW SLAVE STAUTS \G reported 15k seconds behind master, which was coherent.

Anyway, after that wall of text, my questions:

  • why does a OPTIMIZE query block the replication catching up ?



  • can I prevent those OPTIMIZE queries ? I didn't put any cron to do that, and as far as I can see, nothing in /etc/cron.{d,daily,hourly,weekly} does this.



Thanks.

Solution

Your First Question


why does a OPTIMIZE query block the replication catching up ?

The SQL thread in MySQL Replication processes SQL commands from the Relay Logs as a FIFO queue (i.e. first-come, first-server basis). Once the SQL thread see a command, it processes it to completion. Meanwhile, the Seconds_Behind_Master just keeps creeping up.

While the SQL thread on the Slave is running OPTIMIZE TABLE, once you run STOP SLAVE; it SQL thread has complete the running OPTIMIZE TABLE before the SQL thread can terminate. That's why STOP SLAVE; hangs. To verify this, open another mysql session and run SHOW PROCESSLIST;, You should see a line that has OPTIMIZE TABLE with the status Killing slave....

Your Second Question


can I prevent those OPTIMIZE queries ? I didn't put any cron to do that, and as far as I can see, nothing in /etc/cron.{d,daily,hourly,weekly} does this.

Someone or something is running the OPTIMIZE TABLE. You have to hunt it down in all your jobs or PHP apps. Use the mysqlbinlog utility to dumps all the binary logs into a text file. Then, grep the text file for OPTIMIZE TABLE or optimize table. The timestamp of when it was submitted into the binlogs will be a few lines about the command.

When you do find that OPTIMIZE TABLE command in your jobs or PHP code, simply run this:

SET SQL_LOG_BIN=0; OPTIMIZE TABLE tblname; SET SQL_LOG_BIN=1;


This will not register the OPTIMIZE TABLE in the binlogs, thus preventing a Slave from replicating it.

As was just pointed out in the comments below, You can also replace

OPTIMIZE TABLE tblname;


with one of the following:

OPTIMIZE NO_WRITE_TO_BINLOG TABLE tblname;
OPTIMIZE LOCAL TABLE tblname;


to restrict the OPTIMIZE TABLE command from landing in the binlogs and thus prevent replication of it.

Code Snippets

SET SQL_LOG_BIN=0; OPTIMIZE TABLE tblname; SET SQL_LOG_BIN=1;
OPTIMIZE TABLE tblname;
OPTIMIZE NO_WRITE_TO_BINLOG TABLE tblname;
OPTIMIZE LOCAL TABLE tblname;

Context

StackExchange Database Administrators Q#44712, answer score: 5

Revisions (0)

No revisions yet.