mysqldump --single-transaction, yet update queries are waiting for the backup
Problem
If I use mysqldump --single-transaction, then according to the docs it should do FLUSH TABLES WITH READ LOCK to get a consistent state, then start a transaction, and no writers should be left waiting.
However, I caught the following situation last night.

Excerpt from SHOW FULL PROCESSLIST -- hundreds of those:

    Command: Query
    Time: 291
    State: Waiting for table flush
    Info: insert into db_external_notification.....

then this:

    Command: Query
    Time: 1204
    State: Sending data
    Info: SELECT /*!40001 SQL_NO_CACHE */ * FROM `db_external_notification`

and the rest of the threads are in Sleep.

Does anyone have any idea what these inserts are waiting for? I don't see any FLUSH TABLES or DDL or anything mentioned in the manual that can cause the queries to wait.

Full mysqldump command:

    mysqldump --quick --add-drop-table --single-transaction --master-data=2 -uxx -pxx dbname

I guess --quick is redundant here, probably a leftover from earlier times; this script is very old, but it shouldn't hurt anything.
Solution
The --single-transaction option of mysqldump does do a FLUSH TABLES WITH READ LOCK prior to starting the backup job, but only under certain conditions. One of those conditions is when you also specify the --master-data option.

In the source code, from mysql-5.6.19/client/mysqldump.c at line 5797:

    if ((opt_lock_all_tables || opt_master_data ||
         (opt_single_transaction && flush_logs)) &&
        do_flush_tables_read_lock(mysql))
      goto err;

To get a solid lock on the precise binlog coordinates prior to starting the repeatable-read transaction, the --master-data option triggers this lock to be obtained and then released once the binlog coordinates have been obtained.

In fact, mysqldump does a FLUSH TABLES followed by a FLUSH TABLES WITH READ LOCK, because doing both allows the read lock to be obtained faster in cases where the initial flush takes some time.

...however...

As soon as it has obtained the binlog coordinates, mysqldump issues an UNLOCK TABLES statement, so there shouldn't be anything blocking as a result of the flush mysqldump performed. Neither should any threads be Waiting for table flush as a result of the transaction that mysqldump is holding.

When you see a thread in the Waiting for table flush state, that should mean that a FLUSH TABLES [WITH READ LOCK] statement was issued and was still running when the query started -- so the query has to wait for the table flush before it can execute. In the processlist you've posted, mysqldump is reading from this same table, and its query has been running for a while, yet the blocking queries haven't been blocking for all that long.

This all suggests that something else has happened.
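Putting the pieces together, the statement sequence mysqldump sends with --single-transaction --master-data looks roughly like this (a simplified sketch; the exact statements and version-conditional comments vary by server version):

    FLUSH TABLES;                      -- first flush, so the next statement acquires its lock faster
    FLUSH TABLES WITH READ LOCK;       -- quiesce writes to pin the binlog position
    SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    START TRANSACTION WITH CONSISTENT SNAPSHOT;
    SHOW MASTER STATUS;                -- binlog coordinates recorded for --master-data
    UNLOCK TABLES;                     -- lock released; writers may resume
    -- ... then one SELECT /*!40001 SQL_NO_CACHE */ * FROM `tbl` per table,
    --     all inside the repeatable-read snapshot ...

Note how brief the locked window is: it lasts only from FLUSH TABLES WITH READ LOCK to UNLOCK TABLES, which is why mysqldump itself should not leave anything Waiting for table flush for the duration of the dump.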
There's a long-standing issue, explained in Bug #44884, with the way FLUSH TABLES works internally. I would not be surprised if the issue still persists; indeed, I would be surprised if it is ever "fixed", because it is a very complex issue to resolve -- virtually impossible to truly fix in a high-concurrency environment -- and any attempt at fixing it carries a significant risk of breaking something else or creating new, different, and still undesirable behavior. It seems likely that this is the explanation for what you're seeing.
Specifically:
- If you have a long-running query running against a table, and you issue FLUSH TABLES, then the FLUSH TABLES will block until the long-running query completes.

- Additionally, any queries that begin after the FLUSH TABLES is issued will block until the FLUSH TABLES is complete.

- Additionally, if you kill the FLUSH TABLES query, the queries that are blocked will still block on the original long-running query (the one that was blocking the FLUSH TABLES), because even though the killed FLUSH TABLES query didn't finish, the table (or tables) involved in the long-running query is still in the process of being flushed, and that pending flush is going to happen as soon as the long-running query finishes -- but not before.

The likely conclusion here is that another process -- perhaps another mysqldump, an ill-advised query, or a poorly written monitoring process -- tried to flush a table.
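The three behaviors listed above can be demonstrated by hand in three client sessions; here is a sketch (the table name some_table is hypothetical -- any InnoDB table will do):

    -- Session 1: a deliberately long-running read against the table.
    SELECT SLEEP(600), t.* FROM some_table t;

    -- Session 2: blocks until session 1's query completes.
    FLUSH TABLES;

    -- Session 3: started after the flush; blocks in state
    -- "Waiting for table flush".
    SELECT COUNT(*) FROM some_table;

    -- Killing session 2 does NOT free session 3: the pending flush of
    -- some_table survives, so session 3 runs only once session 1 ends.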
That query was subsequently killed or timed out by an unknown mechanism, but its after-effects lingered until mysqldump finished reading from the table in question.

You can replicate this condition by trying to FLUSH TABLES while a long-running query is in process. Then start another query, which will block. Then kill the FLUSH TABLES query, which won't unblock the latest query. Then kill the first query, or let it finish, and the final query will successfully run.

As an afterthought, this is unrelated:
    Trx read view will not see trx with id >= 1252538405, sees < 1252538391

That's normal, because mysqldump --single-transaction issues a START TRANSACTION WITH CONSISTENT SNAPSHOT, which prevents it from dumping data that was changed while the dump was in progress. Without that, the binlog coordinates obtained at the start would be meaningless, since --single-transaction would not be what it claims to be. This should not in any sense be related to the Waiting for table flush issue, as this transaction obviously holds no locks.
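The snapshot semantics behind that message can be illustrated in two sessions (a sketch; the table t is hypothetical and assumed to use InnoDB):

    -- Session A: take a consistent snapshot, as --single-transaction does.
    START TRANSACTION WITH CONSISTENT SNAPSHOT;
    SELECT COUNT(*) FROM t;            -- suppose this returns N

    -- Session B: commit a change while the "dump" is still running.
    INSERT INTO t (id) VALUES (12345); -- hypothetical row

    -- Session A: still sees the snapshot taken at transaction start.
    SELECT COUNT(*) FROM t;            -- still N; session B's row is invisible
    COMMIT;

The read-view note in the log is simply InnoDB recording which transaction IDs are visible to that snapshot; no locks are involved.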
Context
StackExchange Database Administrators Q#71961, answer score: 11