HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

How bad is it, really, to use InnoDB table as a queue?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
reallyinnodbbadhowusequeuetable

Problem

I have a product which uses two separate tables as queues for some processes. Generally speaking:

  • Interesting events occur in the application, and we write a record to the appropriate queue. That "queued" record points to another table whose records are a long-lasting record of the thing having been processed.



  • A background process comes along and processes the queue later, marks the job as "complete" and deletes the record from the queue.



This setup gives us several advantages over simply having the application do whatever the queue-processor is going to do:

  • If the action to be taken is long-running for some reason, the application need not be blocked.



  • If the action fails, it can be re-tried the next time the queue-processor runs.



  • The queue-processors can be multi-process/thread is desired.



Mostly I'm just describing reasons to use queues above, it of course has nothing to do with using the db as the underlying structure.

We are looking at "advantage #3' above a little closely, and expanding our queue-processors to be not just multi-process but also multi-server which means we need to be a little more careful about how we process that queue, because each item must only be successfully processed once. Two or more times is generally not okay.

So one of the questions we naturally asked ourselves was "well, should we even be using the database for this queue?"

That got me to thinking about how we use the queue, and why we might not want to do it this way. Briefly:

  • SELECT id, queued_thing_id FROM queue WHERE [fairly general WHERE clause] FOR UPDATE



  • For each record:



  • Do stuff



  • UPDATE queued_things SET status='complete', completed_date=NOW() WHERE id=[queued_thing_id]



  • DELETE FROM queue WHERE id=[id]



Potential reasons not to use a db:

-
"This is wasteful."

No it's not. The records are very compact.

-
"Yeah but the table will grow forever."

No it won't. Every DELETE makes room for another row. We aren't talking about millions of

Solution

There are several downsides to using an RDBMS to implement a queue.

The first is polling queries. Your queue processor needs to run a query at regular intervals to check if anything is in the queue. How frequently should this polling query run? It depends on how promptly you want the processor to notice the presence of a queued item, and also on the throughput you need to keep up with the rate of new enqueued items. But the polling queries need to run all the time, just in case something is added to the queue.

If you have multiple processors, then each of them has to run its own polling queries against the database.

The worst case I observed was a deployment with queue processors on many application servers, all polling constantly, several times a second. The result was that the database had to serve 3,000 queries per second just for the polling by queue processors. Each query is individually not a big deal, but trying to serve 3k of them per second was causing a lot of load on that database server.

The second problem was locking. You need to come up with a complex algorithm to ensure only one client processes each item. This involves locking the row FOR UPDATE as you describe. In the scenario where you have multiple clients, this can cause lock contention. They end up waiting for each other to release their locks on the queue, and thus the parallelism is limited.

You mention using DELETE to remove items from the queue once they are finished. In theory, InnoDB reuses pages if it can. In practice, this causes progressive fragmentation in the file. Not every page is filled efficiently, so new pages gradually are added to the tablespace. InnoDB doesn't do any automatic defragmentation. This is what OPTIMIZE TABLE or TRUNCATE TABLE do (actually, they don't do defragmentation inplace either, they just make a new tablespace and drop the old one).

Some of these problems are mitigated if you use a kind of dispatcher architecture. That is, you only have one processor polling the queue. Then you don't multiply the queries per second by the number of workers. And you don't have lock contention, because only one client is pulling from the queue. When that dispatcher finds an item to work on, it delegates the work to one of the unoccupied workers, and resumes watching the queue.

Still, it's better to use a real queue service. This does not require polling or locking. The client applications each make one call to the queue, to request an item. If there is no item in the queue, the call blocks until there's an item in the queue (or there may be an option to time out). Once there is something to return from the queue, this call returns that item to the client immediately. This also guarantees against multiple clients receiving the same item from the queue.

So back to your original question: "how bad is it?" Well, as you have found, it works fine ... until you need to process the queue at a high rate. Eventually you need to run more polling queries than the database server can handle. Or there's so much lock contention that the queue processors can't get their items to work on in a timely way.

So it's fine for toy projects, not so good for high-scale projects.

Context

StackExchange Database Administrators Q#319910, answer score: 3

Revisions (0)

No revisions yet.