
MongoDB mongos router and microservices architecture

Submitted by: @import:stackexchange-dba

Problem

Any mongos process in your infrastructure can perform chunk migrations while it holds the balancer lock.

In a microservice architecture, processes are started and stopped much more dynamically. I'm thinking, for example, of Kubernetes, where you may restart Pods frequently because doing so normally causes no disruption.

How does that fit with killing mongos processes? If one is in the middle of a chunk migration, what are the implications? In my cluster I see some 'Aborted' migrations (in sh.status()) and I believe they are caused by this.

How bad are those 'Aborted' migrations and how bad is killing mongos?

Solution

If a mongos is killed, a running migration should still complete; there just won't be another balancing operation initiated until a new mongos is able to pick up the balancer lock. That lock may end up being held for a while (I am being purposely vague: just how long a stale lock sticks around is very hard to define).
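To get a feel for whether a lock has gone stale, one option is to compare the lock holder against its last recorded ping. The following is only a rough sketch under stated assumptions: it assumes an older release in which mongos takes the balancer lock through the `config.locks` collection and records heartbeats in `config.lockpings`, and it assumes the field names `state`, `process` and `ping`. The documents are hand-written samples rather than real cluster output, and the sketch deliberately does not decide when a lock counts as stale.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Mapping, Optional


def seconds_since_last_ping(
    lock: Mapping[str, Any],
    pings: Mapping[str, datetime],
    now: datetime,
) -> Optional[float]:
    """Return how long ago the lock holder last pinged.

    Returns None if the lock is free or the holder has no recorded ping.
    """
    if lock.get("state", 0) == 0:  # assumption: state 0 means "unlocked"
        return None
    last_ping = pings.get(lock.get("process"))
    if last_ping is None:
        return None
    return (now - last_ping).total_seconds()


# Hypothetical sample data shaped like config.locks / config.lockpings entries.
now = datetime(2016, 1, 1, 12, 0, tzinfo=timezone.utc)
holder = "mongos-pod-a:27017"  # made-up process identifier
balancer_lock = {"_id": "balancer", "state": 2, "process": holder}
lock_pings = {holder: now - timedelta(minutes=7)}

age = seconds_since_last_ping(balancer_lock, lock_pings, now)
print(f"balancer lock holder last pinged {age:.0f}s ago")
```

If the holder is a Pod that no longer exists and the ping age keeps growing, that suggests the lock is waiting to be taken over rather than actively in use.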

Aborted chunk migrations are usually an issue with the primaries (they do all the heavy lifting for a migration) rather than with the mongos, which is just doing the coordination. The most common cause of an abort is that another migration is already in progress, or that cleanup from previous migrations is still happening. If you want to see a full description of the whole process, have a look at this (~11 minute) video where I describe it in depth (for MongoDB University).

That video will also give you some idea of what the steps are that you are seeing in the abort messages. If you want to debug the aborts further, I would recommend that you gather the relevant information about the aborts from the primaries involved, plus the changelog, so that we can get an idea of the step at which the abort happens and what the view is from both sides of the migration.
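As a starting point for that kind of debugging, here is a minimal sketch that groups aborted migrations by the last step they reached. It works on plain dictionaries shaped like documents from the `config.changelog` collection. The field layout is an assumption on my part (`what` values beginning with `moveChunk.`, a `details.note` of `"aborted"`, and `"step N of M"` keys inside `details`), so check it against what your own changelog actually contains. The sample entries and the `app.events` namespace are made up.

```python
from collections import Counter
from typing import Any, Iterable, Mapping


def last_completed_step(details: Mapping[str, Any]) -> int:
    """Return the highest N among keys shaped like 'step N of M', or 0 if none."""
    highest = 0
    for key in details:
        parts = key.split()
        if (len(parts) == 4 and parts[0] == "step"
                and parts[2] == "of" and parts[1].isdigit()):
            highest = max(highest, int(parts[1]))
    return highest


def summarize_aborts(entries: Iterable[Mapping[str, Any]]) -> Counter:
    """Count aborted migrations by (event, namespace, last completed step)."""
    summary: Counter = Counter()
    for entry in entries:
        what = entry.get("what", "")
        details = entry.get("details", {})
        # Assumption: aborted migrations carry details.note == "aborted".
        if what.startswith("moveChunk.") and details.get("note") == "aborted":
            key = (what, entry.get("ns"), last_completed_step(details))
            summary[key] += 1
    return summary


# Hand-written sample documents, not real cluster output.
sample = [
    {"what": "moveChunk.from", "ns": "app.events",
     "details": {"step 1 of 6": 0, "step 2 of 6": 312, "note": "aborted"}},
    {"what": "moveChunk.from", "ns": "app.events",
     "details": {"step 1 of 6": 0, "step 2 of 6": 290, "note": "aborted"}},
    {"what": "moveChunk.commit", "ns": "app.events", "details": {}},
]

for (what, ns, step), count in summarize_aborts(sample).items():
    print(f"{what} on {ns}: {count} abort(s) after step {step}")
```

If most aborts stop at the same step, that narrows down which side of the migration (donor or recipient primary) to look at first.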

Finally, as for how to deal with this, I would recommend running a mongos process somewhere more permanent that does not take actual traffic from your applications. You would basically run it the same way you run a mongod; it could even share a host/VM/instance with a mongod. You can't specify which mongos takes the balancer lock, but by being the only permanent process it should have the advantage.

Context

StackExchange Database Administrators Q#150290, answer score: 3
