
Mongo Scaling - Too Many Connections - Linear Horizontal Scaling with 100s of App Servers


Problem

I have a Mongo cluster. It is sharded across 6 servers, with 6 replica sets, each replica set having one primary and two secondaries. Additionally, there are 3 config servers. I think the cluster configuration is irrelevant to this question, though.

I have many application servers (currently 100), and I intend to add many more as my volume grows. Each application server runs a mongos process and 8 HTTP servers (node.js), each bound to its own port, with HAProxy as a reverse proxy routing traffic from port 80. Each HTTP server connects to the local mongos process using connection pooling with a poolSize of 3. Pretty simple so far.

The issue is that the number of connections seems to quite simply be:

(number of servers) * (number of http processes) * (poolSize)


In my current case that would be:

100 * 8 * 3 = 2400 total connections


So it would seem that the number of connections scales linearly with the number of servers I'm running. If I scale up 4x, the number of connections goes to 9,600, which is dangerously close to the 10,000 limit (and seems unnecessarily wasteful). What about a 10x or 100x scale-up?
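To make the linear growth concrete, here is a minimal sketch of the formula above. The function name and the chosen scale factors are illustrative only; the base figures (100 servers, 8 HTTP processes, poolSize 3) come from the question.

```python
def total_connections(app_servers: int, http_processes: int, pool_size: int) -> int:
    """Connections opened to the mongos processes across the whole fleet."""
    return app_servers * http_processes * pool_size


BASE_SERVERS = 100   # current number of application servers
HTTP_PROCESSES = 8   # node.js HTTP servers per application server
POOL_SIZE = 3        # driver poolSize per HTTP server

# Illustrative scale factors; the count grows in direct proportion to the fleet.
for factor in (1, 4, 5, 10, 100):
    servers = BASE_SERVERS * factor
    count = total_connections(servers, HTTP_PROCESSES, POOL_SIZE)
    print(f"{factor}x ({servers} servers): {count} connections")
```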

It is stated that "any connection count greater than 1000 – 1500 connections should raise an eyebrow". How can I possibly maintain that number of connections with 100s of application servers? Is a mongos process per application server at this size still appropriate?

Solution

The hard connection limit was actually 20,000, but it was removed in version 2.6; you are now limited only by memory or open file descriptors, whichever you run out of first. That said, even if you can manage connections running into the thousands, you will be using a lot of memory (1 MB per connection for the stack).
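As a rough sketch of what that memory cost looks like, the estimate below multiplies a connection count by the 1 MB per-connection stack figure quoted above. The function name and the sample connection counts are assumptions for illustration, and the result covers stack memory only, not any other per-connection overhead.

```python
STACK_MB_PER_CONNECTION = 1.0  # figure quoted in the answer; treat it as an approximation


def stack_memory_mb(connections: int,
                    mb_per_connection: float = STACK_MB_PER_CONNECTION) -> float:
    """Approximate stack memory, in MB, consumed by the given number of connections."""
    return connections * mb_per_connection


# Sample counts: the current fleet, the limit cited in the question,
# and the old hard limit mentioned in the answer.
for connections in (2400, 10000, 20000):
    mb = stack_memory_mb(connections)
    print(f"{connections} connections: roughly {mb / 1024:.1f} GiB of stack")
```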

At that point you start to evaluate other options for mongos deployment. You can have a dedicated set of mongos hosts, take advantage of some of the nice HA options built into the drivers, and pass in a list of possible mongos processes to use. If you are concerned about latency, you could potentially co-locate (some of) the mongos processes on your shard hosts.
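One way to hand a driver a list of mongos processes is the standard MongoDB connection string, which accepts several comma-separated hosts. The sketch below only builds such a string; the hostnames, the database name, and the helper function are hypothetical, and how a given driver balances or fails over between the listed hosts should be checked against that driver's documentation.

```python
from typing import Sequence


def mongos_uri(hosts: Sequence[str], database: str) -> str:
    """Build a connection string that lists several mongos hosts as seeds."""
    if not hosts:
        raise ValueError("at least one mongos host is required")
    return f"mongodb://{','.join(hosts)}/{database}"


# Hypothetical dedicated mongos tier; replace with real hostnames and ports.
MONGOS_HOSTS = [
    "mongos1.example.internal:27017",
    "mongos2.example.internal:27017",
    "mongos3.example.internal:27017",
]

uri = mongos_uri(MONGOS_HOSTS, "appdb")  # "appdb" is a placeholder database name
print(uri)
```

The resulting string would then be passed to the driver's client constructor in place of the single local mongos address.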

Basically, you have reached the number of nodes where you are into more advanced design decisions, and what is right for you now, and in the future, really depends on several factors. If you have some money to throw at it, this is perfect fodder for the type of consulting MongoDB offers (disclosure: I was once one of the people delivering such consulting). If not, you are asking the right questions, so keep an eye on resource utilisation and see what makes sense for your deployment as you grow.

Context

StackExchange Database Administrators Q#115127, answer score: 3
