HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

Should a multi tenant system with SQL Server 2016, Shard or have Tenant isolation via separate database per tenant?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
isolationtenantmultiperwithsqlsystemshardseparatedatabase

Problem

Given the use case:

  • Tenant data should not cross talk, one tenant does not need another


tenant's data.

  • Each tenant could potentially have large historical data volume.



  • SQL Server is hosted in AWS EC2 instance.



  • Each tenant is geographically distant.



  • There is an intention to use third party visualization tools such as PowerBI Embedded



  • The data volume is expected to grow over time



  • The cost of the system is constrained.



  • The solution must be maintainable without a 24/7 production DBA



  • The solution should be able to scale horizontally.



  • Total number of tenants is less than 50



What would be a recommended architecture, are there any reference implementations for this use case? I believe many people might have already faced this problem for enterprise software development.

I think this is a different situation from Handling growing number of Tenants in Multi-tenant Database Architecture. The use case mentioned in that question deals with a higher number of tenants, which is very different from having very few (50) large tenants. The architecture mentioned might be a solution here, which is what I want to know more about.

Solution

The gotcha with sharding is that the application has to know which shard to query. Generally, this is done by sharding on something like client. I'll adapt one of my old blog posts to use as my answer.

When you’re building an application for lots of clients, there’s two common ways to design the database(s):

  • Option A: Put all clients in the same database



  • Option 2: Build one database per client



Putting All the Clients in the Same Database

It’s simple: just add a Client table at the top of the schema, add a ClientUsers table to make sure people only see their own data, and away we go.

Benefits of this approach:

Easier schema management. When developers deploy a new version of the application, they only have to make schema changes in one database. There’s no worries about different customers being out of sync or on the wrong version.

Easier performance tuning. We can check index usage and statistics in just one place, implement improvements easily, and see the effects immediately across all our clients. With hundreds or thousands of databases, even the smallest change can be difficult to coordinate. We can check our procedure cache contents and know for certain which queries or stored procedures are the most intensive across our entire application, whereas if we’re using separate databases per client, we may have a tougher time aggregating query use across different execution plans.

Easier to build an external API. If we need to grant access to our entire database for outsiders to build products, we can do that easier if all of the data is in a single database. If the API has to deal with grouping data from multiple databases on multiple servers, it adds development and testing time. (On the other hand, that “multiple servers” thing starts to hint at a restriction for the one-database-to-rule-them-all scenario: one database usually means all our load impacts just one database server.) In your case, with PowerBI, having everyone in one database will make managing connections much easier.

Easier high availability & disaster recovery. It’s really, really simple to manage database mirroring, log shipping, replication, and clustering if all we have to worry about is just one database. We can build a heck of an infrastructure quickly.

Putting Each Client in its Own Database or Shard

You still need a client listing, but now it becomes a directory - for each client, you also track the shard it lives in. On startup, your app queries this table, and caches it in RAM. When it needs data for a client, it connects directly to that shard (database & server).

Benefits of this approach:

Easier single-client restores. Clients are unreliable meatbags. (Except mine – they’re reliable meatbags.) They have all kinds of “oops” moments where they want to retrieve all of their data back to a point in time, and that’s a huge pain in the rear if their data is intermingled with other client data in the same tables. Restores in a single-client-database scenario are brain-dead easy: just restore the client’s database. No one else is affected.

Easier data exports. Clients love getting their hands on their data. They want the security of knowing they can get their data out anytime they want, avoiding the dreaded vendor lock-in scenario, and they want to do their own reporting. With each client’s data isolated into their own database, we can simply give them a copy of their own database backup. We don’t have to build data export APIs.

Easier multi-server scalability. When our application needs more power than we can get from a single server, we can divide up the databases between multiple servers. We can also spread out the load geographically, putting servers in Asia or Europe to be closer to clients.

Easier per-client performance tuning. If some clients use different features or reports, we can build a specialized set of indexes or indexed views just for those clients without growing everyone’s data size. Granted, there’s some risk here – by allowing schema differences between clients, we’ve just made our code deployments a little riskier and our performance management more difficult.

Easier security management. As long as we’ve properly locked down security with one user per database, we don’t have to worry about Client X accessing Client Y’s data. However, if we just use a single login for everyone, then we haven’t really addressed this concern.

Easier maintenance windows. In a global environment where customers are scattered around the globe, it’s easier to take customers offline for maintenance if we can do it in groups or zones.

Which one is right for you?

There’s no one right choice: you have to know your own company’s strengths and weaknesses. Let’s take two of my clients as examples.

Company A excels at hardware performance tuning. They’re really, really good at wringing the very last bit of performance out of hardware, and they don’t mind replacing their SQL Server hardware on a 12-18 m

Context

StackExchange Database Administrators Q#164043, answer score: 19

Revisions (0)

No revisions yet.