patternsqlModerate
Linked Server Risks
Viewed 0 times
serverlinkedrisks
Problem
I'm implementing a new feature which requires data from databases on multiple servers. I just need to union data from all these servers and sort it. The two options that come to mind are:
-
Use linked servers and write a simple query to union and sort the
data which will run from one server and gather data from the others.
-
Use the application to gather the data from all servers, and send it
back to SQL Server to sort (don't want to implement the sort in the
application).
We run our servers in active/active clusters in SQL Server 2008 r2. All the databases have the same permissions, if you have access to one database/server, you have permission to them all. This is a public facing application (which requires user login).
What are the risks of using linked servers? Are there any security flaws I should be concerned with? Are there any issues running linked servers in active/active clusters? Would there be any significant performance issues compared to the alternative?
There seems to be a general negative "buzz" about linked servers, but I cannot find anything concrete that would lead me to believe there are any real concerns there.
-
Use linked servers and write a simple query to union and sort the
data which will run from one server and gather data from the others.
-
Use the application to gather the data from all servers, and send it
back to SQL Server to sort (don't want to implement the sort in the
application).
We run our servers in active/active clusters in SQL Server 2008 r2. All the databases have the same permissions, if you have access to one database/server, you have permission to them all. This is a public facing application (which requires user login).
What are the risks of using linked servers? Are there any security flaws I should be concerned with? Are there any issues running linked servers in active/active clusters? Would there be any significant performance issues compared to the alternative?
There seems to be a general negative "buzz" about linked servers, but I cannot find anything concrete that would lead me to believe there are any real concerns there.
Solution
Linked servers can work very well as long as you have thought out the implications:
-
Security: a key consideration is that if you have linked servers, if one gets compromised they are all at significant risk. Even if you have different credentials for each user different servers (which would stop an attacker getting at other resources if the only attack vector was leaked/discovered/guessed credentials) the link can effectively bypass all that. The link will also bypass protections that are hiding the other databases from the public network, such as a circumstance where one or more of the servers are not supplying data to a public interface so would not normally be visible through your firewalls by any means. You might think "well, isn't that same risk an issue with replication?" to which the answer is yes, but replication is between individual application databases and the linked server route could possible compromise other databases on the same server(s) as the link is at server level not DB level (of course you may be able to mitigate this risk by careful control of user access rights, but you at least need to be aware of it in your planning). As a side note on security: if the servers are not on the same site make sure that you use some form of VPN to link them over, rather than making SQL Server available on a public interface.
-
Bandwidth: If all the servers are in the same DC with nice, fast, unmetered connectivity between themselves then you might not need to worry about this one, but be more careful with more distant connections especially if your users will be able to run ad-hoc queries of some variety. Compression at the VPN link level will help greatly here for most data sets, but be aware that this will be at the expense of greater latency which could exacerbate the efficiency issue (see below).
-
Efficiency: If you are simply pulling chunks of data down the line then this is not a massive issue (but consider locking: see my next point), but as soon as you do anything by way of joins and so forth there are limits to what the query planner can do to optimise your requests. If it needs to make many index seeks that will create very slow-running queries if the servers are not local to each other due to network latency (the same problem is definitely present for local servers too, but to a lesser extent of course), and it may instead use an index scan (trading off bandwidth use to gain latency benefits) eating bandwidth and if it is holding locks (to avoid dirty read issues and so forth) this will affect other parts of the application too.
-
Locking/Concurrency: Going off-server will increase the run-time of queries, which will exacerbate locking issues you may not yet know you have and thereby severely reducing your application's concurrency and scalability. You need to be very careful if using regular and/or long-running cross server queries that you keep an eye on the locking issue and give planner hints as appropriate.
As long as you have sufficient provisions in place to manage the security and performance issues, I would not see a problem with using linked servers, though there may be better/safer/more-reliable/easier-to-secure ways to achieve the same result.
-
Security: a key consideration is that if you have linked servers, if one gets compromised they are all at significant risk. Even if you have different credentials for each user different servers (which would stop an attacker getting at other resources if the only attack vector was leaked/discovered/guessed credentials) the link can effectively bypass all that. The link will also bypass protections that are hiding the other databases from the public network, such as a circumstance where one or more of the servers are not supplying data to a public interface so would not normally be visible through your firewalls by any means. You might think "well, isn't that same risk an issue with replication?" to which the answer is yes, but replication is between individual application databases and the linked server route could possible compromise other databases on the same server(s) as the link is at server level not DB level (of course you may be able to mitigate this risk by careful control of user access rights, but you at least need to be aware of it in your planning). As a side note on security: if the servers are not on the same site make sure that you use some form of VPN to link them over, rather than making SQL Server available on a public interface.
-
Bandwidth: If all the servers are in the same DC with nice, fast, unmetered connectivity between themselves then you might not need to worry about this one, but be more careful with more distant connections especially if your users will be able to run ad-hoc queries of some variety. Compression at the VPN link level will help greatly here for most data sets, but be aware that this will be at the expense of greater latency which could exacerbate the efficiency issue (see below).
-
Efficiency: If you are simply pulling chunks of data down the line then this is not a massive issue (but consider locking: see my next point), but as soon as you do anything by way of joins and so forth there are limits to what the query planner can do to optimise your requests. If it needs to make many index seeks that will create very slow-running queries if the servers are not local to each other due to network latency (the same problem is definitely present for local servers too, but to a lesser extent of course), and it may instead use an index scan (trading off bandwidth use to gain latency benefits) eating bandwidth and if it is holding locks (to avoid dirty read issues and so forth) this will affect other parts of the application too.
-
Locking/Concurrency: Going off-server will increase the run-time of queries, which will exacerbate locking issues you may not yet know you have and thereby severely reducing your application's concurrency and scalability. You need to be very careful if using regular and/or long-running cross server queries that you keep an eye on the locking issue and give planner hints as appropriate.
As long as you have sufficient provisions in place to manage the security and performance issues, I would not see a problem with using linked servers, though there may be better/safer/more-reliable/easier-to-secure ways to achieve the same result.
Context
StackExchange Database Administrators Q#40184, answer score: 14
Revisions (0)
No revisions yet.