patternsqlMinor
Are database connections from an application expensive from a performance point of view?
Viewed 0 times
applicationpointareviewdatabaseperformancefromexpensiveconnections
Problem
A colleague tells me that connections to a database are expensive and could harm performance of the application. They ask me to create a procedure that calls say another 8 procedures, that way they call one procedure and get back 8 datasets. From my viewpoint SQL is having to wait for each dataset to be returned before it can start to retrieve the next one so that will have a negative effect too.
Can anybody quantify the cost of the connection from an application and let me know whether the best approach is to call multiple procedures or one master procedure?
Can anybody quantify the cost of the connection from an application and let me know whether the best approach is to call multiple procedures or one master procedure?
Solution
Well, there are four main aspects to this:
-
Why are you asking the general Interwebs for an opinion when there are three other sources that should be checked first, one of which will be definitive and not an opinion:
-
When someone tells you a technical / scientific piece of information, whenever possible, ask them how they know that to be the case. Did they read it somewhere? If so, was it a reputable source or was it mentioned in passing in a conversation they had several years ago that they were only partially paying attention to and also never asked that person how they happened to know this information. If it was an article / blog post, ask to see it so you can read it for yourself. If they did some testing, ask to see the test scripts so you can run them yourself.
The point is, a lot of people say a lot of things, sometimes being honestly convinced of the validity of information that is absolutely not true. So it is good of you to not just take their word for it, but whoever conveyed the information should be asked why they believe it to be true. Their answer could prove to be very insightful on a few levels :-).
-
Whether or not you are able to get an answer from the person making technical claims, do your own research. Do some searches to see if technical articles on this topic have already been written. Simply searching on "database connection performance" returns quite a few articles. Just be wary of random blog posts as people sometimes post information that they believe to be true but in fact isn't. And that brings us to....
-
The absolute best source of information for technical matters, especially with regards to "is approach A faster or slower than approach B" types of questions, is the system itself. Who cares what your colleague says, or even anyone else, no matter how well respected they may be, when your system will behave the way it is going to behave and it cannot be argued with. If someone says calling 8 procs in one call to the database is faster than 8 connections, well, try both and see which is faster. Of course, you may or may not be doing things correctly, but once you have a test set up and some results, then you can present those to see if somehow you missed something. And you can check against articles that have been written to see if their is a trick or option that you missed.
-
Generally speaking, yes, establishing new connections is expensive, which is why Connection Pooling exists. It keeps the connection open so that it can be re-used by the same client. Making a whole new connection requires allocating memory, checking security, which might involve making a request to yet another server if using Windows Authentication in an environment that uses Active Directory (since the auth info, if not yet cached, resides on the Domain Controller). All of this overhead is skipped when using Connection Pooling. If you are using .NET and connecting to Microsoft SQL Server, then Connection Pooling should be enabled by default. But you should check to make sure that it is used being.
Regardless of whether or not Connection Pooling is being used, are multiple calls in a single query batch faster than individual calls made via separate query batches? Well, if the issue is one of closing and re-opening the connection being expensive, then why close the connection? Why not just make 8 separate stored procedure calls on the same connection? At that point the only performance difference is the time it takes to make the 7 additional stored procedure requests. And that would be so minimal as to not even be worth thinking about unless this group of 8 stored procedures is being called many times per second. And that brings us too...
-
Most often, performance is a function of scale. There are many inefficient operations that do not show their inefficiency when only being called once or twice per day, or once or twice per month. But perform those operations multiple times per minute, or even second, and that same poor choice can have a drastic effect on performance. There are many bad choices people make when choosing datatypes for new fields (whether new table or just a new column on an existing table). They say, "disk is cheap so it doesn't matter than an INT is 3 bytes larger than a TINYINT and this field will only ever hold values of 1 - 10". An extra 3 bytes for 100 or even 1000 rows will never (or maybe I should just say "almost never" ;-) degrade performance. But that same field, if a PK, copied into 100+ tables as a Foreign Key, and those 100+ tables having a combined row count in the hundreds of millions range, will have an impact, and not just on query performance, but also backup and restore times, etc.
So, how often will these 8 stored procedures be called? If only a few times per hour or less, then you probably won't see much of a performance difference whether or not you call all 8 in one shot, or make 8 calls on the same connection, or make 8 calls on separate connect
-
Why are you asking the general Interwebs for an opinion when there are three other sources that should be checked first, one of which will be definitive and not an opinion:
-
When someone tells you a technical / scientific piece of information, whenever possible, ask them how they know that to be the case. Did they read it somewhere? If so, was it a reputable source or was it mentioned in passing in a conversation they had several years ago that they were only partially paying attention to and also never asked that person how they happened to know this information. If it was an article / blog post, ask to see it so you can read it for yourself. If they did some testing, ask to see the test scripts so you can run them yourself.
The point is, a lot of people say a lot of things, sometimes being honestly convinced of the validity of information that is absolutely not true. So it is good of you to not just take their word for it, but whoever conveyed the information should be asked why they believe it to be true. Their answer could prove to be very insightful on a few levels :-).
-
Whether or not you are able to get an answer from the person making technical claims, do your own research. Do some searches to see if technical articles on this topic have already been written. Simply searching on "database connection performance" returns quite a few articles. Just be wary of random blog posts as people sometimes post information that they believe to be true but in fact isn't. And that brings us to....
-
The absolute best source of information for technical matters, especially with regards to "is approach A faster or slower than approach B" types of questions, is the system itself. Who cares what your colleague says, or even anyone else, no matter how well respected they may be, when your system will behave the way it is going to behave and it cannot be argued with. If someone says calling 8 procs in one call to the database is faster than 8 connections, well, try both and see which is faster. Of course, you may or may not be doing things correctly, but once you have a test set up and some results, then you can present those to see if somehow you missed something. And you can check against articles that have been written to see if their is a trick or option that you missed.
-
Generally speaking, yes, establishing new connections is expensive, which is why Connection Pooling exists. It keeps the connection open so that it can be re-used by the same client. Making a whole new connection requires allocating memory, checking security, which might involve making a request to yet another server if using Windows Authentication in an environment that uses Active Directory (since the auth info, if not yet cached, resides on the Domain Controller). All of this overhead is skipped when using Connection Pooling. If you are using .NET and connecting to Microsoft SQL Server, then Connection Pooling should be enabled by default. But you should check to make sure that it is used being.
Regardless of whether or not Connection Pooling is being used, are multiple calls in a single query batch faster than individual calls made via separate query batches? Well, if the issue is one of closing and re-opening the connection being expensive, then why close the connection? Why not just make 8 separate stored procedure calls on the same connection? At that point the only performance difference is the time it takes to make the 7 additional stored procedure requests. And that would be so minimal as to not even be worth thinking about unless this group of 8 stored procedures is being called many times per second. And that brings us too...
-
Most often, performance is a function of scale. There are many inefficient operations that do not show their inefficiency when only being called once or twice per day, or once or twice per month. But perform those operations multiple times per minute, or even second, and that same poor choice can have a drastic effect on performance. There are many bad choices people make when choosing datatypes for new fields (whether new table or just a new column on an existing table). They say, "disk is cheap so it doesn't matter than an INT is 3 bytes larger than a TINYINT and this field will only ever hold values of 1 - 10". An extra 3 bytes for 100 or even 1000 rows will never (or maybe I should just say "almost never" ;-) degrade performance. But that same field, if a PK, copied into 100+ tables as a Foreign Key, and those 100+ tables having a combined row count in the hundreds of millions range, will have an impact, and not just on query performance, but also backup and restore times, etc.
So, how often will these 8 stored procedures be called? If only a few times per hour or less, then you probably won't see much of a performance difference whether or not you call all 8 in one shot, or make 8 calls on the same connection, or make 8 calls on separate connect
Context
StackExchange Database Administrators Q#122439, answer score: 5
Revisions (0)
No revisions yet.