HiveBrain v1.2.0
Get Started
← Back to all entries
principlesqlMinor

Multiple SELECT vs single SELECT over TCIP

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
singlemultipletcipselectover

Problem

My question is a performance one:

I need to select roughly 400+ rows, i know their ids.

currently i loop in code and fire off multiple queries over tcp/ip - because its easy:

...
select [columns] where id = 1544;
select [columns] where id = 4765;
select [columns] where id = 3544;
select [columns] where id = 4090;
select [columns] where id = 5060;
...


Would it be more efficient on SQL server to fire off a single query for a date range instead (the rows I'm after are date-sequential, but we currently don't have a date field in that table), keeping in mind the overhead of multiple transactions vs a single transaction:

select [columns] where date between x AND y;

Solution

With SQL server (and, I suspect, pretty much everything else) querying for a range will be more efficient than querying for many individual rows, even when working locally. The range query will likely result in an index seek followed by a short scan through the few subsequent the index pages (followed by any other page lookups needed), then many single selects will result in many individual index seeks (followed by any other page lookups needed).

The "where in ()" option mentioned by user503207 may perform similarly to the date based range query in SQL Server, try it and see what query plan results, though in some DBMSs (mySQL for instance, unless that has improved in that area since I last used it) such queries can be quite inefficient (sometimes resulting in a full index scan).

As your IDs are in sequential order by date and you know which ones you are needing (or you wouldn't be able to put together the list of individual selects) then you could instead sort them when producing your query to find the range and use where id between and which would also avoid needing to add the date column.

When considering doing the same remotely the query performance difference will be the same if all the selects are sent as a single batch, though you will see a little more bandwidth used because transferring many small resultselts is less efficient than a single larger one containing the same rows. If you are submitting the selects individually then you will see a significant slowdown as you are adding network latency to each individual query instead of to the batch as a whole.

Context

StackExchange Database Administrators Q#67934, answer score: 2

Revisions (0)

No revisions yet.