HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

WHERE NOT IN SQL Performance

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
sqlwherenotperformance

Problem

SELECT TOP 1 l.ID 
FROM Leads l
WHERE 
     l.ID NOT IN (SELECT LeadID FROM LeadFollowups) 
     AND l.ID NOT IN (SELECT LeadID FROM LeadsWorking)
     AND l.ID NOT IN (SELECT LeadID FROM LeadsDead)
     AND l.ID NOT IN (SELECT LeadID FROM LeadHolds)
     AND l.ID >= (RAND() * (SELECT MAX(ID) FROM Leads))
ORDER BY l.QualityScore DESC


I have a table growing by a 10 to 100 thousand rows a day. I have several tables to track actions of the current leads that shouldn't be included when grabbing the new lead. My query is pretty quick now, but as this grows is NOT IN the most efficient solution for this? Especially if I get in to the range of ~10 million rows in the Leads table?

Solution

My query is pretty quick now, but as this grows is NOT IN the most efficient solution for this?

I expect it would be quicker if you had an indexed Status field in the Leads table (so that records in the LeadsDead table had a corresponding Status=Dead in the Leads table); and quicker still if Status were a clustered index, so that all the leads with a given Status were contiguous in the Leads table.

Context

StackExchange Code Review Q#42781, answer score: 6

Revisions (0)

No revisions yet.