patternMajor
What is more efficient, a where clause or a join with million plus row tables?
Viewed 0 times
milliontableswhatwithwheremoreefficientjoinplusrow
Problem
We run a website that has 250MM rows in one table and in another table that we join it to for most queries has just under 15MM rows.
Sample structures:
We regularly have to do a few queries against all these tables. One is grabbing statistics for free users (~10k free users).
Problem is this query will some times run a long damn time due to the fact that the joins happens long before the where.
In this case would it be wiser to use wheres instead of joins or possibly
Sample structures:
MasterTable (Id, UserId, Created, Updated...) -- 15MM Rows
DetailsTable (Id, MasterId, SomeColumn...) -- 250MM Rows
UserTable (Id, Role, Created, UserName...) -- 12K RowsWe regularly have to do a few queries against all these tables. One is grabbing statistics for free users (~10k free users).
Select Count(1) from DetailsTable dt
join MasterTable mt on mt.Id = dt.MasterId
join UserTable ut on ut.Id = mt.UserId
where ut.Role is null and mt.created between @date1 and @date2Problem is this query will some times run a long damn time due to the fact that the joins happens long before the where.
In this case would it be wiser to use wheres instead of joins or possibly
where column in(...)?Solution
For modern RDBMS there is no difference between "explicit JOIN" and "JOIN-in-the-WHERE" (if all JOINS are INNER) regards performance and query plan.
The explicit JOIN syntax is clearer and less ambiguous (see links below)
Now, the JOIN-before-WHERE is logical processing not actual processing and the modern optimisers are clever enough to realise this.
Your problem here is most likely indexing.
Please show us all indexes and keys on these tables. And the query plans
Note: this question would have been close on StackOverflow for being a duplicate by now... COUNT(1) vs COUNT(*) is another busted myth too.
The explicit JOIN syntax is clearer and less ambiguous (see links below)
Now, the JOIN-before-WHERE is logical processing not actual processing and the modern optimisers are clever enough to realise this.
Your problem here is most likely indexing.
Please show us all indexes and keys on these tables. And the query plans
Note: this question would have been close on StackOverflow for being a duplicate by now... COUNT(1) vs COUNT(*) is another busted myth too.
- https://stackoverflow.com/questions/5654278/sql-join-is-there-a-difference-between-using-on-or-where/5654338#5654338
- https://stackoverflow.com/questions/3003457/count-vs-countcolumn-name-which-is-more-correct/3003533#3003533
- https://stackoverflow.com/questions/1221559/count-vs-count1/1221649#1221649
Context
StackExchange Database Administrators Q#3480, answer score: 24
Revisions (0)
No revisions yet.