patternsqlMinor
Grouping counts from tables by account ID
Problem
I'm working with this query to get counts from three different tables, and to group the results by account ID:
SELECT accountId , SUM(ApplicantsCount) as ApplicantsCount, SUM(ApprovedCount) as ApprovedCount, SUM(ScreenedCount) as ScreenedCount
FROM (
SELECT application.accountId, COUNT(*) ApplicantsCount, 0 ApprovedCount, 0 ScreenedCount
FROM application
[accountIdCondition]
GROUP BY application.accountId
UNION
SELECT application.accountId, 0, COUNT(*), 0
FROM application
JOIN termsofapproval ON application.accountId = termsofapproval.accountId
JOIN approvaluserjoin ON termsofapproval.id = approvaluserjoin.termsofapprovalId
[accountIdCondition]
GROUP BY application.accountId
UNION
SELECT application.accountId, 0, 0, COUNT(*)
FROM application
JOIN screened
ON application.id = screened.applicationId
[accountIdCondition]
GROUP BY application.accountId
) CountsTable
GROUP BY CountsTable.accountId
However, it runs too slowly, and times out the task handler I'm calling it in. Is there a way I can write this to run faster?
Solution
I took the liberty of putting this into an SQLFiddle.
If I play with the query and run the screened and approved subqueries on the data I chose, I see a condition that may or may not be a bug. Consider my data, where several applications share the same accountId, and your subquery:
SELECT application.accountId, 0, COUNT(*), 0
FROM application
JOIN termsofapproval ON application.accountId = termsofapproval.accountId
JOIN approvaluserjoin ON termsofapproval.id = approvaluserjoin.termsofapprovalId
[accountIdCondition]
GROUP BY application.accountId
That query will duplicate the count of approved users if there are multiple applications for the same accountId. In the SQLFiddle I have, it returns a count of 4 for only 2 distinct approvaluserjoin records.
It is likely that in your data it is not possible to get that condition though... right?
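The duplication can be reproduced with a tiny data set. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for your database, and the schema is a guess containing just the columns the query touches. One account has two applications, one termsofapproval row and two approvaluserjoin rows.

```python
import sqlite3

# Hypothetical minimal schema: only the columns the subquery touches.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE application      (id INTEGER PRIMARY KEY, accountId INTEGER);
    CREATE TABLE termsofapproval  (id INTEGER PRIMARY KEY, accountId INTEGER);
    CREATE TABLE approvaluserjoin (id INTEGER PRIMARY KEY, termsofapprovalId INTEGER);

    -- Account 1: two applications, one terms record, two approved users.
    INSERT INTO application      VALUES (1, 1), (2, 1);
    INSERT INTO termsofapproval  VALUES (10, 1);
    INSERT INTO approvaluserjoin VALUES (100, 10), (101, 10);
""")

# Same joins as the approved-count subquery, with a distinct count alongside.
row = conn.execute("""
    SELECT application.accountId,
           COUNT(*)                            AS joined_rows,
           COUNT(DISTINCT approvaluserjoin.id) AS distinct_approved
    FROM application
    JOIN termsofapproval  ON application.accountId = termsofapproval.accountId
    JOIN approvaluserjoin ON termsofapproval.id = approvaluserjoin.termsofapprovalId
    GROUP BY application.accountId
""").fetchone()

print(row)  # (1, 4, 2)
```

Each of the two applications matches the same termsofapproval row, so both approvaluserjoin rows appear twice and COUNT(*) reports 4.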
Regardless, I believe the more logical representation of your query is as follows (I also have it in an SQLFiddle):
SELECT application.accountId ,
count(distinct application.id) as ApplicantsCount,
count(distinct approvaluserjoin.id) as ApprovedCount,
count(distinct screened.id) as ScreenedCount
FROM application
left join termsofapproval on application.accountId = termsofapproval.accountId
left join approvaluserjoin on approvaluserjoin.termsofapprovalId = termsofapproval.id
left join screened on screened.applicationId = application.id
where application.accountId <= 2
group by application.accountId
Note how there is only one query, built with left outer joins. Also note that count ignores null values, so the nulls in the outer-join results do not contribute to the counts. The count(distinct ...) construct allows you to count the things you are interested in, even if the query returns them in multiple contexts.
You will need to carefully understand the query: the implications are different from yours, and it may be more accurate than what you have (or less accurate).
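To check how the left joins and count(distinct ...) behave together, here is a small sketch, again using Python's sqlite3 module as a stand-in engine with an assumed minimal schema and made-up rows. Account 1 has two applications, two approved users and one screening. Account 2 has a single application and nothing else.

```python
import sqlite3

# Hypothetical minimal schema and sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE application      (id INTEGER PRIMARY KEY, accountId INTEGER);
    CREATE TABLE termsofapproval  (id INTEGER PRIMARY KEY, accountId INTEGER);
    CREATE TABLE approvaluserjoin (id INTEGER PRIMARY KEY, termsofapprovalId INTEGER);
    CREATE TABLE screened         (id INTEGER PRIMARY KEY, applicationId INTEGER);

    INSERT INTO application      VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO termsofapproval  VALUES (10, 1);
    INSERT INTO approvaluserjoin VALUES (100, 10), (101, 10);
    INSERT INTO screened         VALUES (1000, 1);
""")

# The single-query form: left joins keep accounts with no matches,
# and COUNT(DISTINCT ...) skips the NULLs those joins produce.
rows = conn.execute("""
    SELECT application.accountId,
           COUNT(DISTINCT application.id)      AS ApplicantsCount,
           COUNT(DISTINCT approvaluserjoin.id) AS ApprovedCount,
           COUNT(DISTINCT screened.id)         AS ScreenedCount
    FROM application
    LEFT JOIN termsofapproval  ON application.accountId = termsofapproval.accountId
    LEFT JOIN approvaluserjoin ON approvaluserjoin.termsofapprovalId = termsofapproval.id
    LEFT JOIN screened         ON screened.applicationId = application.id
    GROUP BY application.accountId
    ORDER BY application.accountId
""").fetchall()

print(rows)  # [(1, 2, 2, 1), (2, 1, 0, 0)]
```

Account 2 still appears with zero approved and screened counts, and account 1 reports 2 approved users even though the joins produce four rows for it.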
Code Snippets
SELECT application.accountId, 0, COUNT(*), 0
FROM application
JOIN termsofapproval ON application.accountId = termsofapproval.accountId
JOIN approvaluserjoin ON termsofapproval.id = approvaluserjoin.termsofapprovalId
[accountIdCondition]
GROUP BY application.accountId

SELECT application.accountId ,
count(distinct application.id) as ApplicantsCount,
count(distinct approvaluserjoin.id) as ApprovedCount,
count(distinct screened.id) as ScreenedCount
FROM application
left join termsofapproval on application.accountId = termsofapproval.accountId
left join approvaluserjoin on approvaluserjoin.termsofapprovalId = termsofapproval.id
left join screened on screened.applicationId = application.id
where application.accountId <= 2
group by application.accountId
Context
StackExchange Code Review Q#67729, answer score: 3