patternsqlMinor
Tag badges per user SEDE query
Viewed 0 times
pertaguserquerybadgessede
Problem
I wanted to see where I lined up with other users on tag badges (I wanted to count just bronze but figured I would start with all of them to start out with).
Here is the query that I ended up with:
What can be done to make this a better query?
Here is the query that I ended up with:
SELECT
Users.Id as [User Link]
, Users.DisplayName
, COUNT (Badges.Id) AS Number_of_Tag_Badges
FROM
Badges
INNER JOIN Tags ON Tags.TagName = Badges.Name
INNER JOIN Users ON Users.Id = Badges.UserId
GROUP BY Users.DisplayName, Users.Id
ORDER BY Number_of_Tag_Badges DESCWhat can be done to make this a better query?
Solution
Implementation
The query is quite well written for what it does. I just have a few minor nitpicks about its implementation.
The
You're getting the disadvantages of the comma-first convention without reaping the benefits. The advantage of putting commas first is that it makes it foolproof to add or remove columns in the
Remove the space before the parenthesis in the
The joins would be better if the tables were listed linearly:
$$ \fbox{Users} \overset{\texttt{UserId}}{\longleftrightarrow}
\fbox{Badges} \overset{\texttt{Name}}{\longleftrightarrow} \fbox{Tags}$$
Since the displayed results are user-centric, I would suggest listing the tables as
Concept
It would be handy if you added a rank column.
The query doesn't distinguish between gold, silver, and bronze badges. You should be able to report the breakdown. The convention used by the International Olympic Committee is to sort by the gold medal count first, rather than by the total. Perhaps you should do the same.
The query is quite well written for what it does. I just have a few minor nitpicks about its implementation.
The
FROM, GROUP BY, and ORDER BY clauses are all subsidiary to the SELECT clause. Therefore, I would indent them by one level. (Indenting your INNER JOIN clauses so that they are part of the FROM clause was good.)You're getting the disadvantages of the comma-first convention without reaping the benefits. The advantage of putting commas first is that it makes it foolproof to add or remove columns in the
SELECT: you should be able to delete an entire line of text without having to worry about fixing commas on neighbouring lines. However, to take advantage of that property, you should put the first column on the same line as the SELECT keyword:SELECT Users.Id as [User Link]
, Users.DisplayName
, COUNT (Badges.Id) AS Number_of_Tag_Badges
FROM …Remove the space before the parenthesis in the
COUNT (…) call.The joins would be better if the tables were listed linearly:
$$ \fbox{Users} \overset{\texttt{UserId}}{\longleftrightarrow}
\fbox{Badges} \overset{\texttt{Name}}{\longleftrightarrow} \fbox{Tags}$$
Since the displayed results are user-centric, I would suggest listing the tables as
Users, Badges, Tags rather than any other order.Concept
It would be handy if you added a rank column.
The query doesn't distinguish between gold, silver, and bronze badges. You should be able to report the breakdown. The convention used by the International Olympic Committee is to sort by the gold medal count first, rather than by the total. Perhaps you should do the same.
WITH EarnedTagBadges AS (
SELECT Users.DisplayName
, Badges.UserId
, Badges.Name
, RANK() OVER (PARTITION BY Users.Id, Badges.Name ORDER BY Date) AS Level
FROM Users
INNER JOIN Badges ON Users.Id = Badges.UserId
INNER JOIN Tags ON Badges.Name = Tags.TagName
), UserTagBadges AS (
SELECT UserId
, DisplayName
, COUNT(CASE WHEN Level = 3 THEN 1 END) AS Gold
, COUNT(CASE WHEN Level = 2 THEN 1 END) AS Silver
, COUNT(CASE WHEN Level = 1 THEN 1 END) AS Bronze
, COUNT(Level) AS Total
FROM EarnedTagBadges
GROUP BY UserId, DisplayName
)
SELECT RANK() OVER (ORDER BY Gold DESC, Silver DESC, Bronze DESC) AS Rank
, UserId AS [User Link]
, DisplayName
, Gold
, Silver
, Bronze
, Total
FROM UserTagBadges
ORDER BY Gold DESC, Silver DESC, Bronze DESC;Code Snippets
SELECT Users.Id as [User Link]
, Users.DisplayName
, COUNT (Badges.Id) AS Number_of_Tag_Badges
FROM …WITH EarnedTagBadges AS (
SELECT Users.DisplayName
, Badges.UserId
, Badges.Name
, RANK() OVER (PARTITION BY Users.Id, Badges.Name ORDER BY Date) AS Level
FROM Users
INNER JOIN Badges ON Users.Id = Badges.UserId
INNER JOIN Tags ON Badges.Name = Tags.TagName
), UserTagBadges AS (
SELECT UserId
, DisplayName
, COUNT(CASE WHEN Level = 3 THEN 1 END) AS Gold
, COUNT(CASE WHEN Level = 2 THEN 1 END) AS Silver
, COUNT(CASE WHEN Level = 1 THEN 1 END) AS Bronze
, COUNT(Level) AS Total
FROM EarnedTagBadges
GROUP BY UserId, DisplayName
)
SELECT RANK() OVER (ORDER BY Gold DESC, Silver DESC, Bronze DESC) AS Rank
, UserId AS [User Link]
, DisplayName
, Gold
, Silver
, Bronze
, Total
FROM UserTagBadges
ORDER BY Gold DESC, Silver DESC, Bronze DESC;Context
StackExchange Code Review Q#73483, answer score: 8
Revisions (0)
No revisions yet.