HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

More than one answer by the same user

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
sametheansweruserthanmoreone

Problem

I hacked together this little query on the Data Explorer

The query finds questions where an answerer has posted two or more answers (which in itself is not necessarily a bad thing), it gives you the link to the question, information about which user posted the multiple answers, and how many answers by that user that was posted.

select question.Id as [Post Link], 
   answer.owneruserid as [User Link], 
   count(answer.id) as answers
from Posts question
join Posts answer on (answer.parentid = question.id)
where answer.OwnerUserId != 0
group by answer.OwnerUserId,
  question.id, answer.owneruserid
having count(answer.id) >= 2
order by answers desc


The last two group by clauses seem quite useless to me in a way, as they will always only be one unique version of them after grouping by answer.OwnerUserId. My experience from mySQL is that it simply picks a seemingly random value if not grouping by it, but in the SEDE I get this result:


Column 'Posts.Id' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

  • Is there a way to remove these extra GROUP BY's? Or is adding these "the way it's supposed to be done"?



  • How is this SQL overall, regarding performance, naming, structure, conventions and everything?



  • Are there any alternative ways to rewrite this query?

Solution

You don't need to perform any join at all.

SELECT ParentId AS [Post Link]
     , OwnerUserId AS [User Link]
     , COUNT(Id) AS Answers
    FROM Posts
    WHERE
        ParentId IS NOT NULL           -- ← Answers only
        AND OwnerUserId <> 0
    GROUP BY ParentId, OwnerUserId
    HAVING Count(id) > 1
    ORDER BY Answers DESC;


Your instinct probably tells you that you need a JOIN to obtain the question titles. However, in Stack Exchange Data Explorer, there is a special feature: provide a PostId in a column named [Post Link], and it will take care of the presentation for you (implicitly doing a JOIN).

Code Snippets

SELECT ParentId AS [Post Link]
     , OwnerUserId AS [User Link]
     , COUNT(Id) AS Answers
    FROM Posts
    WHERE
        ParentId IS NOT NULL           -- ← Answers only
        AND OwnerUserId <> 0
    GROUP BY ParentId, OwnerUserId
    HAVING Count(id) > 1
    ORDER BY Answers DESC;

Context

StackExchange Code Review Q#43235, answer score: 6

Revisions (0)

No revisions yet.