HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Find first questions with answers posted within 24 hours

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
withanswersquestionshourswithinfirstfindposted

Problem

So, sᴉɔuɐɹɥԀ suggested I write a SEDE query to find first questions with answers posted within 24 hours, and here it is:

WITH QData(OwnerId, CDate)
AS
(
  SELECT
  Q.OwnerUserId [OwnerId]
  ,MIN(Q.CreationDate) [CDate]
  FROM Posts Q
  WHERE Q.PostTypeId = 1 AND Q.OwnerUserId IS NOT NULL
  GROUP BY Q.OwnerUserId
),

FirstQs(QCreationDate, QPostId, QUserId)
AS
(
  SELECT
  CDate [QCreationDate]
  ,Q.Id [QPostId]
  ,OwnerId [QUserId]
  From QData JOIN Posts Q On (Q.CreationDate = CDate AND Q.OwnerUserId = OwnerId)
  WHERE Q.PostTypeId = 1
  GROUP BY OwnerId, Q.Id, CDate
),

FirstQsA(ACreationDate, APostId, AParentId)
AS
(
  SELECT
  MIN(p.CreationDate) [ACreationDate]
  ,MIN(p.Id) [APostId]
  ,p.ParentId [AParentId]
  FROM Posts p, FirstQs
  WHERE (p.PostTypeId = 2 AND p.ParentId = QPostId)
  GROUP BY p.ParentId
)

SELECT QUserId [User Link]
,MIN(QCreationDate) [QCreationDate]
,MIN(ACreationDate) [ACreationDate]
,QPostId[Post Link]
,APostId[Post Link]
FROM
  FirstQs JOIN FirstQsA ON QPostId = AParentId
  WHERE DATEDIFF(HOUR, QCreationDate, ACreationDate) <= 24
GROUP BY QUserId, QPostId, APostId
ORDER BY QUserId


The query can be found here: First Questions with Answers Posted Within 24 Hours.

Solution

I have to compliment you on how well your query reads. Your aliases are short enough not to be a PITA to reference, yet descriptive enough to not have to scratch your head about what they mean.

Your CTEs are well formatted, pretty easy to read and have good names. Overall, if I was a DBA and saw that in my code base, I'd think the person who wrote it put some work into making it good.

Well done.

Now for a few things to make it even better.

Magic Numbers

Here:

WHERE Q.PostTypeId = 1
-- and later
WHERE (p.PostTypeId = 2


1 and 2 are primary keys in the PostTypes table. This is fine for someone who knows the schema already, but for someone who sees this for the first time, they will have to query the table with those keys to know what they actually mean. So, I recommend using variables, like so:

DECLARE @QuestionPostType INT; 
SET @QuestionPostType = 1;
DECLARE @AnswerPostType INT; 
SET @AnswerPostType = 2;


Then just reference them by name in your query:

WHERE Q.PostTypeId = @QuestionPostType
-- and later
WHERE (p.PostTypeId = @AnswerPostType


Proliferation of CTEs

It's easy to get lost in data sets in SQL, and while yours works fine, I think it could be simplified a bit. Take for instance:

WITH QData(OwnerId, CDate)
AS
(
  SELECT
  Q.OwnerUserId [OwnerId]
  ,MIN(Q.CreationDate) [CDate]
  FROM Posts Q
  WHERE Q.PostTypeId = 1 AND Q.OwnerUserId IS NOT NULL
  GROUP BY Q.OwnerUserId
),


You have a whole CTE following to extract basically this:

(Q.CreationDate = CDate AND Q.OwnerUserId = OwnerId)


You could just adapt that in your first CTE and simplify it, kind of like this:

WITH QData(OwnerId, CDate)
AS
(
  SELECT
  Q.OwnerUserId [OwnerId]
  ,MIN(Q.CreationDate) [CDate]
  FROM Posts Q
  WHERE Q.PostTypeId = 1 AND Q.OwnerUserId IS NOT NULL
  AND (Q.CreationDate = CDate AND Q.OwnerUserId = OwnerId)
  GROUP BY Q.OwnerUserId
),


Take advantage of joins

Many think of joins as basically just linking keys between one table and another. But, you can use joins to filter data, and keep related conditions/filters together in one join, rather than splitting them up in a bunch of CTEs or where clauses.

For example, you could extract some of the Q & A logic into a join. Whether or not it is desirable, it-depends. But for the sake of example, this:

FROM
  FirstQs JOIN FirstQsA ON QPostId = AParentId


...along with your other filters which are included in your CTEs, could be something like:

FROM
  FirstQs 
  JOIN FirstQsA 
    ON QPostId = AParentId
   AND Q.PostTypeId = 1 -- or @QuestionPostType
   AND A.PostTypeId = 2 -- or @AnswerPostType

Code Snippets

WHERE Q.PostTypeId = 1
-- and later
WHERE (p.PostTypeId = 2
DECLARE @QuestionPostType INT; 
SET @QuestionPostType = 1;
DECLARE @AnswerPostType INT; 
SET @AnswerPostType = 2;
WHERE Q.PostTypeId = @QuestionPostType
-- and later
WHERE (p.PostTypeId = @AnswerPostType
WITH QData(OwnerId, CDate)
AS
(
  SELECT
  Q.OwnerUserId [OwnerId]
  ,MIN(Q.CreationDate) [CDate]
  FROM Posts Q
  WHERE Q.PostTypeId = 1 AND Q.OwnerUserId IS NOT NULL
  GROUP BY Q.OwnerUserId
),
(Q.CreationDate = CDate AND Q.OwnerUserId = OwnerId)

Context

StackExchange Code Review Q#87111, answer score: 3

Revisions (0)

No revisions yet.