HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Query for finding possible "Not An Answer" posts

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
answerquerypossibleforfindingpostsnot

Problem

I have this query in SEDE to try to find possible not an answers to flag. (I already went through a lot of them, for Stack Overflow, so I don't really recommend anyone else using this one since isn't that helpful anymore.)

As you can see, it looks to be a lot longer and redundant than it probably needs to be. Is there any way to shorten this?

```
SELECT
p.Id as [Post Link],
len(p.Body),
p.CreationDate as Date
FROM
Posts as p
JOIN
Users as u
ON
p.OwnerUserId = u.Id
JOIN
Posts as q
ON
p.ParentId = q.Id
WHERE
u.Reputation < 126
and
len(p.Body) < 300
and
p.Body LIKE '%a href=%'
and
p.Score < 3
and
p.PostTypeId = 2
and
p.CommentCount < 3
and
q.ClosedDate IS NULL
and
q.FavoriteCount < 3
and
q.CommunityOwnedDate IS NULL
and
q.Body NOT LIKE '%plugin%' COLLATE SQL_Latin1_General_CP1_CI_AS
and
q.Title NOT LIKE '%plugin%' COLLATE SQL_Latin1_General_CP1_CI_AS
and
q.Body NOT LIKE '%plug-in%' COLLATE SQL_Latin1_General_CP1_CI_AS
and
q.Title NOT LIKE '%plug-in%' COLLATE SQL_Latin1_General_CP1_CI_AS
and
q.Body NOT LIKE '%tool%' COLLATE SQL_Latin1_General_CP1_CI_AS
and
q.Title NOT LIKE '%tool%' COLLATE SQL_Latin1_General_CP1_CI_AS
and
q.Title NOT LIKE '%framework%' COLLATE SQL_Latin1_General_CP1_CI_AS
and
q.Body NOT LIKE '%framework%' COLLATE SQL_Latin1_General_CP1_CI_AS
and
q.Title NOT LIKE '%API%'
and
q.Body NOT LIKE '%API%'
and
q.Title NOT LIKE '%IDE%'
and
q.Body NOT LIKE '%IDE%'
and
q.Title NOT LIKE '%library%' COLLATE SQL_Latin1_General_CP1_CI_AS
and
q.Body NOT LIKE '%library%' COLLATE SQL_Latin1_General_CP1_CI_AS
and
q.Title NOT LIKE '%cms%' COLLATE SQL_Latin1_General_CP1_CI_AS
and
q.Body NOT LIKE '%cms%' COLLATE SQL_Latin1_General_CP1_CI_AS
and
q.Title NOT LIKE '%open source%' COLLATE SQL_Latin1_General_CP1_CI_AS
and
q.Body NOT LIKE '%open source%' COLLATE SQL_Latin1_General_CP1_CI_AS

Solution

Firstly, I would avoid using JOIN on it's own. INNER JOIN should be used instead, it's much easier to understand what is happening at a glance if you see INNER JOIN over JOIN

I would capitalise the AS and AND's, it just makes it more consistent.

Otherwise, the general formatting is pretty nice.

I would maybe argue that the table aliases are a little unnecessary, Posts and Users are already pretty short and are more meaningful than the single letter aliases you gave them. I would consider aliasing the joined Posts table Answers or something similar.

Lastly, the largest point. You are checking if both the body and title do not contain the same string. You can concatenate body and title together, with a space to separate them, and check for the word/phrase you are searching for in that instead, effectively halving the size of your where statement.

SELECT
Posts.Id AS [Post Link],
LEN(Posts.Body),
Posts.CreationDate AS Date

FROM Posts

INNER JOIN Users
ON Posts.OwnerUserId = Users.Id

INNER JOIN Posts AS Answers
ON Posts.ParentId = Answers.Id

WHERE Users.Reputation < 126
AND LEN(Posts.Body) < 300
AND Posts.Body LIKE '%a href=%'
AND Posts.Score < 3
AND Posts.PostTypeId = 2
AND Posts.CommentCount < 3
AND Answers.ClosedDate IS NULL
AND Answers.FavoriteCount < 3
AND Answers.CommunityOwnedDate IS NULL
AND Answers.Body+' '+Answers.Title NOT LIKE '%plugin%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%plug-in%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%tool%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%framework%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%API%'
AND Answers.Body+' '+Answers.Title NOT LIKE '%IDE%'
AND Answers.Body+' '+Answers.Title NOT LIKE '%library%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%cms%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%open source%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%application%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%software%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%tutorial%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%book%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%module%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%resource%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%engine%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%lightweight%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%opensource%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%technologies%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%simulator%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%compiler%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%editor%' COLLATE SQL_Latin1_General_CP1_CI_AS

ORDER BY LEN(Posts.Body) ASC


Here is the SEDE Query

Code Snippets

SELECT
Posts.Id AS [Post Link],
LEN(Posts.Body),
Posts.CreationDate AS Date

FROM Posts

INNER JOIN Users
ON Posts.OwnerUserId = Users.Id

INNER JOIN Posts AS Answers
ON Posts.ParentId = Answers.Id

WHERE Users.Reputation < 126
AND LEN(Posts.Body) < 300
AND Posts.Body LIKE '%a href=%'
AND Posts.Score < 3
AND Posts.PostTypeId = 2
AND Posts.CommentCount < 3
AND Answers.ClosedDate IS NULL
AND Answers.FavoriteCount < 3
AND Answers.CommunityOwnedDate IS NULL
AND Answers.Body+' '+Answers.Title NOT LIKE '%plugin%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%plug-in%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%tool%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%framework%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%API%'
AND Answers.Body+' '+Answers.Title NOT LIKE '%IDE%'
AND Answers.Body+' '+Answers.Title NOT LIKE '%library%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%cms%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%open source%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%application%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%software%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%tutorial%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%book%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%module%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%resource%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%engine%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%lightweight%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%opensource%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%technologies%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%simulator%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%compiler%' COLLATE SQL_Latin1_General_CP1_CI_AS
AND Answers.Body+' '+Answers.Title NOT LIKE '%editor%' COLLATE SQL_Latin1_General_CP1_CI_AS

ORDER BY LEN(Posts.Body) ASC

Context

StackExchange Code Review Q#62175, answer score: 3

Revisions (0)

No revisions yet.