HiveBrain v1.2.0
Get Started
← Back to all entries
principlesqlMajor

Postgres JOIN conditions vs WHERE conditions

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
joinwherepostgresconditions

Problem

Postgres newbie here.

I'm wondering if this query is optimized or not? I tried to JOIN ON only the values that are 100% necessary and leaving all the dynamic conditions in the WHERE clause. See below.

SELECT *
    FROM
      myapp_employees
    JOIN myapp_users ON
      myapp_users.user_id=myapp_employees.user_id
    JOIN myapp_contacts_assoc ON
      myapp_contacts_assoc.user_id=myapp_users.user_id
    JOIN myapp_contacts ON
      myapp_contacts.contact_id=myapp_contacts_assoc.contact_id
    WHERE
      myapp_contacts.value='test@gmail.com' AND
      myapp_contacts.type=(1)::INT2 AND
      myapp_contacts.is_primary=(1)::INT2 AND
      myapp_contacts.expired_at IS NULL AND
      myapp_employees.status=(1)::INT2 AND
      myapp_users.status=(1)::INT2
    LIMIT 1;


Note: For context, this proc is checking to see if a user is also an employee (elevated privs/different user type).

Anyways, is this the right way to go? Should the JOIN ON contain more statements like checking for expired_at IS NULL, for example? Why or why doesn't this make sense?

Solution

Logically, it makes no difference at all whether you place conditions in the join clause of an INNER JOIN or the WHERE clause of the same SELECT. The effect is the same.

(Not the case for OUTER JOIN!)

While operating with default settings it also makes no difference for the query plan or performance. Postgres is free to rearrange predicates in JOIN & WHERE clauses in its quest for the best query plan - as long as the number of tables is not greater than the join_collapse_limit (default 8). Details:

  • Complex view becomes slow when adding ORDER BY in outer query with small LIMIT



For readability and maintainability it makes sense to place conditions that connect tables in the respective JOIN clause and general conditions in the WHERE clause.

Your query looks just fine. I would use table aliases to cut back the noise, though.

Minor detail:

int2 '1' or even 1::int2 are more sensible than (1)::INT2. And while comparing to a value of well defined numeric data type, a plain numerical constant 1 is good enough, too.

Context

StackExchange Database Administrators Q#155972, answer score: 30

Revisions (0)

No revisions yet.