HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

RANK used in 'where' returns invalid column, but exists in results set

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
rankcolumnusedwherebutreturnsexistsresultsinvalidset

Problem

I basically have a table with date, timestamp, DID, coordinates.

I want a query that will return rows with the last coordinate from day X, and the first coord from day X+1, and the coordinates. So it will only return results where there are 2 consecutive dates.

This is what I came up with. Been trying to get this query to work, it's almost perfect, but I just need to add the commented out where conditions and it'll do exactly what I want. But when I uncomment, I get an error "column doesn't exist":

SELECT  a.timestamp_intersecting_date d1,b.timestamp_intersecting_date d2,   
        a."DID", 
        a.timestamp_intersecting_max t1, b.timestamp_intersecting_min t2,
        RANK () OVER ( 
          PARTITION BY a.timestamp_intersecting_date
          ORDER BY a.timestamp_intersecting_max DESC
       ) timestamp_d1_rank ,
        RANK () OVER ( 
          PARTITION BY b.timestamp_intersecting_date
          ORDER BY b.timestamp_intersecting_max ASC
       ) timestamp_d2_rank,
        a.coords_centroid, b.coords_centroid
FROM
    signals a
INNER JOIN signals b ON (a."DID" = b."DID")
WHERE (b.timestamp_intersecting_date = a.timestamp_intersecting_date + INTERVAL '1 DAY')
AND a."DID" = b."DID"
--AND timestamp_d1_rank = 1
--AND timestamp_d2_rank = 1
ORDER BY a."DID", t1 desc, t2 asc


How to solve this?

Solution

You cannot use aliases in WHERE clauses, nor can you use Window functions.

To demonstrate, consider this example (fiddle).

CREATE TABLE payment (amount INTEGER, pay_date DATE);

INSERT INTO payment VALUES (54,  '2019-09-01'), (56,  '2019-09-01'), (154, '2019-09-02'), 
(156, '2019-09-02'), (254, '2019-09-03'), (256, '2019-09-03');


Then run the two following queries:

SELECT 
  *, 
  SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
WHERE the_sums > 200; -- ERROR:  column "the_sums" does not exist


and

SELECT 
  *, 
  SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
WHERE SUM(amount) OVER (PARTITION BY pay_date) > 200;
-- ERROR:  window functions are not allowed in WHERE


You didn't post your error message or PostgreSQL version, but your problem is that timestamp_d1_rank and timestamp_d2_rank are aliases.

You have two options:

  • First Option: use the aliases/Window functions within a subquery (also in fiddle)



SELECT * FROM
(
SELECT 
  *, 
  SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
) AS tab
WHERE the_sums < 200;


Result:

amount    pay_date  the_sums
    54  2019-09-01       110
    56  2019-09-01       110


  • Second Option: you can use a CTE (aka the WITH clause as outlined here or as in @ypercubeᵀᴹ 's answer above or also see the fiddle.



For a good explanation of this whole area, see here.

Code Snippets

CREATE TABLE payment (amount INTEGER, pay_date DATE);

INSERT INTO payment VALUES (54,  '2019-09-01'), (56,  '2019-09-01'), (154, '2019-09-02'), 
(156, '2019-09-02'), (254, '2019-09-03'), (256, '2019-09-03');
SELECT 
  *, 
  SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
WHERE the_sums > 200; -- ERROR:  column "the_sums" does not exist
SELECT 
  *, 
  SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
WHERE SUM(amount) OVER (PARTITION BY pay_date) > 200;
-- ERROR:  window functions are not allowed in WHERE
SELECT * FROM
(
SELECT 
  *, 
  SUM(amount) OVER (PARTITION BY pay_date) AS the_sums
FROM payment
) AS tab
WHERE the_sums < 200;
amount    pay_date  the_sums
    54  2019-09-01       110
    56  2019-09-01       110

Context

StackExchange Database Administrators Q#247275, answer score: 5

Revisions (0)

No revisions yet.