HiveBrain v1.2.0
Get Started
← Back to all entries
principlesqlMinor

Reference outer variables in nested select in bigquery vs mysql

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
referencebigquerynestedmysqlvariablesselectouter

Problem

I run following request in MySQL and it works, but results in an error with bigquery standard SQL.

$sql = = @xday AND v.date_time < @xdaytomorrow
      AND ads.id = vd.ad_id
    ) AS clicks
    FROM `PRFXad_reports` AS r
    INNER JOIN `PRFXads` AS ads
    ON r.ad_id = ads.id
    WHERE r.rdate = @xday AND ads.is_archive = 0
    GROUP BY ads.ad_id
    ORDER BY ads.ad_id ASC;
EOT;


Error message:

{
"error": {
"errors": [
{
"domain": "global",
"reason": "invalidQuery",
"message": "WHERE clause expression references ads.id which is neither grouped nor aggregated at [8:15]",
"locationType": "other",
"location": "query"
}
],
"code": 400,
"message": "WHERE clause expression references ads.id which is neither grouped nor aggregated at [8:15]"
}
}


If I replace ads.id with e.g. 380 no error is produced.

$sql = = @xday AND v.date_time < @xdaytomorrow
          AND 380 = vd.ad_id
        ) AS clicks
        FROM `PRFXad_reports` AS r
        INNER JOIN `PRFXads` AS ads
        ON r.ad_id = ads.id
        WHERE r.rdate = @xday AND ads.is_archive = 0
        GROUP BY ads.ad_id
        ORDER BY ads.ad_id ASC;
EOT;


Why can't I reference ads.id in the inner select from outer FROM clause in bigquery?

It seems mysql select random row to show ads.id during grouping by ads.ad_id.

Solution

Try removing the GROUP BY from the main query and converting the join to PRFXad_reports to an EXISTS subquery:

SELECT
    ads.ad_id,
    ( SELECT count(*)
      FROM `PRFXvisits` AS v
      INNER JOIN `PRFXvisit_data` AS vd
      ON vd.visit_id = v.id
      WHERE v.date_time >= @xday AND v.date_time < @xdaytomorrow
      AND ads.id = vd.ad_id
    ) AS clicks
FROM `PRFXads` AS ads
WHERE ads.is_archive = 0
  AND EXISTS
      ( SELECT 1
        FROM `PRFXad_reports` AS r
        WHERE r.ad_id = ads.id
          AND r.rdate = @xday
      )
ORDER BY ads.ad_id ;

Code Snippets

SELECT
    ads.ad_id,
    ( SELECT count(*)
      FROM `PRFXvisits` AS v
      INNER JOIN `PRFXvisit_data` AS vd
      ON vd.visit_id = v.id
      WHERE v.date_time >= @xday AND v.date_time < @xdaytomorrow
      AND ads.id = vd.ad_id
    ) AS clicks
FROM `PRFXads` AS ads
WHERE ads.is_archive = 0
  AND EXISTS
      ( SELECT 1
        FROM `PRFXad_reports` AS r
        WHERE r.ad_id = ads.id
          AND r.rdate = @xday
      )
ORDER BY ads.ad_id ;

Context

StackExchange Database Administrators Q#199128, answer score: 2

Revisions (0)

No revisions yet.