HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

SQL - Inverse Selection

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
sqlinverseselection

Problem

I have two tables in my schema

students(id text, matric text)

attendance(id serial, a_date date, matric text references students matric)

To get records showing date and student present, I can run a query like

select a.a_date, s.matric 
from students s, attendance a 
where a.matric = s.matric 
and a.a_date between '2015-01-01' and '2015-04-01'


I want a query for the records showing date and student absent.

For more information, please visit http://sqlfiddle.com/#!15/e43a9/6 to view the schema and sample data

Solution

You first need a list of all possible dates in that range. This can be done using the generate_series() function in Postgres:

select dt::date dt
from generate_series(date '2015-01-01', date '2015-04-01', interval '1' day) dt


Using this you can create a list of all possible attendances:

with all_dates as (
  select dt::date dt
  from generate_series(date '2015-01-01', date '2015-04-01', interval '1' day) dt
)
select s.*, ad.dt
from students s
  cross join all_dates ad;


The above uses a common table expression to make the select a bit more readable (at least in my eyes). It can be seen as a "temporary view" that only exists for the runtime of the query. If you want, you can ignore that and just assume that there is a table called all_dates that contains all dates in the range you want.

The result of this can the be used to find those rows where not attendance row exists for a student on that date:

with all_dates as (
  select dt::date dt
  from generate_series(date '2015-01-01', date '2015-04-01', interval '1' day) dt
)
select s.*, ad.dt
from students s
  cross join all_dates ad
where not exists (select 1 
                  from attendance a
                  where a.matric = s.matric
                  and a.a_date = ad.dt);


The not exists subquery can also be re-written as a left join, which might be faster:

with all_dates as (
  select dt::date dt
  from generate_series(date '2015-01-01', date '2015-04-01', interval '1' day) dt
)
select s.*, ad.dt
from students s
  cross join all_dates ad
  left join attendance a on a.matric = s.matric and a.a_date = ad.dt
where a.a_date is null;


A third option is to "subtract" the list of attendances from the list of possible attendances. Sets in SQL can be subtracted using the except operator:

with all_dates as (
  select dt::date dt
  from generate_series(date '2015-01-01', date '2015-04-01', interval '1' day) dt
)
select s.*, ad.dt
from students s
  cross join all_dates ad

except 

select s.*, a.a_date
from student s  
  join attendance a on a.matric = s.matric;

Code Snippets

select dt::date dt
from generate_series(date '2015-01-01', date '2015-04-01', interval '1' day) dt
with all_dates as (
  select dt::date dt
  from generate_series(date '2015-01-01', date '2015-04-01', interval '1' day) dt
)
select s.*, ad.dt
from students s
  cross join all_dates ad;
with all_dates as (
  select dt::date dt
  from generate_series(date '2015-01-01', date '2015-04-01', interval '1' day) dt
)
select s.*, ad.dt
from students s
  cross join all_dates ad
where not exists (select 1 
                  from attendance a
                  where a.matric = s.matric
                  and a.a_date = ad.dt);
with all_dates as (
  select dt::date dt
  from generate_series(date '2015-01-01', date '2015-04-01', interval '1' day) dt
)
select s.*, ad.dt
from students s
  cross join all_dates ad
  left join attendance a on a.matric = s.matric and a.a_date = ad.dt
where a.a_date is null;
with all_dates as (
  select dt::date dt
  from generate_series(date '2015-01-01', date '2015-04-01', interval '1' day) dt
)
select s.*, ad.dt
from students s
  cross join all_dates ad

except 

select s.*, a.a_date
from student s  
  join attendance a on a.matric = s.matric;

Context

StackExchange Database Administrators Q#95259, answer score: 2

Revisions (0)

No revisions yet.