HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

What is the meaning of `(ORDER BY x RANGE BETWEEN n PRECEDING...)` if x is a date?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
meaningtheorderprecedingwhatrangedatebetween

Problem

In another thread:

https://stackoverflow.com/questions/37759659/db2-query-to-find-average-sale-for-each-item-1-year-previous

the OP wanted a sliding average for the last 365 days. Using ROWS BETWEEN ... would be fine if it where guaranteed that there where exactly one occurrence per day, but that is not the case here. RANGE BETWEEN ...seems like a good fit, but it is not clear to me what it means in DB2. Not sure if it matters that db2 does not have an INTERVAL type, but mimics it with labled durations.

The documentation says: (https://www.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.sql.ref.doc/doc/r0023461.html)

unsigned-constant PRECEDING

Specifies either the range or number of rows preceding the current row. If ROWS is specified, then unsigned-constant must be zero or a positive integer indicating a number of rows. If RANGE is specified, then the data type of unsigned-constant must be comparable to the type of the sort-key-expression of the window-order-clause. There can only be one sort-key-expression, and the data type of the sort-key-expression must allow subtraction. This clause cannot be specified in group-bound2 if group-bound1 is CURRENT ROW or unsigned-constant FOLLOWING.

unsigned-constant FOLLOWING

Specifies either the range or number of rows following the current row. If ROWS is specified, then unsigned-constant must be zero or a positive integer indicating a number of rows. If RANGE is specified, then the data type of unsigned-constant must be comparable to the type of the sort-key-expression of the window-order-clause. There can only be one sort-key-expression, and the data type of the sort-key-expression must allow addition.

DB2 allow constructions like:

values current_date - 1


the default unit date is day so this means:

values current_date - 1 day


Given this I would expect this example to work:

```
create table test
( d date not null
, x decimal(3,0) not null);

insert into test (d,x)
values (

Solution

Firstly, the expression values current_date - 1 would only be valid if Oracle compatibility mode were in effect -- it mimics the Oracle's datetime arithmetic where the default interval is expressed in (potentially fractional) days.

I think that regardless of Oracle compatibility, the range bounds should be comparable as integers, and comparing DATE values with integers might produce unexpected results. If you convert your DATEs to the number of days since some past moment you can use integer comparisons. You could use JULIAN_DAY(), for example:

select d, avg(x) over (order by julian_day(d) 
                   range between 30 preceding 
                             and current row) 
from test 
order by d


which produces the result you expect:

D          2                                
---------- ---------------------------------
01/01/2016   10.0000000000000000000000000000
01/07/2016   15.0000000000000000000000000000
01/12/2016   20.0000000000000000000000000000

  3 record(s) selected.


In the first fixpak of 10.5 it was allowed to use range over dates, but the results where unpredictable. In recent fixpaks this is no longer allowed, so much of the confusion in the question could have been avoided by using a recent fixpak.

Code Snippets

select d, avg(x) over (order by julian_day(d) 
                   range between 30 preceding 
                             and current row) 
from test 
order by d
D          2                                
---------- ---------------------------------
01/01/2016   10.0000000000000000000000000000
01/07/2016   15.0000000000000000000000000000
01/12/2016   20.0000000000000000000000000000

  3 record(s) selected.

Context

StackExchange Database Administrators Q#141263, answer score: 4

Revisions (0)

No revisions yet.