HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Looking for a simpler alternative to a recursive query

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
alternativequerysimplerlookingrecursivefor

Problem

The actual query is more involved, but the problem I'm facing can be distilled to this:

A query to filter a rowset of monotonically increasing integers so that - in the final result set, row(n+1).value >= row(n).value + 5.

For the actual problem I need to solve, the rowset count is in the 1000s.

A few examples to clarify:

  • if rows are: 1,2,3,4,5 : then query should return: 1



  • if rows are: 1,5,7,10,11,12,13 : then query should return: 1,7,12



  • if rows are: 6,8,11,16,20,23: then query should return: 6,11,16,23



  • if rows are: 6,8,12,16,20,23: then query should return: 6,12,20



I've managed to get the required results with the following query, but it seems overly complicated. Uncomment the different "..with t(k).." to try them out.

I'm looking for any simplifications or alternative approaches to get the same results.

with recursive r(n, pri) as (
    with t(k) as (values (1),(2),(3),(4),(5))   -- the data we want to filter
    -- with t(k) as (values (1),(5),(7),(10),(11),(12),(13))
    -- with t(k) as (values (6),(8),(11),(16),(20),(23))
    -- with t(k) as (values (6),(8),(12),(16),(20),(23))
    select min(k), 1::bigint from t             -- bootstrap for recursive processing. 1 here represents rank().
    UNION
    select k, (rank() over(order by k)) rr      -- rank() is required just to filter out the rows we dont want from the final result set, and no other reason
    from r, t 
    where t.k >= r.n+5 and r.pri = 1            -- capture the rows we want, AND unfortunately a bunch of rows we dont want 
)
select n from r where pri = 1;

Solution

This was hard! I don't know if this is simpler, but at least it doesn't use window function nor produce rows that require being filtered out.

with recursive r(k, n) as (
    with t(k) as (values (1),(2),(3),(4),(5))   -- the data we want to filter
    -- with t(k) as (values (1),(5),(7),(10),(11),(12),(13))
    -- with t(k) as (values (6),(8),(11),(16),(20),(23))
    -- with t(k) as (values (6),(8),(12),(16),(20),(23))
         ,t2(k,n) AS (select k, (select min(k) from t tt where k >= t.k+5) from t) -- precalculate what's next
    select * from (select * from t2 limit 1) x   -- limit 1 directly fails in a union!
    UNION ALL
    select t2.* from r, t2 where t2.k = r.n      -- on each iteration, keep only the value that matches the previous precalculated next one
)
select k from r


Testing

This alternative seems to be less efficient for very small sets, but more or less linear in performance, whilst the original seems to be exponentially more sluggish.

drop table if exists t;
create temp table t(k) AS
with recursive r(n) as (
  select floor(random()*10)::int + 1
  UNION ALL
  select n + floor(random()*10)::int + 1
  from r
  where n = r.n+5 and r.pri = 1
)
select count(*) from r where pri = 1; -- I aborted it after waiting for a minute

with recursive r(k, n) as (
    with t2(k,n) AS (select k, (select min(k) from t tt where k >= t.k+5) from t)
    select * from (select * from t2 limit 1) x
    UNION ALL
    select t2.* from r, t2 where t2.k = r.n
)
select count(*) from r -- 26" in my server

Code Snippets

with recursive r(k, n) as (
    with t(k) as (values (1),(2),(3),(4),(5))   -- the data we want to filter
    -- with t(k) as (values (1),(5),(7),(10),(11),(12),(13))
    -- with t(k) as (values (6),(8),(11),(16),(20),(23))
    -- with t(k) as (values (6),(8),(12),(16),(20),(23))
         ,t2(k,n) AS (select k, (select min(k) from t tt where k >= t.k+5) from t) -- precalculate what's next
    select * from (select * from t2 limit 1) x   -- limit 1 directly fails in a union!
    UNION ALL
    select t2.* from r, t2 where t2.k = r.n      -- on each iteration, keep only the value that matches the previous precalculated next one
)
select k from r
drop table if exists t;
create temp table t(k) AS
with recursive r(n) as (
  select floor(random()*10)::int + 1
  UNION ALL
  select n + floor(random()*10)::int + 1
  from r
  where n < 100000)        -- change to increase or reduce set
select * from r;           -- surprisingly fast! Go PG!
create index on t(k);

with recursive r(n, pri) as (
    select min(k), 1::bigint from t
    UNION
    select k, (rank() over(order by k)) rr
    from r, t 
    where t.k >= r.n+5 and r.pri = 1
)
select count(*) from r where pri = 1; -- I aborted it after waiting for a minute

with recursive r(k, n) as (
    with t2(k,n) AS (select k, (select min(k) from t tt where k >= t.k+5) from t)
    select * from (select * from t2 limit 1) x
    UNION ALL
    select t2.* from r, t2 where t2.k = r.n
)
select count(*) from r -- 26" in my server

Context

StackExchange Database Administrators Q#134723, answer score: 2

Revisions (0)

No revisions yet.