HiveBrain v1.2.0
Get Started
← Back to all entries
patternMinor

Finding the Min value for a subgroup, filtering out preceding Min values within the same parent group

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
sametheprecedinggroupsubgroupparentvalueminwithinfor

Problem

How do I get the minimum value for a group (Acc, TranType), but filtering out any minimum values used in preceding rows for the Acc group. Preceding rows would be defined as Acc asc, TranType asc.

The PosCancelID should only appear once per Acc group. But the same PosCancelID could appear within another Acc Group within the data set.

So with the given data set:

Acc  | TranType | PosCancelID
100     1               2
808     1               5
808     1               4
808     2               5
808     2               4<--To be filtered from min calc as it min for (808,1)
813     2               3
813     4               3<--To be filtered from min calc as it min for (813,2)
809     1               3
809     1               4
809     2               3<--To be filtered from min calc as (809,1) uses it
809     2               4 
809     3               4<--To be filtered from min calc as (809,2) uses it


I should get:

Acc  | TranType | PosCancelID
100     1               2
808     1               4
808     2               5
813     2               3
809     1               3
809     2               4

SELECT ACC, TranType, min(maxPreceeding) as ActualCancelID 
FROM 
(
   SELECT ACC, TranType,  
          MAX(m.posCancelID) OVER (PARTITION BY m.ACC 
                                  ORDER BY m.TranType, m.posCancelID 
                                  ROWS UNBOUNDED PRECEDING) as maxPreceeding
    FROM MCancel as m
) AS x
GROUP BY ACC, TranType


The above query gives me almost what i want but isn't filtering for acc = 813. So I know there must be a better (actually applying a filter to remove previous minimum values) way.

Solution

Quite a hard problem. Here is a recursive solution:

WITH 
  rcte AS
  ( SELECT TOP (1)
        Acc, TranType, posCancelID,
        CAST('=' + CAST(posCancelID AS VARCHAR(20)) + '=' AS VARCHAR(MAX)) AS IDs 
    FROM
        MCancel
    ORDER BY
        Acc, TranType, posCancelID

    UNION ALL

    SELECT
        Acc, TranType, posCancelID,
        CAST(IDs + CAST(posCancelID AS VARCHAR(20)) + '=' AS VARCHAR(MAX))
    FROM 
      ( SELECT
            m.*, 
            r.IDs, 
            ROW_NUMBER() OVER (ORDER BY m.Acc, m. TranType, m.PosCancelID) AS rn
        FROM
            rcte AS r 
            JOIN MCancel AS m
                ON  (m.Acc = r.Acc AND m.TranType > r.TranType)
                OR  (m.Acc > r.Acc)
        WHERE
            r.IDs NOT LIKE ('%=' + CAST(m.posCancelID AS VARCHAR(20)) + '=%')
      ) AS mc
    WHERE
        rn = 1
  )
SELECT Acc, TranType, posCancelID
FROM rcte
ORDER BY Acc, TranType ;


The solution assumes that a posCancelID should not appear twice in the result set. If the requirements are that they should not appear twice in the same Acc group, then the solution needs a slight adjustment:

WITH rcte AS
  ( SELECT 
      Acc, TranType, posCancelID,
      CAST('=' + CAST(posCancelID AS VARCHAR(20)) + '=' AS VARCHAR(MAX)) AS IDs 
    FROM
      ( SELECT 
          Acc, TranType, posCancelID,
          ROW_NUMBER() OVER (PARTITION BY Acc ORDER BY TranType, PosCancelID) AS rnk
        FROM MCancel
      ) AS f
    WHERE rnk = 1

    UNION ALL

    SELECT
      Acc, TranType, posCancelID,
      CAST(IDs + CAST(posCancelID AS VARCHAR(20)) + '=' AS VARCHAR(MAX))
    FROM
      ( SELECT
          m.*, r.IDs, 
          ROW_NUMBER() OVER (PARTITION BY m.Acc
                             ORDER BY m.TranType, m.PosCancelID) AS rn
        FROM
          rcte AS r 
          JOIN MCancel AS m
              ON  (m.Acc = r.Acc AND m.TranType > r.TranType)
        WHERE
          r.IDs NOT LIKE ('%=' + CAST(m.posCancelID AS VARCHAR(20)) + '=%')
      ) AS mc
    WHERE rn = 1
  )
SELECT Acc, TranType, posCancelID
FROM rcte
ORDER BY Acc, TranType ;


Both are tested at dbfiddle.uk

Code Snippets

WITH 
  rcte AS
  ( SELECT TOP (1)
        Acc, TranType, posCancelID,
        CAST('=' + CAST(posCancelID AS VARCHAR(20)) + '=' AS VARCHAR(MAX)) AS IDs 
    FROM
        MCancel
    ORDER BY
        Acc, TranType, posCancelID

    UNION ALL

    SELECT
        Acc, TranType, posCancelID,
        CAST(IDs + CAST(posCancelID AS VARCHAR(20)) + '=' AS VARCHAR(MAX))
    FROM 
      ( SELECT
            m.*, 
            r.IDs, 
            ROW_NUMBER() OVER (ORDER BY m.Acc, m. TranType, m.PosCancelID) AS rn
        FROM
            rcte AS r 
            JOIN MCancel AS m
                ON  (m.Acc = r.Acc AND m.TranType > r.TranType)
                OR  (m.Acc > r.Acc)
        WHERE
            r.IDs NOT LIKE ('%=' + CAST(m.posCancelID AS VARCHAR(20)) + '=%')
      ) AS mc
    WHERE
        rn = 1
  )
SELECT Acc, TranType, posCancelID
FROM rcte
ORDER BY Acc, TranType ;
WITH rcte AS
  ( SELECT 
      Acc, TranType, posCancelID,
      CAST('=' + CAST(posCancelID AS VARCHAR(20)) + '=' AS VARCHAR(MAX)) AS IDs 
    FROM
      ( SELECT 
          Acc, TranType, posCancelID,
          ROW_NUMBER() OVER (PARTITION BY Acc ORDER BY TranType, PosCancelID) AS rnk
        FROM MCancel
      ) AS f
    WHERE rnk = 1

    UNION ALL

    SELECT
      Acc, TranType, posCancelID,
      CAST(IDs + CAST(posCancelID AS VARCHAR(20)) + '=' AS VARCHAR(MAX))
    FROM
      ( SELECT
          m.*, r.IDs, 
          ROW_NUMBER() OVER (PARTITION BY m.Acc
                             ORDER BY m.TranType, m.PosCancelID) AS rn
        FROM
          rcte AS r 
          JOIN MCancel AS m
              ON  (m.Acc = r.Acc AND m.TranType > r.TranType)
        WHERE
          r.IDs NOT LIKE ('%=' + CAST(m.posCancelID AS VARCHAR(20)) + '=%')
      ) AS mc
    WHERE rn = 1
  )
SELECT Acc, TranType, posCancelID
FROM rcte
ORDER BY Acc, TranType ;

Context

StackExchange Database Administrators Q#225548, answer score: 6

Revisions (0)

No revisions yet.