patternMinor
Recursive Join to get Spans
Viewed 0 times
recursivegetspansjoin
Problem
I am pretty new to working in SQL at and have a gaps and islands problem in which I am trying to get a series of consecutive months of membership. Any member may have multiple memberships and their membership may have gaps or overlap. My thought was to order memberships by member and plan and work backward within this group month by month until I find a gap, then assign the begin and end date of the membership.
Here's a sampling of my data:
The above data would represent memberships for two separate members to same/different plans for different time spans. Now, I need to get it to look more like this:
So far, I've got a recursive join that partitions based on Member and Plan and ranks them in descending order by YYYYMM, which seems to be working. But I'm stuck on how to figure out the begin and end dates for consecutive months of membership.
Edit: I'm using Oracle, not MySQL; Also, I only have read permission.
Edit: Sorry, NOT using MySQL or SQL Server
Here's a sampling of my data:
Member YYYYMM Plan
1 201601 123
1 201602 123
2 201505 123
2 201506 123
2 201510 123
2 201511 123
2 201512 123
1 201510 456
1 201511 456
1 201512 456
1 201601 456The above data would represent memberships for two separate members to same/different plans for different time spans. Now, I need to get it to look more like this:
Member YYYYMM_Begin YYYYMM_End Plan
1 201601 201602 123
2 201505 201506 123
2 201510 201512 123
1 201510 201601 456So far, I've got a recursive join that partitions based on Member and Plan and ranks them in descending order by YYYYMM, which seems to be working. But I'm stuck on how to figure out the begin and end dates for consecutive months of membership.
SELECT DISTINCT A.Member, A.Plan, A.YYYYMM, B.Rank
FROM Member_Hist A
JOIN (SELECT
MH.Member,
MH.Plan,
MH.YYYYMM,
RANK() OVER (PARTITION BY Member, Plan ORDER BY YYYYMM DESC) AS Rank
FROM Member_Hist MH) B ON A.Member = B.Member AND A.Plan = B.Plan
ORDER BY Member ASC, Plan ASC, YYYYMM DESCEdit: I'm using Oracle, not MySQL; Also, I only have read permission.
Edit: Sorry, NOT using MySQL or SQL Server
Solution
This is a Gap & Island problem. Plenty of sites talk about it if you Google it. Here is just the first link I clicked among many others: Solving Gaps and Islands with Enhanced Window Functions
Here is how the query works:
-
consecutive months will have consecutive
-
Query Oracle:
Oracle [SQL Fiddle][]
Query SQL Server:
SQL Server SQL Fiddle
Output:
Here is how the query works:
YYYYMMis a varchar and does not give consecutive numbers. Therefore it first changes them to proper date format.
- then
ROW_NUMBER()will number them by group ofMemberand[Plan]
-
consecutive months will have consecutive
rn and once rn is removed from the date, they will have identical values for DATEADD(month, rn, l1.dt)Member Plan dt rn => DATEADD(month, rn, l1.dt)
2 123 2015-12-01 1 => 2016-01-01
2 123 2015-11-01 2 => 2016-01-01
2 123 2015-10-01 3 => 2016-01-01
2 123 2015-06-01 4 => 2015-10-01
2 123 2015-05-01 5 => 2015-10-01-
DATEADD(month, rn, l1.dt) is finally used to group them along with Member and PlanQuery Oracle:
WITH dates AS(
SELECT DISTINCT Member, "Plan"
, dt = CAST(YYYYMM || '01' as date)
FROM data
), list AS (
SELECT *
, rn = ROW_NUMBER() OVER(PARTITION BY Member, "Plan" ORDER BY dt DESC)
FROM dates d
)
SELECT l1.Member
, MIN(l1.dt) as YYYYMM_Begin
, MAX(l1.dt) as YYYYMM_End
, l1."Plan"
FROM list l1
GROUP BY Member, "Plan", ADD_MONTHS(l1.dt, rn);Oracle [SQL Fiddle][]
Query SQL Server:
WITH dates AS(
SELECT DISTINCT Member, [Plan]
, dt = CAST(YYYYMM+'01' as date)
FROM @data
), list AS (
SELECT *
, rn = ROW_NUMBER() OVER(PARTITION BY Member, [Plan] ORDER BY dt DESC)
FROM dates d
)
SELECT l1.Member
, MIN(l1.dt) as YYYYMM_Begin
, MAX(l1.dt) as YYYYMM_End
, l1.[Plan]
FROM list l1
GROUP BY Member, [Plan], DATEADD(month, rn, l1.dt);SQL Server SQL Fiddle
Output:
Member YYYYMM_Begin YYYYMM_End Plan
1 2016-01-01 2016-02-01 123
1 2015-10-01 2016-01-01 456
2 2015-05-01 2015-06-01 123
2 2015-10-01 2015-12-01 123Code Snippets
Member Plan dt rn => DATEADD(month, rn, l1.dt)
2 123 2015-12-01 1 => 2016-01-01
2 123 2015-11-01 2 => 2016-01-01
2 123 2015-10-01 3 => 2016-01-01
2 123 2015-06-01 4 => 2015-10-01
2 123 2015-05-01 5 => 2015-10-01WITH dates AS(
SELECT DISTINCT Member, "Plan"
, dt = CAST(YYYYMM || '01' as date)
FROM data
), list AS (
SELECT *
, rn = ROW_NUMBER() OVER(PARTITION BY Member, "Plan" ORDER BY dt DESC)
FROM dates d
)
SELECT l1.Member
, MIN(l1.dt) as YYYYMM_Begin
, MAX(l1.dt) as YYYYMM_End
, l1."Plan"
FROM list l1
GROUP BY Member, "Plan", ADD_MONTHS(l1.dt, rn);WITH dates AS(
SELECT DISTINCT Member, [Plan]
, dt = CAST(YYYYMM+'01' as date)
FROM @data
), list AS (
SELECT *
, rn = ROW_NUMBER() OVER(PARTITION BY Member, [Plan] ORDER BY dt DESC)
FROM dates d
)
SELECT l1.Member
, MIN(l1.dt) as YYYYMM_Begin
, MAX(l1.dt) as YYYYMM_End
, l1.[Plan]
FROM list l1
GROUP BY Member, [Plan], DATEADD(month, rn, l1.dt);Member YYYYMM_Begin YYYYMM_End Plan
1 2016-01-01 2016-02-01 123
1 2015-10-01 2016-01-01 456
2 2015-05-01 2015-06-01 123
2 2015-10-01 2015-12-01 123Context
StackExchange Database Administrators Q#128036, answer score: 5
Revisions (0)
No revisions yet.