patternsqlMinor
Calculate number of consecutive years (not gapless)
Viewed 0 times
numbercalculateyearsnotgaplessconsecutive
Problem
I have a request that wants to look at numbers of consecutive year buyers. For example, I have a table like this:
I want to calculate for each year how many consecutive order years each
The output would be:
Or if you just want to calculate the max order year with the max consecutive years, that's fine, such as:
Order_year, Contact
2015, 123
2014, 123
2013, 123
2011, 123
2010, 123
2015, 234
2014, 234
2011, 234I want to calculate for each year how many consecutive order years each
contact_id has. The problem is, there are gaps such as contact_id = 123, it has 2010-2011, then 2013-2015.The output would be:
report_year, contact, consecutive_years
2015, 123, 3
2014, 123, 2
2013, 123, 1
2012, 123, 0
2011, 123, 2
2010, 123, 1
2015, 234, 3
2014, 234, 2
2013, 234, 1
2012, 234, 0
2011, 234, 1Or if you just want to calculate the max order year with the max consecutive years, that's fine, such as:
order_year, contact, consecutive_years
2015, 123, 3
2011, 123, 2
2015, 234, 2Solution
So I've built a short SQL script to do this operation, based on the original sample input data:
The script is as follows:
Now, for a little explanation:
Recusive Query
First, you need to find all the occasions of consecutive years, so I built the
Filter Results
So, due to the recursive query, I am left with all counted elements, rather than simply the maximum consecutive counts. So, I built
A little arithmetic...
This still doesn't quite yield what we need. Since we grouped over
What you'll notice is that you have unique, information which you can group on by observing the starting year of the consecutive years. So, I created
Finally, I get my result by adding the
Note that it would be very easy to filter out for cases
Now for the caveats: I wrote this series of CTEs for ease of readability and explanation. There may be a more computationally or programatically efficient way to achieve your results, but on even a moderately sized table this should get you your results pretty quick.
Order_year, Contact
2015, 123
2014, 123
2013, 123
2011, 123
2010, 123
2015, 234
2014, 234
2011, 234The script is as follows:
WITH RECURSIVE ranked_order AS(
SELECT order_year, contact, 1 AS counter FROM orders
UNION
SELECT orders.order_year, orders.contact, ranked_order.counter + 1
FROM orders JOIN ranked_order
ON orders.order_year-1 = ranked_order.order_year
WHERE orders.contact = ranked_order.contact
),
max_from_ranked_order AS(
SELECT order_year AS report_year, contact, max(counter) AS consecutive_year
FROM ranked_order
GROUP BY order_year, contact
),
start_and_consecutive_years AS(
SELECT contact, max(consecutive_year) AS consecutive_years,
report_year - consecutive_year AS start_year
FROM max_from_ranked_order
GROUP BY contact, start_year)
SELECT *, start_year + consecutive_years AS end_year
FROM start_and_consecutive_years
ORDER BY contact;Now, for a little explanation:
Recusive Query
First, you need to find all the occasions of consecutive years, so I built the
ranked_order table(maybe a bad name... maybe should have called it counted_order?) to count instances where a given contact ordered in back-to-back years.Filter Results
So, due to the recursive query, I am left with all counted elements, rather than simply the maximum consecutive counts. So, I built
max_from_ranked_order to extract the max counter value per order_year,contact grouping.A little arithmetic...
This still doesn't quite yield what we need. Since we grouped over
order_year,contact, you still end up with entries likereport_year, contact, consectuive_year
2015, 123, 3
2014, 123, 2
2013, 123, 1
2015, 234, 2
2014, 234, 1
etc...What you'll notice is that you have unique, information which you can group on by observing the starting year of the consecutive years. So, I created
start_and_consecutive_years to take the maximum number of consecutive purchasing years for any given starting year, on a per customer basis.Finally, I get my result by adding the
consecutive_years to the start_year, and presto, the final SELECT yields:contact, consecutive_years, start_year, end_year
123, 3, 2012, 2015
123, 2, 2009, 2011
234, 2, 2013, 2015
234, 1, 2010, 2011Note that it would be very easy to filter out for cases
WHERE consecutive years >= 2 to ignore "irrelevant" results.Now for the caveats: I wrote this series of CTEs for ease of readability and explanation. There may be a more computationally or programatically efficient way to achieve your results, but on even a moderately sized table this should get you your results pretty quick.
Code Snippets
Order_year, Contact
2015, 123
2014, 123
2013, 123
2011, 123
2010, 123
2015, 234
2014, 234
2011, 234WITH RECURSIVE ranked_order AS(
SELECT order_year, contact, 1 AS counter FROM orders
UNION
SELECT orders.order_year, orders.contact, ranked_order.counter + 1
FROM orders JOIN ranked_order
ON orders.order_year-1 = ranked_order.order_year
WHERE orders.contact = ranked_order.contact
),
max_from_ranked_order AS(
SELECT order_year AS report_year, contact, max(counter) AS consecutive_year
FROM ranked_order
GROUP BY order_year, contact
),
start_and_consecutive_years AS(
SELECT contact, max(consecutive_year) AS consecutive_years,
report_year - consecutive_year AS start_year
FROM max_from_ranked_order
GROUP BY contact, start_year)
SELECT *, start_year + consecutive_years AS end_year
FROM start_and_consecutive_years
ORDER BY contact;report_year, contact, consectuive_year
2015, 123, 3
2014, 123, 2
2013, 123, 1
2015, 234, 2
2014, 234, 1
etc...contact, consecutive_years, start_year, end_year
123, 3, 2012, 2015
123, 2, 2009, 2011
234, 2, 2013, 2015
234, 1, 2010, 2011Context
StackExchange Database Administrators Q#104485, answer score: 3
Revisions (0)
No revisions yet.