patternMinor
Total of orders open on a given date for each date in a date range
Viewed 0 times
totaleachrangeopengivendatefororders
Problem
This question is similar to Running total with count?, but please allow me to explain some further twists on the issue.
I'm using SQL Server 2008, so the cursor option described by Aaron Bertrand seems to be the most promising in terms of speed.
What's different here, though, is I have to take into account two dates for a single item. So, given an OrderID item within a table of Orders, the OrderID has an Opened Date and a Closed Date. The Opened Date is always populated, but the Closed Date could be
I need to - efficiently - get back how many Orders had what we could term an "Open" status on any given date within a date range. The date range could span a couple of years.
(Yes, I do have a reference table of sequential dates.)
I'm using SQL Server 2008, so the cursor option described by Aaron Bertrand seems to be the most promising in terms of speed.
What's different here, though, is I have to take into account two dates for a single item. So, given an OrderID item within a table of Orders, the OrderID has an Opened Date and a Closed Date. The Opened Date is always populated, but the Closed Date could be
NULL.OrderID OpenedDate ClosedDate
654554 12/1/2011 5/4/2012
678451 12/4/2011 3/2/2012
679565 12/8/2011 5/21/2012
701541 5/23/2012 NULL
...I need to - efficiently - get back how many Orders had what we could term an "Open" status on any given date within a date range. The date range could span a couple of years.
(Yes, I do have a reference table of sequential dates.)
Date CountOfOpenOrders
12/1/2011 175
12/2/2011 178
12/3/2011 195
12/4/2011 192
12/5/2011 191
...Solution
If your priority is speed of selects, the following approach allows for extremely fast selects. Instead of storing a period, you want to store two events (period start and period end). Change column is 1 when the period begins, and is -1 when the period ends. If more than one event occurs on the same day, they must have different EventNumberPerDay. RunningTotal is the number of open peroids after the event has happened:
Also you need a calendar table:
Once that is accomplished, then your select is very simple and very fast:
Of course, this query is only correct if the data in Events table is valid. We can use constraints to ensure 100% data integrity. I can explain how if you are interested.
Another alternative is to just load your raw data, your periods, into a client application - your problem is absolutely trivial in C++/C#/Java.
Yet another approach is to use an RDBMS with fast cursors such as Oracle - that will allow you to just write a simple cursor and enjoy good performance, but still not always as good as my first solution.
CREATE TABLE dbo.Events(
PeriodId INT NOT NULL,
Change SMALLINT NOT NULL,
ChangeDate DATE NOT NULL,
EventNumberPerDay INT NOT NULL,
RunningTotal INT NOT NULL);
GO
INSERT dbo.Events
( PeriodId ,
Change ,
ChangeDate ,
EventNumberPerDay,
RunningTotal
)
-- first period begins
VALUES ( 1 , 1, '20120801', 1, 1),
-- second period begins on the same day
(2, 1, '20120801', 2, 2),
-- third period begins
(3,1,'20120803',1, 3),
-- second period ends on the same day
(2,-1,'20120803',2, 2),
-- fourth period begins
(4,1,'20120804',1,3),
-- fourth period ends
(4,-1,'20120805',1,2),
-- first period ends
(1, -1, '20120808',1, 1),
-- third period ends
(3, -1, '20120809',1, 0);
GOAlso you need a calendar table:
CREATE TABLE dbo.Calendar([Date] DATE NOT NULL);
GO
INSERT INTO dbo.Calendar([Date])
VALUES('20120801'),
('20120802'),
('20120803'),
('20120804'),
('20120805'),
('20120806'),
('20120807'),
('20120808'),
('20120809'),
('20120810'),
('20120811');Once that is accomplished, then your select is very simple and very fast:
SELECT [Date] ,
coalesce(RunningTotal, 0) AS NumOpenIntervals
FROM dbo.Calendar
OUTER APPLY ( SELECT TOP ( 1 )
RunningTotal
FROM dbo.Events
WHERE ChangeDate <= [Date]
ORDER BY ChangeDate DESC, EventNumberPerDay DESC
) AS t
ORDER BY [Date]Of course, this query is only correct if the data in Events table is valid. We can use constraints to ensure 100% data integrity. I can explain how if you are interested.
Another alternative is to just load your raw data, your periods, into a client application - your problem is absolutely trivial in C++/C#/Java.
Yet another approach is to use an RDBMS with fast cursors such as Oracle - that will allow you to just write a simple cursor and enjoy good performance, but still not always as good as my first solution.
Code Snippets
CREATE TABLE dbo.Events(
PeriodId INT NOT NULL,
Change SMALLINT NOT NULL,
ChangeDate DATE NOT NULL,
EventNumberPerDay INT NOT NULL,
RunningTotal INT NOT NULL);
GO
INSERT dbo.Events
( PeriodId ,
Change ,
ChangeDate ,
EventNumberPerDay,
RunningTotal
)
-- first period begins
VALUES ( 1 , 1, '20120801', 1, 1),
-- second period begins on the same day
(2, 1, '20120801', 2, 2),
-- third period begins
(3,1,'20120803',1, 3),
-- second period ends on the same day
(2,-1,'20120803',2, 2),
-- fourth period begins
(4,1,'20120804',1,3),
-- fourth period ends
(4,-1,'20120805',1,2),
-- first period ends
(1, -1, '20120808',1, 1),
-- third period ends
(3, -1, '20120809',1, 0);
GOCREATE TABLE dbo.Calendar([Date] DATE NOT NULL);
GO
INSERT INTO dbo.Calendar([Date])
VALUES('20120801'),
('20120802'),
('20120803'),
('20120804'),
('20120805'),
('20120806'),
('20120807'),
('20120808'),
('20120809'),
('20120810'),
('20120811');SELECT [Date] ,
coalesce(RunningTotal, 0) AS NumOpenIntervals
FROM dbo.Calendar
OUTER APPLY ( SELECT TOP ( 1 )
RunningTotal
FROM dbo.Events
WHERE ChangeDate <= [Date]
ORDER BY ChangeDate DESC, EventNumberPerDay DESC
) AS t
ORDER BY [Date]Context
StackExchange Database Administrators Q#23663, answer score: 4
Revisions (0)
No revisions yet.