patternsqlMinor
Calculate Total Visits
Viewed 0 times
totalcalculatevisits
Problem
I am trying to write a query where I have to calculate number of visits for a customer by taking care of overlapping days. Suppose for itemID 2009 start date is 23rd and end date is 26th therefore item 20010 is between these days we will not add this purchase date to our total count.
Example Scenario:
OutPut should be 7 VisitDays
Input Table:
I have tried so far:
```
CREATE TABLE #VisitsTable
(
StartDate DATETIME,
EndDate DATETIME
)
INSERT INTO #VisitsTable
SELECT DISTINCT
StartDate,
EndDate
FROM #Items items
WHERE CustID = 11205
ORDER BY StartDate ASC
IF EXISTS (SELECT TOP 1 1 FROM #VisitsTable)
BEGIN
SELECT ISNULL(SUM(VisitDays),1)
FROM ( SELECT DISTINCT
abc.StartDate,
abc.EndDate,
DATEDIFF(DD, abc.StartDate, abc.EndDate) + 1 VisitDays
FROM #VisitsTable abc
INNER JOIN #VisitsTable bc ON bc.StartDate NOT BETWEEN abc.StartDate AND ab
Example Scenario:
Item ID Start Date End Date Number of days Number of days Candidate for visit count
20009 2015-01-23 2015-01-26 4 4
20010 2015-01-24 2015-01-24 1 0
20011 2015-01-23 2015-01-26 4 0
20012 2015-01-23 2015-01-27 5 1
20013 2015-01-23 2015-01-27 5 0
20014 2015-01-29 2015-01-30 2 2OutPut should be 7 VisitDays
Input Table:
CREATE TABLE #Items
(
CustID INT,
ItemID INT,
StartDate DATETIME,
EndDate DATETIME
)
INSERT INTO #Items
SELECT 11205, 20009, '2015-01-23', '2015-01-26'
UNION ALL
SELECT 11205, 20010, '2015-01-24', '2015-01-24'
UNION ALL
SELECT 11205, 20011, '2015-01-23', '2015-01-26'
UNION ALL
SELECT 11205, 20012, '2015-01-23', '2015-01-27'
UNION ALL
SELECT 11205, 20012, '2015-01-23', '2015-01-27'
UNION ALL
SELECT 11205, 20012, '2015-01-28', '2015-01-29'I have tried so far:
```
CREATE TABLE #VisitsTable
(
StartDate DATETIME,
EndDate DATETIME
)
INSERT INTO #VisitsTable
SELECT DISTINCT
StartDate,
EndDate
FROM #Items items
WHERE CustID = 11205
ORDER BY StartDate ASC
IF EXISTS (SELECT TOP 1 1 FROM #VisitsTable)
BEGIN
SELECT ISNULL(SUM(VisitDays),1)
FROM ( SELECT DISTINCT
abc.StartDate,
abc.EndDate,
DATEDIFF(DD, abc.StartDate, abc.EndDate) + 1 VisitDays
FROM #VisitsTable abc
INNER JOIN #VisitsTable bc ON bc.StartDate NOT BETWEEN abc.StartDate AND ab
Solution
There are a lot of questions and articles about packing time intervals. For example, Packing Intervals by Itzik Ben-Gan.
You can pack your intervals for the given user. Once packed, there will be no overlaps, so you can simply sum up the durations of packed intervals.
If your intervals are dates without times, I'd use a
There are many ways to populate such a table.
For example, 100K rows (~270 years) from 1900-01-01:
See also Why are numbers tables "invaluable"?
Once you have a
Each original row is joined with the
Then we count distinct dates, which removes overlapping dates.
Result
You can pack your intervals for the given user. Once packed, there will be no overlaps, so you can simply sum up the durations of packed intervals.
If your intervals are dates without times, I'd use a
Calendar table. This table simply has a list of dates for several decades. If you do not have a Calendar table, simply create one:CREATE TABLE [dbo].[Calendar](
[dt] [date] NOT NULL,
CONSTRAINT [PK_Calendar] PRIMARY KEY CLUSTERED
(
[dt] ASC
));There are many ways to populate such a table.
For example, 100K rows (~270 years) from 1900-01-01:
INSERT INTO dbo.Calendar (dt)
SELECT TOP (100000)
DATEADD(day, ROW_NUMBER() OVER (ORDER BY s1.[object_id])-1, '19000101') AS dt
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
OPTION (MAXDOP 1);See also Why are numbers tables "invaluable"?
Once you have a
Calendar table, here is how to use it.Each original row is joined with the
Calendar table to return as many rows as there are dates between StartDate and EndDate.Then we count distinct dates, which removes overlapping dates.
SELECT COUNT(DISTINCT CA.dt) AS TotalCount
FROM
#Items AS T
CROSS APPLY
(
SELECT dbo.Calendar.dt
FROM dbo.Calendar
WHERE
dbo.Calendar.dt >= T.StartDate
AND dbo.Calendar.dt <= T.EndDate
) AS CA
WHERE T.CustID = 11205
;Result
TotalCount
7Code Snippets
CREATE TABLE [dbo].[Calendar](
[dt] [date] NOT NULL,
CONSTRAINT [PK_Calendar] PRIMARY KEY CLUSTERED
(
[dt] ASC
));INSERT INTO dbo.Calendar (dt)
SELECT TOP (100000)
DATEADD(day, ROW_NUMBER() OVER (ORDER BY s1.[object_id])-1, '19000101') AS dt
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
OPTION (MAXDOP 1);SELECT COUNT(DISTINCT CA.dt) AS TotalCount
FROM
#Items AS T
CROSS APPLY
(
SELECT dbo.Calendar.dt
FROM dbo.Calendar
WHERE
dbo.Calendar.dt >= T.StartDate
AND dbo.Calendar.dt <= T.EndDate
) AS CA
WHERE T.CustID = 11205
;TotalCount
7Context
StackExchange Database Administrators Q#130141, answer score: 8
Revisions (0)
No revisions yet.