HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Calculate Total Visits

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
totalcalculatevisits

Problem

I am trying to write a query where I have to calculate number of visits for a customer by taking care of overlapping days. Suppose for itemID 2009 start date is 23rd and end date is 26th therefore item 20010 is between these days we will not add this purchase date to our total count.

Example Scenario:

Item ID Start Date   End Date   Number of days     Number of days Candidate for visit count
20009   2015-01-23  2015-01-26     4                      4
20010   2015-01-24  2015-01-24     1                      0
20011   2015-01-23  2015-01-26     4                      0
20012   2015-01-23  2015-01-27     5                      1
20013   2015-01-23  2015-01-27     5                      0
20014   2015-01-29  2015-01-30     2                      2


OutPut should be 7 VisitDays

Input Table:

CREATE TABLE #Items    
(
CustID INT,
ItemID INT,
StartDate DATETIME,
EndDate DATETIME
)           

INSERT INTO #Items
SELECT 11205, 20009, '2015-01-23',  '2015-01-26'  
UNION ALL 
SELECT 11205, 20010, '2015-01-24',  '2015-01-24'    
UNION ALL  
SELECT 11205, 20011, '2015-01-23',  '2015-01-26' 
UNION ALL  
SELECT 11205, 20012, '2015-01-23',  '2015-01-27'  
UNION ALL  
SELECT 11205, 20012, '2015-01-23',  '2015-01-27'   
UNION ALL  
SELECT 11205, 20012, '2015-01-28',  '2015-01-29'


I have tried so far:

```
CREATE TABLE #VisitsTable
(
StartDate DATETIME,
EndDate DATETIME
)

INSERT INTO #VisitsTable
SELECT DISTINCT
StartDate,
EndDate
FROM #Items items
WHERE CustID = 11205
ORDER BY StartDate ASC

IF EXISTS (SELECT TOP 1 1 FROM #VisitsTable)
BEGIN

SELECT ISNULL(SUM(VisitDays),1)
FROM ( SELECT DISTINCT
abc.StartDate,
abc.EndDate,
DATEDIFF(DD, abc.StartDate, abc.EndDate) + 1 VisitDays
FROM #VisitsTable abc
INNER JOIN #VisitsTable bc ON bc.StartDate NOT BETWEEN abc.StartDate AND ab

Solution

There are a lot of questions and articles about packing time intervals. For example, Packing Intervals by Itzik Ben-Gan.

You can pack your intervals for the given user. Once packed, there will be no overlaps, so you can simply sum up the durations of packed intervals.

If your intervals are dates without times, I'd use a Calendar table. This table simply has a list of dates for several decades. If you do not have a Calendar table, simply create one:

CREATE TABLE [dbo].[Calendar](
    [dt] [date] NOT NULL,
CONSTRAINT [PK_Calendar] PRIMARY KEY CLUSTERED 
(
    [dt] ASC
));


There are many ways to populate such a table.

For example, 100K rows (~270 years) from 1900-01-01:

INSERT INTO dbo.Calendar (dt)
SELECT TOP (100000) 
    DATEADD(day, ROW_NUMBER() OVER (ORDER BY s1.[object_id])-1, '19000101') AS dt
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
OPTION (MAXDOP 1);


See also Why are numbers tables "invaluable"?

Once you have a Calendar table, here is how to use it.

Each original row is joined with the Calendar table to return as many rows as there are dates between StartDate and EndDate.

Then we count distinct dates, which removes overlapping dates.

SELECT COUNT(DISTINCT CA.dt) AS TotalCount
FROM
    #Items AS T
    CROSS APPLY
    (
        SELECT dbo.Calendar.dt
        FROM dbo.Calendar
        WHERE
            dbo.Calendar.dt >= T.StartDate
            AND dbo.Calendar.dt <= T.EndDate
    ) AS CA
WHERE T.CustID = 11205
;


Result

TotalCount
7

Code Snippets

CREATE TABLE [dbo].[Calendar](
    [dt] [date] NOT NULL,
CONSTRAINT [PK_Calendar] PRIMARY KEY CLUSTERED 
(
    [dt] ASC
));
INSERT INTO dbo.Calendar (dt)
SELECT TOP (100000) 
    DATEADD(day, ROW_NUMBER() OVER (ORDER BY s1.[object_id])-1, '19000101') AS dt
FROM sys.all_objects AS s1 CROSS JOIN sys.all_objects AS s2
OPTION (MAXDOP 1);
SELECT COUNT(DISTINCT CA.dt) AS TotalCount
FROM
    #Items AS T
    CROSS APPLY
    (
        SELECT dbo.Calendar.dt
        FROM dbo.Calendar
        WHERE
            dbo.Calendar.dt >= T.StartDate
            AND dbo.Calendar.dt <= T.EndDate
    ) AS CA
WHERE T.CustID = 11205
;
TotalCount
7

Context

StackExchange Database Administrators Q#130141, answer score: 8

Revisions (0)

No revisions yet.