patternsqlMinor

Best Way to Index Multiple Effective Date Columns

Submitted by: @import:stackexchange-dba·Mar 10, 2026·

Viewed 0 times

columnswaydatemultipleindexeffectivebest

Problem

I've got a table that is primarily to represent relationships between entities (i.e. mostly composed of foreign keys). These relationships change over time, and so the table has a StartDate and an EndDate column. I now need to add another dimension of start and end dates, meaning that the relationship can be viewed with two different date "lenses" (queried with two dates, @Date1 and @Date2), so the schema will look something like this:

MyJoinTable:

| Id | Entity1Id | Entity2Id | StartDate1 |  EndDate1  | StartDate2 |  EndDate2  |
|----|-----------|-----------|------------|------------|------------|------------|
|  1 | A         | B         | 1753-01-01 | 2018-09-01 | 1753-01-01 | 2025-01-01 |
|  2 | A         | B         | 2018-09-01 | 2018-10-01 | 1753-01-01 | 2018-11-01 |
|  3 | A         | C         | 2018-09-01 | 2018-10-01 | 2018-11-01 | 2025-01-01 |
|  4 | A         | B         | 2018-10-01 | 2025-01-01 | 1753-01-01 | 2018-11-01 |
|  5 | A         | D         | 2018-10-01 | 2025-10-01 | 2018-11-01 | 2025-01-01 |

The queries will mostly be joins to this table, for example:

SELECT e1.Field, e2.Field
FROM Entity1 e1
INNER JOIN MyJoinTable jt ON jt.Entity1Id = e1.Id
    AND StartDate1  @Date1
    AND StartDate2  @Date2
INNER JOIN Entity2 e2 ON e2.Id = jt.Entity2Id

My questions are:

What's the best way to index this join table?

Index on Entity1Id

Index on Entity2Id

Composite Index on all four date columns? (StartDate1, EndDate1, StartDate2, EndDate2)

What's the best way to constrain the DB so that I only get one relationship row returned for any @Date1, @Date2 combination?

Do you have any recommendations on a better data model?

Solution

Introduction

I do not have the time for extensive testing, but can suggest from where
to start.

If you rewrite the query in a more symmetrical manner, to
emphasise that both entities are joined to a two-dimensional
cross-section of MyJoinTable:

SELECT E1.Field, E2.Field
FROM MyJoinTable JT
JOIN Entity1 E1 ON E1.Id = JT.Entity1Id
JOIN Entity2 E2 ON E2.Id = JT.Entity2Id
WHERE
    StartDate1  @date1 AND
    StartDate2  @date2

you will see that one reasonably efficient way of executing it is first
to extract that cross-section and then to join the entities to it.
The following equivalent query will serve for illustration purposes
(but do not use it!):

-- A sample to demostrate the desired order or execution:
SELECT E1.Field, E2.Field
FROM
-- 1. Calculate the cross-section:
(   SELECT JT.Entity1Id, JT.Entity2Id
    FROM MyJoinTable JT
    WHERE 
        StartDate1  @date1 AND
        StartDate2  @date2
) SECT
-- 2. Join the entity tables:
JOIN Entity1 E1 ON E1.Id = SECT.Entity1Id
JOIN Entity2 E2 ON E2.Id = SECT.Entity2Id

It is the more important since you say that there is only

one relationship row returned for any @Date1, @Date2 combination.

For this reason, joining the entities in before MyJoinTable is reduced
to the required cross-section by @date1 and @date2 is inefficient as
it is likely to have many records for each entity so that the result
will have too many rows. The entites are best joined at the end, using
their natural key field, Id, which I assume is the clustered index.
The solutions below, therefore, propose different ways of calculating
this cross-section, corresponding to the SECT subquery of the example
above—
Soution I: The easiest

Let us try to add useful indices to the original query. Since MSSQL's
composite indices are hierarchical, they are useless in optimising
interval comparisons, so the best we can do with the given structure is
to index one of the date fields, yet make sure to cover all the other
fields required form the table:

CREATE NONCLUSTERED INDEX SD1 ON MyJoinTable ( StartDate1 )
INCLUDE (Entity1Id, Entity2Id, StartDate2, EndDate1, EndDate2 )

The query then will be executed in the following order:

-
Using the index SD1, perform an index seek on MyJoinTable
to find records by StartDate1, and filter them by StartDate2,
EndDate1, and EndDate2 in the
residual predicate

-
Join the entity tables using their natural key Id.

This method is not very efficient because only one of the four date
predicates is fully optimised by an index, resudual predicates being
boring data grinders.
Solution II: Set intersection

Another symmetrical way to obtain the two-dimensional JOIN
cross-section goes through INTERSECTing the results of the four date
predicates:

SELECT Entity1Id, Entity2Id
FROM
(   SELECT Id, Entity1Id, Entity2Id FROM MyJoinTable
    WHERE @date1 >= StartDate1
    INTERSECT
    SELECT Id, Entity1Id, Entity2Id FROM MyJoinTable
    WHERE @date2 >= StartDate2
    INTERSECT
    SELECT Id, Entity1Id, Entity2Id FROM MyJoinTable
    WHERE @date1 <  EndDate1
    INTERSECT
    SELECT Id, Entity1Id, Entity2Id FROM MyJoinTable
    WHERE @date2 <  EndDate2 )
SECT

With the suitable indices,

CREATE NONCLUSTERED INDEX SD1 ON MyJoinTable (StartDate1)
    INCLUDE (Id, Entity1Id, Entity2Id)
CREATE NONCLUSTERED INDEX SD2 ON MyJoinTable (StartDate2)
    INCLUDE (Id, Entity1Id, Entity2Id)
CREATE NONCLUSTERED INDEX ED1 ON MyJoinTable (EndDate1  )
    INCLUDE (Id, Entity1Id, Entity2Id)
CREATE NONCLUSTERED INDEX ED2 ON MyJoinTable (EndDate2  )
    INCLUDE (Id, Entity1Id, Entity2Id)

the plan for this query includes only clean index seek (wihout residual
predicates) and hash match operations.

Observe that the entity keys in query and indices present a
fourfold redundancy, which may be removed—at the expense of a more
complicated execution plan—by joining the entities in separately:

SELECT JT.Entity1Id, JT.Entity2Id
FROM
(   SELECT Id FROM MyJoinTable WHERE @date1 >= StartDate1
    INTERSECT
    SELECT Id FROM MyJoinTable WHERE @date2 >= StartDate2
    INTERSECT
    SELECT Id FROM MyJoinTable WHERE @date1 <  EndDate1
    INTERSECT
    SELECT Id FROM MyJoinTable WHERE @date2 <  EndDate2 )
SECT
JOIN MyJoinTable JT ON JT.Id = SECT.Id

with these indices:

CREATE NONCLUSTERED INDEX SD1 ON MyJoinTable (StartDate1) INCLUDE (Id)
CREATE NONCLUSTERED INDEX SD2 ON MyJoinTable (StartDate2) INCLUDE (Id)
CREATE NONCLUSTERED INDEX ED1 ON MyJoinTable (EndDate1  ) INCLUDE (Id)
CREATE NONCLUSTERED INDEX ED2 ON MyJoinTable (EndDate2  ) INCLUDE (Id)

But since in either case each of four date predicates, bounding the date
on one end only, does not reduce the amount of data sufficiently, hash
matches may have to
spill
data into tempdb. If they do, this method is not fit for your
environment.
Solution III: A compromise

Now we can merge solutions I & II in order to come up with a plan that
does not require so much RAM an

Code Snippets

SELECT E1.Field, E2.Field
FROM MyJoinTable JT
JOIN Entity1 E1 ON E1.Id = JT.Entity1Id
JOIN Entity2 E2 ON E2.Id = JT.Entity2Id
WHERE
    StartDate1 <= @date1 AND EndDate1 > @date1 AND
    StartDate2 <= @date2 AND EndDate2 > @date2

-- A sample to demostrate the desired order or execution:
SELECT E1.Field, E2.Field
FROM
-- 1. Calculate the cross-section:
(   SELECT JT.Entity1Id, JT.Entity2Id
    FROM MyJoinTable JT
    WHERE 
        StartDate1 <= @date1 AND EndDate1 > @date1 AND
        StartDate2 <= @date2 AND EndDate2 > @date2
) SECT
-- 2. Join the entity tables:
JOIN Entity1 E1 ON E1.Id = SECT.Entity1Id
JOIN Entity2 E2 ON E2.Id = SECT.Entity2Id

CREATE NONCLUSTERED INDEX SD1 ON MyJoinTable ( StartDate1 )
INCLUDE (Entity1Id, Entity2Id, StartDate2, EndDate1, EndDate2 )

SELECT Entity1Id, Entity2Id
FROM
(   SELECT Id, Entity1Id, Entity2Id FROM MyJoinTable
    WHERE @date1 >= StartDate1
    INTERSECT
    SELECT Id, Entity1Id, Entity2Id FROM MyJoinTable
    WHERE @date2 >= StartDate2
    INTERSECT
    SELECT Id, Entity1Id, Entity2Id FROM MyJoinTable
    WHERE @date1 <  EndDate1
    INTERSECT
    SELECT Id, Entity1Id, Entity2Id FROM MyJoinTable
    WHERE @date2 <  EndDate2 )
SECT

CREATE NONCLUSTERED INDEX SD1 ON MyJoinTable (StartDate1)
    INCLUDE (Id, Entity1Id, Entity2Id)
CREATE NONCLUSTERED INDEX SD2 ON MyJoinTable (StartDate2)
    INCLUDE (Id, Entity1Id, Entity2Id)
CREATE NONCLUSTERED INDEX ED1 ON MyJoinTable (EndDate1  )
    INCLUDE (Id, Entity1Id, Entity2Id)
CREATE NONCLUSTERED INDEX ED2 ON MyJoinTable (EndDate2  )
    INCLUDE (Id, Entity1Id, Entity2Id)

Context

StackExchange Database Administrators Q#220489, answer score: 4

Revisions (0)

No revisions yet.