patternsqlMinor
What order should my index's columns be for a SELECT a, b, MAX( c ) FROM d WHERE e = 1 GROUP BY a, b?
Viewed 0 times
ordercolumnswhatgroupwhereselectmaxforshouldindex
Problem
Supposing I have this table (in a multi-tenant SaaS database:
I have a
-
I currently have foreign-key-covering (non-clustered) indexes over
-
Without any other extra
-
So I need to create a good index for this
CREATE TABLE dbo.Messages (
TenantId int NOT NULL,
RecipientId int NOT NULL,
MessageId int NOT NULL IDENTITY,
SentUtc datetime2(7) NOT NULL,
IsDeleted bit NOT NULL, /* Less than 5% of rows have IsDeleted = 1 */
-- Body, Subject, etc
CONSTRAINT PK_Messages PRIMARY KEY ( MessageId, TenantId ),
CONSTRAINT UK_Messages UNIQUE ( MessageId ),
CONSTRAINT FK_Messages_Tenants FOREIGN KEY ( TenantId ) REFERENCES dbo.Tenants ( TenantId ),
CONSTRAINT FK_Messages_Recipients FOREIGN KEY ( RecipientId, TenantId ) REFERENCES dbo.Recipients ( RecipientId, TenantId )
);I have a
VIEW that is meant to get the latest MessageId for each RecipientId:CREATE VIEW dbo.LatestMessages
WITH SCHEMABINDING
AS
WITH latestMessageUtcPerRecipient AS (
SELECT
m.TenantId,
m.RecipientId,
MAX( m.SentUtc ) AS MaxSentUtc
FROM
dbo.Messages AS m
WHERE
m.IsDeleted = 0
GROUP BY
m.TenantId,
m.RecipientId
)
SELECT
lm.TenantId,
lm.RecipientId,
m .MessageId,
m .SentUtc
FROM
dbo.Messages AS m
INNER JOIN latestMessageUtcPerRecipient AS lm ON
m.TenantId = lm.TenantId
AND
m.RecipientId = lm.RecipientId
AND
m.SentUtc = lm.MaxSentUtc-
I currently have foreign-key-covering (non-clustered) indexes over
FX_Tenants ( TenantId ) and FX_Recipients ( RecipientId, TenantId ).-
Without any other extra
INDEX objects on dbo.Messages, this VIEW's execution-plan currently performs an Index Scan on the FX_Recipients index where it reads 14,000% of the final query rows (erk!)-
So I need to create a good index for this
VIEW's query that covers the TenantId, RecipientId, and SentUtc - and I understand I should exclude IsDeleted from the index as is a low-selectivity column) - the problSolution
Index and View
I would probably replace the
There's not too much benefit to
You could make this index filtered on
You would also need to be careful that your queries don't ever supply a parameterized value for
Add the missing
Now, a query on the view uses a very efficient Segment Top plan:
Alternate Views
If it were my code, I might express the view equivalently as:
But this is mostly a question of style.
You might also prefer:
This definition can also work well with the suggested index:
Window Functions
I would probably avoid writing the view with a
For example:
With the windowing function approach, the predicate on
This aspect is not a concern if you are running SQL Server 2017 CU30 or later with query optimizer hotfixes enabled, or SQL Server 2022 (where the hotfix setting is not required). In those configurations, the optimizer will push the predicate down to obtain a seek.
I would probably replace the
FX indexes with:CREATE NONCLUSTERED INDEX
[dbo.Messages TenantID, RecipientId, SentUtc, (MessageId, IsDeleted)]
ON dbo.[Messages]
(TenantID ASC, RecipientId ASC, SentUtc ASC)
INCLUDE
(MessageId, IsDeleted);There's not too much benefit to
MessageId being an INCLUDE rather than at the end of the key. It will only be non-key at the leaf of the index because the index is not specified as unique. MessageId is part of the clustering key so it will be part of the nonclustered index key above the leaf level. It doesn't make a huge amount of difference either way.You could make this index filtered on
IsDeleted = 0 but whether this is worthwhile depends on how many deleted records you have and how your foreign keys are set up for deleted rows.You would also need to be careful that your queries don't ever supply a parameterized value for
IsDeleted (or that a supplied literal isn't simple/forced parameterized on the server) because then the filtered index can't be matched. There are also other issues. Initially, I probably wouldn't bother with it.Add the missing
IsDeleted = 0 predicate to your view:CREATE OR ALTER VIEW dbo.LatestMessages
WITH SCHEMABINDING
AS
WITH latestMessageUtcPerRecipient AS (
SELECT
m.TenantId,
m.RecipientId,
MAX( m.SentUtc ) AS MaxSentUtc
FROM
dbo.Messages AS m
WHERE
m.IsDeleted = 0
GROUP BY
m.TenantId,
m.RecipientId
)
SELECT
lm.TenantId,
lm.RecipientId,
m .MessageId,
m .SentUtc
FROM
dbo.Messages AS m
INNER JOIN latestMessageUtcPerRecipient AS lm ON
m.TenantId = lm.TenantId
AND
m.RecipientId = lm.RecipientId
AND
m.SentUtc = lm.MaxSentUtc
WHERE
m.IsDeleted = 0; -- This was missingNow, a query on the view uses a very efficient Segment Top plan:
SELECT LM.*
FROM dbo.LatestMessages AS LM;Alternate Views
If it were my code, I might express the view equivalently as:
CREATE OR ALTER VIEW dbo.LatestMessages
WITH SCHEMABINDING
AS
SELECT
M.TenantId,
M.RecipientId,
M.MessageId,
M.SentUtc
FROM dbo.[Messages] AS M
WHERE
M.IsDeleted = 0
AND M.SentUtc =
(
SELECT
MAX(M2.SentUtc)
FROM dbo.[Messages] AS M2
WHERE
-- Correlations
M2.TenantId = M.TenantId
AND M2.RecipientId = M.RecipientId
AND M2.IsDeleted = M.IsDeleted
);But this is mostly a question of style.
You might also prefer:
CREATE OR ALTER VIEW dbo.LatestMessages
WITH SCHEMABINDING
AS
SELECT
TR.TenantId,
TR.RecipientId,
T1.MessageId,
T1.SentUtc
FROM
(
SELECT
R.TenantId,
R.RecipientId
FROM dbo.Recipients AS R
) AS TR
CROSS APPLY
(
SELECT TOP (1)
M.MessageId,
M.SentUtc
FROM dbo.[Messages] AS M
WHERE
M.TenantId = TR.TenantId
AND M.RecipientId = TR.RecipientId
AND M.IsDeleted = 0
ORDER BY
M.SentUtc DESC
) AS T1;This definition can also work well with the suggested index:
Window Functions
I would probably avoid writing the view with a
DENSE_RANK windowing function because SQL Server has had some problems pushing parameterized or variable predicates down into such views. As I mention in the linked article, you can work around this by replacing the view with a parameterized inline TVF, though that is not always convenient.For example:
DECLARE @TenantId integer = 1;
SELECT
LM.*
FROM dbo.LatestMessages AS LM
WHERE
LM.TenantId = @TenantId;With the windowing function approach, the predicate on
TenantId will not be pushed past the function, resulting in an index scan where you would expect a seek. It can be hard to spot because the stuck filter ends up combined with the existing filter for DENSE_RANK = 1.This aspect is not a concern if you are running SQL Server 2017 CU30 or later with query optimizer hotfixes enabled, or SQL Server 2022 (where the hotfix setting is not required). In those configurations, the optimizer will push the predicate down to obtain a seek.
Code Snippets
CREATE NONCLUSTERED INDEX
[dbo.Messages TenantID, RecipientId, SentUtc, (MessageId, IsDeleted)]
ON dbo.[Messages]
(TenantID ASC, RecipientId ASC, SentUtc ASC)
INCLUDE
(MessageId, IsDeleted);CREATE OR ALTER VIEW dbo.LatestMessages
WITH SCHEMABINDING
AS
WITH latestMessageUtcPerRecipient AS (
SELECT
m.TenantId,
m.RecipientId,
MAX( m.SentUtc ) AS MaxSentUtc
FROM
dbo.Messages AS m
WHERE
m.IsDeleted = 0
GROUP BY
m.TenantId,
m.RecipientId
)
SELECT
lm.TenantId,
lm.RecipientId,
m .MessageId,
m .SentUtc
FROM
dbo.Messages AS m
INNER JOIN latestMessageUtcPerRecipient AS lm ON
m.TenantId = lm.TenantId
AND
m.RecipientId = lm.RecipientId
AND
m.SentUtc = lm.MaxSentUtc
WHERE
m.IsDeleted = 0; -- This was missingSELECT LM.*
FROM dbo.LatestMessages AS LM;CREATE OR ALTER VIEW dbo.LatestMessages
WITH SCHEMABINDING
AS
SELECT
M.TenantId,
M.RecipientId,
M.MessageId,
M.SentUtc
FROM dbo.[Messages] AS M
WHERE
M.IsDeleted = 0
AND M.SentUtc =
(
SELECT
MAX(M2.SentUtc)
FROM dbo.[Messages] AS M2
WHERE
-- Correlations
M2.TenantId = M.TenantId
AND M2.RecipientId = M.RecipientId
AND M2.IsDeleted = M.IsDeleted
);CREATE OR ALTER VIEW dbo.LatestMessages
WITH SCHEMABINDING
AS
SELECT
TR.TenantId,
TR.RecipientId,
T1.MessageId,
T1.SentUtc
FROM
(
SELECT
R.TenantId,
R.RecipientId
FROM dbo.Recipients AS R
) AS TR
CROSS APPLY
(
SELECT TOP (1)
M.MessageId,
M.SentUtc
FROM dbo.[Messages] AS M
WHERE
M.TenantId = TR.TenantId
AND M.RecipientId = TR.RecipientId
AND M.IsDeleted = 0
ORDER BY
M.SentUtc DESC
) AS T1;Context
StackExchange Database Administrators Q#330840, answer score: 8
Revisions (0)
No revisions yet.