HiveBrain v1.2.0
Get Started
← Back to all entries
snippetMinor

How can I improve the performance of a sorted join?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
canthejoinimproveperformancehowsorted

Problem

This feels like such a common question, I'll understand if it is closed but if so please suggest a better place I could ask. I have the following two tables of interest:

CREATE TABLE [dbo].[Sessions]
(
    [Id] [int] PRIMARY KEY,
    [DateConnected] [datetime] NOT NULL,
    [Origin] [nvarchar](max) NULL,
    [TrackerId] [int] NULL,
    [Imei] [nvarchar](max) NULL,
    [Sim] [nvarchar](max) NULL,
    [ProtocolVersion] [tinyint] NULL
)

CREATE TABLE [dbo].[PacketTransmissions]
(
    [Id] [int] PRIMARY KEY,
    [RequestId] [int] NULL,
    [SessionId] [int] NOT NULL,
    [DateProcessed] [datetime] NOT NULL,
    [Direction] [int] NOT NULL,
    [Sequence] [int] NOT NULL,
    [Acknowledgement] [int] NOT NULL,
    [DateRecorded] [datetime] NOT NULL,
    [Version] [tinyint] NOT NULL,
    [Command] [tinyint] NOT NULL,
    [Flags] [tinyint] NOT NULL,
    [Checksum] [tinyint] NOT NULL,
    [Data] [varbinary](max) NULL
)

CREATE NONCLUSTERED INDEX [IX_TrackerId_DateConnected] ON [dbo].[Sessions]
(
    [TrackerId] ASC,
    [DateConnected] ASC
)

CREATE NONCLUSTERED INDEX [IX_SessionId_DateProcessed] ON [dbo].[PacketTransmissions]
(
    [SessionId] ASC,
    [DateProcessed] ASC
)
INCLUDE ([Direction], [Sequence], [Acknowledgement], [Command])


My most common query, and most expensive (quite often times out now) involves listing all packet transmissions for a particular tracker.

DECLARE @TrackerId INT = 10
DECLARE @StartDate DATETIME2 = '2018-03-10'
DECLARE @EndDate   DATETIME2 = '2018-03-12'

SELECT [PacketTransmissions].*
FROM [Sessions]
JOIN [PacketTransmissions] ON [PacketTransmissions].[SessionId] = [Sessions].[Id]
WHERE [Sessions].[TrackerId] = @TrackerId
AND [PacketTransmissions].[DateProcessed] > @StartDate
AND [PacketTransmissions].[DateProcessed] < @EndDate
ORDER BY [PacketTransmissions].[DateProcessed] DESC


This was good at first, but now there is a lot of data, it has slowed right down. My attempt to get the query plan today took 2 minutes, and shows t

Solution

Perhaps this is crazy, but I like to try a bit of blue-sky-thinking every once in a while, so I'd consider adding the TrackerId column to the dbo.PacketTransmissions table to avoid the join completely. Obviously, this means you need to modify the row-insert procedure for the table, which may or may not be feasible.

However, this change, combined with a simple index:

CREATE INDEX IX_PacketTransmissions ON dbo.PacketTransmissions
(
    TrackerId ASC
    , DateProcessed ASC
) 
INCLUDE (Id); --not strictly required, since the primary key 
              --is always included in every non-clustered index
              --I include them just to be explicit


creates a query plan using a run-of-the-mill index seek, combined with a key lookup for each row returned. As in:

To test this, I created a minimally complete verifiable example:

USE tempdb;

IF OBJECT_ID(N'dbo.Sessions', N'U') IS NOT NULL
DROP TABLE dbo.[Sessions];
IF OBJECT_ID(N'dbo.PacketTransmissions', N'U') IS NOT NULL
DROP TABLE dbo.PacketTransmissions;
GO

CREATE TABLE [dbo].[Sessions]
(
      [Id] int 
        CONSTRAINT PK_Sessions
        PRIMARY KEY CLUSTERED
    , [DateConnected] datetime NOT NULL
    , [Origin] nvarchar(max) NULL
    , [TrackerId] int NULL
    , [Imei] nvarchar(max) NULL
    , [Sim] nvarchar(max) NULL
    , [ProtocolVersion] tinyint NULL
)

CREATE TABLE [dbo].[PacketTransmissions]
(
      [Id] int 
        CONSTRAINT PK_PacketTransmissions 
        PRIMARY KEY CLUSTERED
    , [RequestId] int NULL
    , [SessionId] int NOT NULL
    , [DateProcessed] datetime NOT NULL
    , [Direction] int NOT NULL
    , [Sequence] int NOT NULL
    , [Acknowledgement] int NOT NULL
    , [DateRecorded] datetime NOT NULL
    , [Version] tinyint NOT NULL
    , [Command] tinyint NOT NULL
    , [Flags] tinyint NOT NULL
    , [Checksum] tinyint NOT NULL
    , [Data] varbinary(max) NULL
    , [TrackerId] int NULL
)
GO

INSERT INTO dbo.[Sessions] (Id, DateConnected, Origin, TrackerId, Imei, Sim, ProtocolVersion)
SELECT ROW_NUMBER() OVER (ORDER BY sc1.id)
    , DATEADD(DAY, CONVERT(int, CRYPT_GEN_RANDOM(1)), '2017-01-01 00:00:00')
    , CONVERT(nvarchar(max), CRYPT_GEN_RANDOM(128))
    , CONVERT(int, CRYPT_GEN_RANDOM(1))
    , CONVERT(nvarchar(40), CRYPT_GEN_RANDOM(38))
    , CONVERT(nvarchar(40), CRYPT_GEN_RANDOM(38))
    , CONVERT(tinyint, CRYPT_GEN_RANDOM(1))
FROM sys.syscolumns sc1
    CROSS JOIN sys.syscolumns sc2;

INSERT INTO dbo.PacketTransmissions (Id, RequestId, SessionId, DateProcessed, Direction, Sequence, Acknowledgement, DateRecorded, Version, Command, Flags, Checksum, Data, TrackerId)
SELECT ROW_NUMBER() OVER (ORDER BY s.Id)
    , CONVERT(int, CRYPT_GEN_RANDOM(1))
    , CONVERT(int, CRYPT_GEN_RANDOM(3))
    , DATEADD(DAY, CONVERT(int, CRYPT_GEN_RANDOM(1)), '2017-01-01 00:00:00')
    , CONVERT(int, CRYPT_GEN_RANDOM(1))
    , CONVERT(int, CRYPT_GEN_RANDOM(2))
    , CONVERT(int, CRYPT_GEN_RANDOM(1))
    , DATEADD(DAY, CONVERT(int, CRYPT_GEN_RANDOM(1)), '2017-01-01 00:00:00')
    , CONVERT(int, CRYPT_GEN_RANDOM(1))
    , CONVERT(int, CRYPT_GEN_RANDOM(1))
    , CONVERT(int, CRYPT_GEN_RANDOM(1))
    , CONVERT(int, CRYPT_GEN_RANDOM(1))
    , CRYPT_GEN_RANDOM(128)
    , s.TrackerId
FROM dbo.[Sessions] s
    CROSS JOIN (SELECT v.n
    FROM (VALUES (0), (1))v(n)) v;
GO


On my system, this creates around 700,000 session rows, and double that number of transmission rows.

The query then becomes:

DECLARE @TrackerId int = 100;
DECLARE @StartDate datetime = '2017-03-10';
DECLARE @EndDate   datetime = '2017-03-12';

SELECT [PacketTransmissions].*
FROM [PacketTransmissions] 
WHERE [PacketTransmissions].[TrackerId] = @TrackerId
    AND [PacketTransmissions].[DateProcessed] > @StartDate
    AND [PacketTransmissions].[DateProcessed] < @EndDate
ORDER BY [PacketTransmissions].[DateProcessed] DESC;

Code Snippets

CREATE INDEX IX_PacketTransmissions ON dbo.PacketTransmissions
(
    TrackerId ASC
    , DateProcessed ASC
) 
INCLUDE (Id); --not strictly required, since the primary key 
              --is always included in every non-clustered index
              --I include them just to be explicit
USE tempdb;

IF OBJECT_ID(N'dbo.Sessions', N'U') IS NOT NULL
DROP TABLE dbo.[Sessions];
IF OBJECT_ID(N'dbo.PacketTransmissions', N'U') IS NOT NULL
DROP TABLE dbo.PacketTransmissions;
GO

CREATE TABLE [dbo].[Sessions]
(
      [Id] int 
        CONSTRAINT PK_Sessions
        PRIMARY KEY CLUSTERED
    , [DateConnected] datetime NOT NULL
    , [Origin] nvarchar(max) NULL
    , [TrackerId] int NULL
    , [Imei] nvarchar(max) NULL
    , [Sim] nvarchar(max) NULL
    , [ProtocolVersion] tinyint NULL
)

CREATE TABLE [dbo].[PacketTransmissions]
(
      [Id] int 
        CONSTRAINT PK_PacketTransmissions 
        PRIMARY KEY CLUSTERED
    , [RequestId] int NULL
    , [SessionId] int NOT NULL
    , [DateProcessed] datetime NOT NULL
    , [Direction] int NOT NULL
    , [Sequence] int NOT NULL
    , [Acknowledgement] int NOT NULL
    , [DateRecorded] datetime NOT NULL
    , [Version] tinyint NOT NULL
    , [Command] tinyint NOT NULL
    , [Flags] tinyint NOT NULL
    , [Checksum] tinyint NOT NULL
    , [Data] varbinary(max) NULL
    , [TrackerId] int NULL
)
GO

INSERT INTO dbo.[Sessions] (Id, DateConnected, Origin, TrackerId, Imei, Sim, ProtocolVersion)
SELECT ROW_NUMBER() OVER (ORDER BY sc1.id)
    , DATEADD(DAY, CONVERT(int, CRYPT_GEN_RANDOM(1)), '2017-01-01 00:00:00')
    , CONVERT(nvarchar(max), CRYPT_GEN_RANDOM(128))
    , CONVERT(int, CRYPT_GEN_RANDOM(1))
    , CONVERT(nvarchar(40), CRYPT_GEN_RANDOM(38))
    , CONVERT(nvarchar(40), CRYPT_GEN_RANDOM(38))
    , CONVERT(tinyint, CRYPT_GEN_RANDOM(1))
FROM sys.syscolumns sc1
    CROSS JOIN sys.syscolumns sc2;

INSERT INTO dbo.PacketTransmissions (Id, RequestId, SessionId, DateProcessed, Direction, Sequence, Acknowledgement, DateRecorded, Version, Command, Flags, Checksum, Data, TrackerId)
SELECT ROW_NUMBER() OVER (ORDER BY s.Id)
    , CONVERT(int, CRYPT_GEN_RANDOM(1))
    , CONVERT(int, CRYPT_GEN_RANDOM(3))
    , DATEADD(DAY, CONVERT(int, CRYPT_GEN_RANDOM(1)), '2017-01-01 00:00:00')
    , CONVERT(int, CRYPT_GEN_RANDOM(1))
    , CONVERT(int, CRYPT_GEN_RANDOM(2))
    , CONVERT(int, CRYPT_GEN_RANDOM(1))
    , DATEADD(DAY, CONVERT(int, CRYPT_GEN_RANDOM(1)), '2017-01-01 00:00:00')
    , CONVERT(int, CRYPT_GEN_RANDOM(1))
    , CONVERT(int, CRYPT_GEN_RANDOM(1))
    , CONVERT(int, CRYPT_GEN_RANDOM(1))
    , CONVERT(int, CRYPT_GEN_RANDOM(1))
    , CRYPT_GEN_RANDOM(128)
    , s.TrackerId
FROM dbo.[Sessions] s
    CROSS JOIN (SELECT v.n
    FROM (VALUES (0), (1))v(n)) v;
GO
DECLARE @TrackerId int = 100;
DECLARE @StartDate datetime = '2017-03-10';
DECLARE @EndDate   datetime = '2017-03-12';

SELECT [PacketTransmissions].*
FROM [PacketTransmissions] 
WHERE [PacketTransmissions].[TrackerId] = @TrackerId
    AND [PacketTransmissions].[DateProcessed] > @StartDate
    AND [PacketTransmissions].[DateProcessed] < @EndDate
ORDER BY [PacketTransmissions].[DateProcessed] DESC;

Context

StackExchange Database Administrators Q#199945, answer score: 3

Revisions (0)

No revisions yet.