HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

SQL server efficiency, comparing date time of latest record and the one under it

Submitted by: @import:stackexchange-codereview··
0
Viewed 0 times
latestthecomparingsqlanddatetimerecordoneunder

Problem

I have a snippet of SQL that compares the last two records and gives the datediff in seconds. However, the way I have it is quite slow, taking up to 20 seconds to execute depending on how many controllerIDs I need to check. What would be a more efficient way of doing this?

select
                T.controllerID,
                datediff(ss, T.Max_dtReading, T1.Max_dtReading) As ElaspedTime
            from
                (
                    select
                        controllerID,
                        max(dtReading) as Max_dtReading
                    from ReaderData
                    where CardID = 'FFFFFFF0' AND (controllerID in(2,13,28,30,37,40))
                    group by controllerID
                ) as T
                outer apply(
                    select max(T1.dtReading) as Max_dtReading
                    from ReaderData as T1
                    where
                        T1.CardID = 'FFFFFFF0' AND (controllerID in(2,13,28,30,37,40))
                        and T1.controllerID = T.controllerID
                        and T1.dtReading < T.Max_dtReading
                ) as T1

Solution

Aliasing

The use of T and T1 as aliases in this query made me take twice as long to work out what it is actually doing. I would give these a more meaningful alias.
Brackets

The Where clause contains this:

AND (controllerID in(2,13,28,30,37,40))


The brackets around this are unnecessary and a little confusing. I would only recommend brackets in the Where clause when you are using both AND and OR.
Removing a select

Firstly, let's ask ourselves what this query is doing.

Step 1, get the controllerID's and MAX(dtReading)'s from the ReaderData table.
Step 2, do a datediff of the MAX(dtReading) and the next highest dtReading.

First lets write a simple query to do step 1:

SELECT
    base.controllerID,
    MAX(base.dtReading) AS [Most Recent Reading Date]
FROM ReaderData AS base
WHERE base.CardID = 'FFFFFFF0' 
AND base.controllerID IN (2,13,28,30,37,40)
GROUP BY base_data.controllerID


Now let's bring in the second dtReading:

SELECT
    base.controllerID,
    MAX(base.dtReading) AS [Most Recent Reading Date],
    prior.[Prior Reading Date]
FROM ReaderData AS base
OUTER APPLY
(
    SELECT MAX(extra.dtReading) AS [Prior Reading Date]
    FROM ReaderData AS prior
    WHERE prior.CardID = base.CardID
    AND prior.controllerID = base_data.controllerID
    AND prior.dtReading < base_data.Max_dtReading
) AS prior
WHERE base.CardID = 'FFFFFFF0' 
AND base.controllerID IN (2,13,28,30,37,40)
GROUP BY base_data.controllerID


You'll notice that I changed the CardID = 'FFFFFFF0' and controllerID IN (2,13,28,30,37,40) in the apply to check if it is equal to the value in the main select. It is easier to understand this way, you want to make sure that they are equal, plus it lets you avoid the possible mistake of forgetting to update of the sets of values.

Then we add the datediff:

SELECT
    base.controllerID AS [Controller ID],
    DATEDIFF(SECOND,MAX(base.dtReading),prior.[Prior Reading Date]) AS [Elasped Time]
FROM ReaderData AS base
OUTER APPLY
(
    SELECT MAX(extra.dtReading) AS [Prior Reading Date]
    FROM ReaderData AS prior
    WHERE prior.CardID = base.CardID
    AND prior.controllerID = base_data.controllerID
    AND prior.dtReading < base_data.Max_dtReading
) AS prior
WHERE base.CardID = 'FFFFFFF0' 
AND base.controllerID IN (2,13,28,30,37,40)
GROUP BY base_data.controllerID


Here I took the liberty of aliasing the returned columns to make them a little easier on the eye for the end user, as well as capitalising keywords.

There is no agreed upon standard to SQL, so feel free to re-lowercase everything, but conventionally SQL is written with keywords in ALL CAPS
Performance Increases

The query I walked through above should perform better than the one in the question as it avoids the sub-select.

I would make sure that the ReaderData table is properly indexed, this can and will drastically speed up queries and make sure that the statistics on the table are up to date.

Code Snippets

AND (controllerID in(2,13,28,30,37,40))
SELECT
    base.controllerID,
    MAX(base.dtReading) AS [Most Recent Reading Date]
FROM ReaderData AS base
WHERE base.CardID = 'FFFFFFF0' 
AND base.controllerID IN (2,13,28,30,37,40)
GROUP BY base_data.controllerID
SELECT
    base.controllerID,
    MAX(base.dtReading) AS [Most Recent Reading Date],
    prior.[Prior Reading Date]
FROM ReaderData AS base
OUTER APPLY
(
    SELECT MAX(extra.dtReading) AS [Prior Reading Date]
    FROM ReaderData AS prior
    WHERE prior.CardID = base.CardID
    AND prior.controllerID = base_data.controllerID
    AND prior.dtReading < base_data.Max_dtReading
) AS prior
WHERE base.CardID = 'FFFFFFF0' 
AND base.controllerID IN (2,13,28,30,37,40)
GROUP BY base_data.controllerID
SELECT
    base.controllerID AS [Controller ID],
    DATEDIFF(SECOND,MAX(base.dtReading),prior.[Prior Reading Date]) AS [Elasped Time]
FROM ReaderData AS base
OUTER APPLY
(
    SELECT MAX(extra.dtReading) AS [Prior Reading Date]
    FROM ReaderData AS prior
    WHERE prior.CardID = base.CardID
    AND prior.controllerID = base_data.controllerID
    AND prior.dtReading < base_data.Max_dtReading
) AS prior
WHERE base.CardID = 'FFFFFFF0' 
AND base.controllerID IN (2,13,28,30,37,40)
GROUP BY base_data.controllerID

Context

StackExchange Code Review Q#86117, answer score: 3

Revisions (0)

No revisions yet.