snippetsqlMinor
How to rewrite a query with MAX in WHERE clause
Viewed 0 times
withquerywhererewritemaxhowclause
Problem
I have a query with a
Please note that both tables are the same.
I rewrote it as:
but number of records is different.
Can anyone say how can I rewrite it for better performance?
max in the where clause, which is slow.select count(*)
from TableName tbl1
where tbl1.id = (
select max(tbl2.id)
from TableName tbl2
where tbl2.companyId = tbl1.companyId)
and tbl1.active = 2Please note that both tables are the same.
I rewrote it as:
select count(*)
from TableName tbl1
inner join (
select max(id) as id, companyId
from TableName tbl2
where tbl2.active= 2
group by companyId
) tbl2
on tbl2.companyId = tbl1.companyId
and tbl2.id=tbl1.idbut number of records is different.
Can anyone say how can I rewrite it for better performance?
Solution
Summary
Add the index:
Try this rewrite:
Detail
The original query specification appears to be:
How many rows are there in the table where active = 2; and the id column equals the highest id value across all rows within the same companyid (without the active = 2 restriction, for clarity).
There is a way to get these results with a single scan of a suitable index.
For example, given the table:
A few rows of sample data:
With the index:
A query (based on the original) to find rows without the predicate (active = 2) is:
The execution plan for that query features a single scan of the index, despite the query referencing the same table twice:
The basic idea is to scan the index in key order, returning the row(s) with the highest id within each group of companyid values.
Unfortunately, we cannot simply add the predicate active = 2 to that query without breaking the optimizer transformation that gives us the single scan. The problem (if you're interested) is that the new predicate gets pushed down the plan tree, resulting in a shape that no longer matches the 'group by top' transform.
One workaround adds a meaningless
The execution plan is similar to before, with an extra Top (the meaningless, bigint.max one we added), a Filter for the active = 2 predicate, and a final Stream Aggregate to count the rows:
I cannot promise you will get this plan shape with the real data, or that it will be faster. It all rather depends on the number and distribution of the rows.
The new Top also has the effect of disfavouring a parallel plan (the group by top is parallel-compatible, but a global top is not; parallelism would have to stop and restart around it).
Nevertheless, until a better option comes along, or more detail is provided, it is something you could try.
Add the index:
CREATE INDEX index_name
ON TableName
(companyid DESC, id DESC)
INCLUDE
(active);Try this rewrite:
SELECT COUNT_BIG(*)
FROM
(
SELECT TOP (9223372036854775807)
TN1.active
FROM TableName AS TN1
WHERE
TN1.id =
(
SELECT MAX(TN2.id)
FROM TableName AS TN2
WHERE
TN2.companyid = TN1.companyid
)
) AS SQ1
WHERE
SQ1.active = 2;Detail
The original query specification appears to be:
How many rows are there in the table where active = 2; and the id column equals the highest id value across all rows within the same companyid (without the active = 2 restriction, for clarity).
There is a way to get these results with a single scan of a suitable index.
For example, given the table:
CREATE TABLE #TableName
(
companyid integer NOT NULL,
id integer NOT NULL,
active integer NOT NULL
);A few rows of sample data:
INSERT #TableName
(companyid, id, active)
VALUES
(1, 1, 2),
(1, 2, 2),
(1, 3, 0),
(2, 2, 2),
(2, 3, 2),
(2, 3, 0);With the index:
CREATE INDEX index_name
ON #TableName
(companyid DESC, id DESC)
INCLUDE
(active);A query (based on the original) to find rows without the predicate (active = 2) is:
SELECT
TN1.active
FROM #TableName AS TN1
WHERE
TN1.id =
(
SELECT MAX(TN2.id)
FROM #TableName AS TN2
WHERE
TN2.companyid = TN1.companyid
);The execution plan for that query features a single scan of the index, despite the query referencing the same table twice:
The basic idea is to scan the index in key order, returning the row(s) with the highest id within each group of companyid values.
Unfortunately, we cannot simply add the predicate active = 2 to that query without breaking the optimizer transformation that gives us the single scan. The problem (if you're interested) is that the new predicate gets pushed down the plan tree, resulting in a shape that no longer matches the 'group by top' transform.
One workaround adds a meaningless
TOP specification around the part of the query we want transformed, preventing the new predicate being pushed past it:SELECT COUNT_BIG(*)
FROM
(
SELECT TOP (9223372036854775807)
TN1.active
FROM #TableName AS TN1
WHERE
TN1.id =
(
SELECT MAX(TN2.id)
FROM #TableName AS TN2
WHERE
TN2.companyid = TN1.companyid
)
) AS SQ1
WHERE
SQ1.active = 2;The execution plan is similar to before, with an extra Top (the meaningless, bigint.max one we added), a Filter for the active = 2 predicate, and a final Stream Aggregate to count the rows:
I cannot promise you will get this plan shape with the real data, or that it will be faster. It all rather depends on the number and distribution of the rows.
The new Top also has the effect of disfavouring a parallel plan (the group by top is parallel-compatible, but a global top is not; parallelism would have to stop and restart around it).
Nevertheless, until a better option comes along, or more detail is provided, it is something you could try.
Code Snippets
CREATE INDEX index_name
ON TableName
(companyid DESC, id DESC)
INCLUDE
(active);SELECT COUNT_BIG(*)
FROM
(
SELECT TOP (9223372036854775807)
TN1.active
FROM TableName AS TN1
WHERE
TN1.id =
(
SELECT MAX(TN2.id)
FROM TableName AS TN2
WHERE
TN2.companyid = TN1.companyid
)
) AS SQ1
WHERE
SQ1.active = 2;CREATE TABLE #TableName
(
companyid integer NOT NULL,
id integer NOT NULL,
active integer NOT NULL
);INSERT #TableName
(companyid, id, active)
VALUES
(1, 1, 2),
(1, 2, 2),
(1, 3, 0),
(2, 2, 2),
(2, 3, 2),
(2, 3, 0);CREATE INDEX index_name
ON #TableName
(companyid DESC, id DESC)
INCLUDE
(active);Context
StackExchange Database Administrators Q#136389, answer score: 7
Revisions (0)
No revisions yet.