debugsqlCritical
Execution plan shows expensive CONVERT_IMPLICIT operation. Can I fix this with indexing or do I need to change the table?
Viewed 0 times
fixthiscantheneedwithindexingchangetableoperation
Problem
I have a really important, really slow view which includes some really ugly conditions like this in its where clause. I am also aware that the joins are gross and slow joins on
```
CREATE VIEW [dbo].[vwReallySlowView] AS
AS
SELECT
I.booking_no_v32 AS bkno,
I.trans_type_v41 AS trantype,
B.Assigned_to_v61 AS Assignbk,
B.order_date AS dateo, B.HourBooked AS HBooked,
B.MinBooked AS MBooked, B.SecBooked AS SBooked,
I.prep_on AS Pon, I.From_locn AS Flocn,
I.Trans_to_locn AS TTlocn,
(CASE I.prep_on WHEN 'Y' THEN I.PDate ELSE I.FirstDate END) AS PrDate, I.PTimeH AS PrTimeH, I.PTimeM AS PrTimeM,
(CASE WHEN I.RetnDate = I.QtyCheckedOut) THEN I.Trans_Qty - I.QtyCheckedOut ELSE I.trans_qty END) AS trqty,
(CASE WHEN I.Trans_type_v41 IN (6, 7) THEN 0 ELSE I.QtyCheckedOut END) AS MyQtycheckedout, (CASE WHEN I.Trans_type_v41 IN (6, 7)
THEN 0 ELSE I.QtyReturned END) AS retqty, I.ID, B.BookingProgressStatus AS bkProg, I.product_code_v42, I.return_to_locn, I.AssignTo, I.AssignType,
I.QtyReserved, B.DeprepOn,
(CASE B.DeprepOn
WHEN 1 THEN B.DeprepDateTime
ELSE I.RetnDate
END) AS DeprepDateTime, I.InRack
FROM dbo.tblItemtran AS I
INNER JOIN -- booking_no = varchar(13)
dbo.tblbookings AS B ON B.booking_no = I.booking_no_v32 -- string inner-join
INNER JOIN -- product_code = varchar(13)
dbo.tblInvmas AS M ON I.product_code_v42 = M.product_code -- string inner-join
WHERE (I.trans_type_v41 NOT IN (2, 3, 7, 18, 19, 20, 21, 12, 13, 22)) AND (I.trans_type_v41 NOT IN (6, 7)) AND (I.bit_field_v41 & 4 = 0) OR
(I.trans_type_v41 NOT IN (6, 7)) AND (I.bit_field_v41 & 4 = 0) AND (B.BookingProgressStatus = 1
varchar(13) instead of integer identity fields, but would like to improve the the simple query below that uses this view:```
CREATE VIEW [dbo].[vwReallySlowView] AS
AS
SELECT
I.booking_no_v32 AS bkno,
I.trans_type_v41 AS trantype,
B.Assigned_to_v61 AS Assignbk,
B.order_date AS dateo, B.HourBooked AS HBooked,
B.MinBooked AS MBooked, B.SecBooked AS SBooked,
I.prep_on AS Pon, I.From_locn AS Flocn,
I.Trans_to_locn AS TTlocn,
(CASE I.prep_on WHEN 'Y' THEN I.PDate ELSE I.FirstDate END) AS PrDate, I.PTimeH AS PrTimeH, I.PTimeM AS PrTimeM,
(CASE WHEN I.RetnDate = I.QtyCheckedOut) THEN I.Trans_Qty - I.QtyCheckedOut ELSE I.trans_qty END) AS trqty,
(CASE WHEN I.Trans_type_v41 IN (6, 7) THEN 0 ELSE I.QtyCheckedOut END) AS MyQtycheckedout, (CASE WHEN I.Trans_type_v41 IN (6, 7)
THEN 0 ELSE I.QtyReturned END) AS retqty, I.ID, B.BookingProgressStatus AS bkProg, I.product_code_v42, I.return_to_locn, I.AssignTo, I.AssignType,
I.QtyReserved, B.DeprepOn,
(CASE B.DeprepOn
WHEN 1 THEN B.DeprepDateTime
ELSE I.RetnDate
END) AS DeprepDateTime, I.InRack
FROM dbo.tblItemtran AS I
INNER JOIN -- booking_no = varchar(13)
dbo.tblbookings AS B ON B.booking_no = I.booking_no_v32 -- string inner-join
INNER JOIN -- product_code = varchar(13)
dbo.tblInvmas AS M ON I.product_code_v42 = M.product_code -- string inner-join
WHERE (I.trans_type_v41 NOT IN (2, 3, 7, 18, 19, 20, 21, 12, 13, 22)) AND (I.trans_type_v41 NOT IN (6, 7)) AND (I.bit_field_v41 & 4 = 0) OR
(I.trans_type_v41 NOT IN (6, 7)) AND (I.bit_field_v41 & 4 = 0) AND (B.BookingProgressStatus = 1
Solution
You shouldn't rely too much on cost percentages in execution plans. These are always estimated costs, even in post-execution plans with 'actual' numbers for things like row counts. The estimated costs are based on a model that happens to work pretty well for the purpose it is intended for: enabling the optimizer to choose between different candidate execution plans for the same query. The cost information is interesting, and a factor to consider, but it should rarely be a primary metric for query tuning. Interpreting execution plan information requires a broader view of the presented data.
ItemTran Clustered Index Seek Operator
This operator is really two operations in one. First an index seek operation finds all rows that match the predicate
The conversion occurs because the bitwise-AND operator (&) requires both operands to be of the same type. The implicit type of the constant value '4' is integer and the data type precedence rules mean the lower-priority
The problem (such as it is) is easily corrected by writing the predicate as
The major part of the estimated cost of this seek is down to the size of the base table. While the clustered index key is itself reasonably narrow, the size of each row is large. A definition for the table is not given, but just the columns used in the view add up to a significant row width. Since the clustered index includes all columns, the distance between clustered index keys is the width of the row, not the width of the index keys. The use of version suffixes on some columns suggests the real table has even more columns for previous versions.
Looking at the seek, residual predicate and output columns, the performance of this operator could be checked in isolation by building the equivalent query (the
The performance of this query with a cold data cache is of interest, since read-ahead would be affected by table (clustered index) fragmentation. The clustering key for this table invites fragmentation, so it could be important to maintain (reorganize or rebuild) this index regularly, and use an appropriate
I performed a test of the effect of fragmentation on read-ahead using sample data generated using SQL Data Generator. Using the same table row counts as shown in the question's query plan, a highly fragmented clustered index resulted in
If the table data is typically entirely in cache the fragmentation issue is very much less important. But, even with low fragmentation, the wide table rows might mean the number of logical and physical reads is much higher than might be expected. You could also experiment with adding and removing the explicit
More to the point is the estimated number of rows leaving the seek operator. The optimization-time estimate is 165 rows, but 4,226 were produced at execution time. I will return to this point later, but the main reason for the discrepancy is that the selectivity of the residual predicate (involving the bitwise-AND) is very hard for the optimizer to predict - in fact it resorts to guessing.
Filter Operator
I am showing the filter predicate here mostly to illustrate how the two
```
SELECT
it.booking_no_v32,
it.trans_type_v41,
it.Trans_qty,
it.Qt
ItemTran Clustered Index Seek Operator
This operator is really two operations in one. First an index seek operation finds all rows that match the predicate
product_code_v42 = 'M10BOLT', then each row has the residual predicate bit_field_v41 & 4 = 0 applied. There is an implicit conversion of bit_field_v41 from its base type (tinyint or smallint) to integer.The conversion occurs because the bitwise-AND operator (&) requires both operands to be of the same type. The implicit type of the constant value '4' is integer and the data type precedence rules mean the lower-priority
bit_field_v41 field value is converted.The problem (such as it is) is easily corrected by writing the predicate as
bit_field_v41 & CONVERT(tinyint, 4) = 0 - meaning the constant value has the lower priority and is converted (during constant folding) rather than the column value. If the bit_field_v41 is tinyint no conversions occur at all. Likewise, CONVERT(smallint, 4) could be used if bit_field_v41 is smallint. That said, the conversion is not a performance issue in this case, but it is still good practice to match types and avoid implicit conversions where possible.The major part of the estimated cost of this seek is down to the size of the base table. While the clustered index key is itself reasonably narrow, the size of each row is large. A definition for the table is not given, but just the columns used in the view add up to a significant row width. Since the clustered index includes all columns, the distance between clustered index keys is the width of the row, not the width of the index keys. The use of version suffixes on some columns suggests the real table has even more columns for previous versions.
Looking at the seek, residual predicate and output columns, the performance of this operator could be checked in isolation by building the equivalent query (the
1 <> 2 is a trick to prevent auto-parameterization, the contradiction is removed by the optimizer and does not appear in the query plan):SELECT
it.booking_no_v32,
it.QtyCheckedOut,
it.QtyReturned,
it.Trans_qty,
it.trans_type_v41
FROM dbo.tblItemTran AS it
WHERE
1 <> 2
AND it.product_code_v42 = 'M10BOLT'
AND it.bit_field_v41 & CONVERT(tinyint, 4) = 0;The performance of this query with a cold data cache is of interest, since read-ahead would be affected by table (clustered index) fragmentation. The clustering key for this table invites fragmentation, so it could be important to maintain (reorganize or rebuild) this index regularly, and use an appropriate
FILLFACTOR to allow space for new rows between index maintenance windows.I performed a test of the effect of fragmentation on read-ahead using sample data generated using SQL Data Generator. Using the same table row counts as shown in the question's query plan, a highly fragmented clustered index resulted in
SELECT * FROM view taking 15 seconds after DBCC DROPCLEANBUFFERS. The same test in the same conditions with a freshly-rebuilt clustered index on the ItemTrans table completed in 3 seconds.If the table data is typically entirely in cache the fragmentation issue is very much less important. But, even with low fragmentation, the wide table rows might mean the number of logical and physical reads is much higher than might be expected. You could also experiment with adding and removing the explicit
CONVERT to validate my expectation that the implicit conversion issue is not important here, except as a best practice violation.More to the point is the estimated number of rows leaving the seek operator. The optimization-time estimate is 165 rows, but 4,226 were produced at execution time. I will return to this point later, but the main reason for the discrepancy is that the selectivity of the residual predicate (involving the bitwise-AND) is very hard for the optimizer to predict - in fact it resorts to guessing.
Filter Operator
I am showing the filter predicate here mostly to illustrate how the two
NOT IN lists are combined, simplified and then expanded, and also to provide a reference for the following hash match discussion. The test query from the seek can be expanded to incorporate its effects and determine the effect of the Filter operator on performance:```
SELECT
it.booking_no_v32,
it.trans_type_v41,
it.Trans_qty,
it.Qt
Code Snippets
SELECT
it.booking_no_v32,
it.QtyCheckedOut,
it.QtyReturned,
it.Trans_qty,
it.trans_type_v41
FROM dbo.tblItemTran AS it
WHERE
1 <> 2
AND it.product_code_v42 = 'M10BOLT'
AND it.bit_field_v41 & CONVERT(tinyint, 4) = 0;SELECT
it.booking_no_v32,
it.trans_type_v41,
it.Trans_qty,
it.QtyReturned,
it.QtyCheckedOut
FROM dbo.tblItemTran AS it
WHERE
it.product_code_v42 = 'M10BOLT'
AND it.bit_field_v41 & CONVERT(tinyint, 4) = 0
AND
(
(
it.trans_type_v41 NOT IN (2, 3, 6, 7, 18, 19, 20, 21, 12, 13, 22)
AND it.trans_type_v41 NOT IN (6, 7)
)
OR
(
it.trans_type_v41 NOT IN (6, 7)
)
OR
(
it.trans_type_v41 IN (6, 7)
AND it.QtyCheckedOut = 0
)
OR
(
it.trans_type_v41 IN (6, 7)
AND it.QtyCheckedOut > 0
AND it.trans_qty - (it.QtyCheckedOut - it.QtyReturned) > 0
)
);[Expr1016] = (trans_qty - (QtyCheckedOut - QtyReturned))ALTER TABLE dbo.tblItemTran
ADD Bit3 AS bit_field_v41 & CONVERT(tinyint, 4);
CREATE STATISTICS [stats dbo.ItemTran (product_code_v42, Bit3)]
ON dbo.tblItemTran (product_code_v42, Bit3);-- Indexed view to optimize the main view
CREATE VIEW dbo.V1
WITH SCHEMABINDING
AS
SELECT
it.ID,
it.product_code_v42,
it.trans_type_v41,
it.booking_no_v32,
it.Trans_qty,
it.QtyReturned,
it.QtyCheckedOut,
it.QtyReserved,
it.bit_field_v41,
it.prep_on,
it.From_locn,
it.Trans_to_locn,
it.PDate,
it.FirstDate,
it.PTimeH,
it.PTimeM,
it.RetnDate,
it.BookDate,
it.TimeBookedH,
it.TimeBookedM,
it.TimeBookedS,
it.del_time_hour,
it.del_time_min,
it.return_to_locn,
it.return_time_hour,
it.return_time_min,
it.AssignTo,
it.AssignType,
it.InRack
FROM dbo.tblItemTran AS it
JOIN dbo.tblBookings AS tb ON
tb.booking_no = it.booking_no_v32
WHERE
(
it.trans_type_v41 NOT IN (2, 3, 7, 18, 19, 20, 21, 12, 13, 22)
AND it.trans_type_v41 NOT IN (6, 7)
AND it.bit_field_v41 & CONVERT(tinyint, 4) = 0
)
OR
(
it.trans_type_v41 NOT IN (6, 7)
AND it.bit_field_v41 & CONVERT(tinyint, 4) = 0
AND tb.BookingProgressStatus = 1
)
OR
(
it.trans_type_v41 IN (6, 7)
AND it.bit_field_v41 & CONVERT(tinyint, 4) = 0
AND it.QtyCheckedOut = 0
)
OR
(
it.trans_type_v41 IN (6, 7)
AND it.bit_field_v41 & CONVERT(tinyint, 4) = 0
AND it.QtyCheckedOut > 0
AND it.trans_qty - (it.QtyCheckedOut - it.QtyReturned) > 0
);
GO
CREATE UNIQUE CLUSTERED INDEX cuq ON dbo.V1 (product_code_v42, ID);
GOContext
StackExchange Database Administrators Q#28686, answer score: 52
Revisions (0)
No revisions yet.