HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Performance Improvement went wrong in Production, worked fine in Test

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
fineproductionworkedwentperformancewrongtestimprovement

Problem

We have an application supported by vendor and there was a piece of code which was doing very heavy logical reads and time consuming inside a procedure, hence I proposed them to tweak the query a bit to have much lesser IO and time, it worked well in test in terms of data and performance however its failing in production and returning different data, we had to rollback the changes. Need your expert advise on this.

This was tested for data in Test environment for more than 3 months and we never had any data issue. Started failing sparingly in Production immediately after the deployment and was producing inconsistent data.

Existing Query:

SELECT @Ref= CAST(MAX(ISNULL(CAST(ref_clnt AS INT),0))+1 AS VARCHAR(10)) 
FROM table_name WITH(NOLOCK) 
WHERE s_mode='value'


Proposed Query:

SELECT @Ref = ref_clnt+1 FROM table_name WITH(NOLOCK) 
WHERE RefNo = (SELECT MAX(RefNo) FROM table_name WHERE s_mode = 'value')


The DDL of table is as below:

CREATE TABLE [dbo].[table_name](
    [RefNo] [dbo].[udt_RefNo] NOT NULL,
    [S_Mode] [varchar](10) NOT NULL,
    [ref_clnt] [varchar](50) NULL)
CONSTRAINT [PK_table_name] PRIMARY KEY CLUSTERED 
(
    [RefNo] ASC
)


Providing only those columns from definition which is used in the query.

Udt_RefNo is a user defined datatype as:

CREATE TYPE [dbo].[udt_RefNo] FROM [char](16) NOT NULL
GO


Version of SQL Server: Microsoft SQL Server 2014 (SP3) Copyright (c) Microsoft Corporation Enterprise Edition (64-bit).

Non-clustered index covering columns as shown below:

CREATE NONCLUSTERED INDEX [ncidx_table_name_1] ON [dbo].[table_name]
(
    [S_Mode] ASC,
    [S_Status] ASC
)
INCLUDE (   [ref_clnt])


Please find the query plans as requested:

-
Previous Query

-
Proposed Query

Comparison of Number of Reads after enabling statistics IO and time:

```
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.

(1

Solution

Your rewrite does not have the same semantics as the original query, so different results are not surprising.

This is true without worrying about the NOLOCK hint, and even if it is (somehow) guaranteed that the highest ref_clnt value is associated with the row having the highest RefNo value. See this db<>fiddle example.

Precision is important when dealing with computers, so you need to think carefully about data types and edge cases. The maximum RefNo calculation will use string sorting semantics, so '999' sorts higher than '1000'. There are several other important differences between the queries. I'm not going to list all of them, but NULL handling is another example.

There are also numerous bugs in both versions of the code. The original will fail if any ref_clnt value of -1000000000 or lower is returned, because that will not fit in varchar(10). The sign makes the length 11.

The easiest way to safely improve the original version of the code is to add an index on a computed column:

ALTER TABLE dbo.table_name 
ADD ref_cc AS 
    ISNULL(CAST(ref_clnt AS integer), 0);

CREATE NONCLUSTERED INDEX i 
ON dbo.table_name (S_Mode, ref_cc);


db<>fiddle demo

The execution plan can then seek directly to the highest ref_clnt row (sorted as integer) for the given S_Mode value:

The vendor's original SQL may still be of debatable quality, but at least it will run faster and produce the same results.

The vendor's new suggestion:

SELECT MAX(ISNULL(ref_clnt,0))+1 FROM table_name WITH(NOLOCK) WHERE S_Mode='value'


...is still problematic, at least in theory, because ISNULL uses the data type of the first parameter, so the integer literal 0 is implicitly cast to varchar(50).


The value of check_expression is returned if it is not NULL; otherwise, replacement_value is returned after it is implicitly converted to the type of check_expression, if the types are different. replacement_value can be truncated if replacement_value is longer than check_expression.

The MAX still operates on a string, which may produce unexpected results. In any case the expression is still not seekable without a (different) computed column index.

Code Snippets

ALTER TABLE dbo.table_name 
ADD ref_cc AS 
    ISNULL(CAST(ref_clnt AS integer), 0);

CREATE NONCLUSTERED INDEX i 
ON dbo.table_name (S_Mode, ref_cc);
SELECT MAX(ISNULL(ref_clnt,0))+1 FROM table_name WITH(NOLOCK) WHERE S_Mode='value'

Context

StackExchange Database Administrators Q#232698, answer score: 7

Revisions (0)

No revisions yet.