HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Apparently, my CLR assembly function is causing deadlocks?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
apparentlyfunctionclrassemblycausingdeadlocks

Problem

Our application needs to work equally well with an Oracle database or a Microsoft SQL Server database. To facilitate this, we created a handful of UDFs to homogenize our query syntax. For example, SQL Server has GETDATE() and Oracle has SYSDATE. They perform the same function but they are different words. We wrote a wrapper UDF called NOW() for both platforms which wraps the relevant platform specific syntax in a common function name. We have other such functions, some of which do essentially nothing but exist solely for the sake of homogenization. Unfortunately, this has a cost for SQL Server. Inline scalar UDFs wreak havoc on performance and completely disable parallelism. As an alternative, we wrote CLR assembly functions to accomplish the same goals. When we deployed this to a client they started experiencing frequent deadlocks. This particular client is using replication and high availability techniques and I'm wondering if there's some sort of interaction going on here. I just don't understand how introducing a CLR function would cause problems like this. For reference, I've included the original scalar UDF definition as well as the replacement CLR definition in C# and the SQL declaration for it. I also have deadlock XML that I can provide if that helps.

Original UDF

CREATE FUNCTION [fn].[APAD]
(
    @Value VARCHAR(4000)
    , @tablename VARCHAR(4000) = NULL
    , @columnname VARCHAR(4000) = NULL
)

RETURNS VARCHAR(4000)
WITH SCHEMABINDING
AS

BEGIN
    RETURN LTRIM(RTRIM(@Value))
END
GO


CLR Assembly Function

[SqlFunction(IsDeterministic = true)]
public static string APAD(string value, string tableName, string columnName)
{
    return value?.Trim();
}


SQL Server Declaration for CLR Function

CREATE FUNCTION [fn].[APAD]
(
    @Value NVARCHAR(4000),
    @TableName NVARCHAR(4000),
    @ColumnName NVARCHAR(4000)
) RETURNS NVARCHAR(4000)
AS
EXTERNAL NAME ASI.fn.APAD
GO

Solution

What version(s) of SQL Server are you using?

I do recall seeing a slight change in behavior in SQL Server 2017 not too long ago. I will have to go back and see if I can find where I made a note of it, but I think it had to do with a schema lock being initiated when a SQLCLR object was being accessed.

While I am looking for that, I will say the following regarding your approach:

  • Please use the Sql types for input parameters, return types. You should be using SqlString instead of string. SqlString is very similar to a nullable string (your value?, but it has other functionality built in that is SQL Server-specific. All of the Sql types have a Value property that returns the expected .NET type (e.g. SqlString.Value returns string, SqlInt32 returns int, SqlDateTime returns DateTime, etc).



-
I would recommend against this entire approach to begin with, whether or not the deadlocks are related. I say this because:

  • Even with deterministic SQLCLR UDFs being able to participate in parallel plans, you are most likely going to get performance hits for emulating simplistic built-in functions.



  • The SQLCLR API does not allow for VARCHAR. Are you ok with implicitly converting everything to NVARCHAR and then again back to VARCHAR for simple operations?



  • The SQLCLR API does not allow for overloading, so you might need multiple versions of functions that do allow for different signatures in T-SQL and/or PL/SQL.



  • Similar to not allowing for overloading, there is a big difference between NVARCHAR(4000) and NVARCHAR(MAX): the MAX type (having even a single one of them in the signature) make the SQLCLR call take twice as long as not having any MAX type in the signature (I believe this holds true for VARBINARY(MAX) vs VARBINARY(4000) as well). So, you need to decide between:



  • using only NVARCHAR(MAX) to have a simplified API, but take the performance hit when you are using 8000 bytes or less of string data, or



  • creating two variations for all / most / many string functions: one with MAX types, and one without (for when you are guaranteed to never go over 8000 bytes of string data in or out). This is the approach I chose to take for most of the functions in my SQL# library: there is a Trim() function which likely has one or more MAX types, and a Trim4k() version which never has a MAX type anywhere in the signature or result set schema. The "4k" versions are absolutely more efficient.



-
You are not being careful to emulate functionality given the example in the question. LTRIM and RTRIM only trim spaces, while .NET String.Trim() trims white space (at least space, tabs, and newlines). For example:

PRINT LTRIM(RTRIM(N'      a       '));


  • Also, I just noticed that your function, both in T-SQL and in C#, only uses 1 of the 3 input parameters. Is this just a proof of concept, or redacted code?

Code Snippets

PRINT LTRIM(RTRIM(N'      a       '));

Context

StackExchange Database Administrators Q#236794, answer score: 7

Revisions (0)

No revisions yet.