Is it safe to use Strings instead of SqlStrings for CLR function input parameters?
Problem
I have a CLR scalar UDF implemented through C# code. I've noticed that using the
String data type for input parameters significantly improves performance compared to the SqlString data type. In Stairway to SQLCLR Level 5: Development (Using .NET within SQL Server), Solomon Rutzky mentions the following reasons to prefer the SQL data types for strings:

A primary difference between native common language runtime (CLR) data
types and SQL Server data types is that the former do not allow for
NULL values, while the latter provide full NULL semantics.
...
Streaming values in can be achieved via SqlChars for N[VAR]CHAR,
SqlBytes for [VAR]BINARY, and SqlXml.CreateReader() for XML...
...
When using SqlString (not string or even SqlChars) you can access the
CompareInfo, CultureInfo, LCID, and SqlCompareOptions properties...
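To make the NULL-semantics and collation-properties points from the quote concrete, here is a minimal standalone sketch (plain .NET console code using System.Data.SqlTypes, not the question's actual UDF):

```csharp
using System;
using System.Data.SqlTypes;

class SqlStringNullDemo
{
    static void Main()
    {
        // SqlString can represent SQL NULL directly; a plain string cannot
        // (only a null reference, which loses SQL NULL semantics).
        SqlString s = SqlString.Null;
        Console.WriteLine(s.IsNull); // True

        // Collation metadata travels with the value.
        SqlString t = new SqlString("abc");
        Console.WriteLine(t.LCID);              // LCID of the creating thread's culture
        Console.WriteLine(t.SqlCompareOptions); // e.g. IgnoreCase, IgnoreKanaType, IgnoreWidth

        // .Value unwraps to a plain string (throws SqlNullValueException if IsNull).
        Console.WriteLine(t.Value); // abc
    }
}
```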
I know that my input will never be NULL, I don't need to stream the values in, and I'll never check the collation properties. Could my case be an exception where it's better to use String instead of SqlString? If I do go with that approach, is there anything in particular that I should watch out for?

If it matters, I'm using SQL Server's default collation. Here's part of my source code, with
s1 being the input parameter:

fixed (char* chptr = s1)
{
    char* cp = (char*)current;
    for (int i = 0; i < s1.Length; i++)
    {
        cp[i] = chptr[i];
    }
}

Solution
Excellent question. As far as I am aware, under those conditions (i.e. guaranteed no
NULLs and no need for the extra functionality) there shouldn't be any specific concerns. This could be a situation similar to CURSORs where, if a generic rule is needed, it would be: "don't use cursors". But the actual rule is: "only use cursors when/where appropriate". The problem is educating people on the technical details of cursors such that they can make that decision, which is why those of us who know enough about such things ignore the generic rule and proceed to use them appropriately.

So, I advise folks to "always" use the Sql* types because it cuts down on confusion and errors. But that is not to say that using string in your situation wouldn't be better. I say go for it, and if you ever run into a problem with string, it's easy enough to go back and change it to be SqlString.

With regards to collation and your statement of:
If it matters, I'm using SQL Server's default collation.
While it generally does not matter, it's also a bit unclear what you mean here given that there is no true default collation. Chances are that you are referring to the unfortunate default collation when installing SQL Server on an OS with a language setting of "US English" (i.e. LCID = 1033), which is SQL_Latin1_General_CP1_CI_AS. But then there are still three levels of collations that can all be different (Instance / Server, Database, and Column), and you might be meaning only one or even two of these levels.

The reason I mention all of this is that there are some non-obvious things going on here:
- to some degree, none of those 3 levels that collations affect is relevant since the default culture of SQLCLR threads is the Language setting at the OS level (the LCID of the selected language). This impacts operations using String.Equals when using either of the two StringComparison.CurrentCulture* values, and operations using String.Compare when not specifying a culture.
- to some degree, none of those 3 levels that collations affect is relevant since the == operator does an ordinal comparison (i.e. should be the same as using a _BIN2 collation). This is also how String.CompareOrdinal works, as well as String.Equals when not passing in StringComparison.CurrentCulture or StringComparison.InvariantCulture values.
- one instance where the SQL Server collation matters is when concatenating a SqlString input parameter with a string via +. In this case the + operator creates a new SqlString to contain the value of the string so that it can then concatenate the two SqlStrings. The problem is that the new SqlString is created with the current thread's LCID (which is the Operating System's LCID), and then the + operator compares the two SqlStrings prior to concatenation (i.e. verifies that they are the "same type"). But, due to the SqlString input parameter having the LCID of the database (not instance or column) and the implicitly created SqlString having an LCID of the OS, the operation gets an exception stating that the "collations" do not match. Nice, eh?

This, however, should be a non-issue since nobody should ever use the
SqlString value directly when wanting the string. Everyone should instead always use the Value property to get the string.

That being said, I am curious as to what test you did to determine that string was faster. I tested a simple UDF that accepts a single NVARCHAR(4000) input parameter, concatenates a short string, and then returns the new value. One version of that UDF accepts and returns string, and the other version accepts and returns SqlString. Over 1 million iterations, the string version was about 200-300 milliseconds faster than the SqlString version about 50% of the time, when comparing their fastest times (over all 1 million iterations, not per each). The other 50% of the time the performance gain was around 100 milliseconds, but could also be none.

Also, regarding your test code: is s1 always the direct input parameter, whether it's string or SqlString? If yes, then you should also test creating a string locally and setting it to s1.Value. Meaning:

string s2 = s1.Value; // when s1 is SqlString instead of string
fixed (char* chptr = s2)
{
    char* cp = (char*)current;
    for (int i = 0; i < s2.Length; i++)
    {
        cp[i] = chptr[i];
    }
}
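As a rough way to reproduce that kind of comparison outside SQL Server, here is a standalone benchmark sketch. It is a hypothetical harness, not the answer's actual test: ConcatPlain and ConcatSql are made-up stand-ins for the two UDF variants, and absolute timings inside SQLCLR will differ.

```csharp
using System;
using System.Data.SqlTypes;
using System.Diagnostics;

class StringVsSqlStringBenchmark
{
    // Hypothetical stand-ins for the two UDF variants: one works on string
    // directly, the other unwraps SqlString via .Value before doing string work.
    public static string ConcatPlain(string s1)
    {
        return s1 + "_x";
    }

    public static SqlString ConcatSql(SqlString s1)
    {
        string s2 = s1.Value; // unwrap once, then plain string work
        return new SqlString(s2 + "_x");
    }

    static void Main()
    {
        const int iterations = 1000000;
        string input = new string('a', 100);
        SqlString sqlInput = new SqlString(input);

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) ConcatPlain(input);
        sw.Stop();
        Console.WriteLine("string:    " + sw.ElapsedMilliseconds + " ms");

        sw.Restart();
        for (int i = 0; i < iterations; i++) ConcatSql(sqlInput);
        sw.Stop();
        Console.WriteLine("SqlString: " + sw.ElapsedMilliseconds + " ms");
    }
}
```

As in the answer's methodology, run the whole loop several times and compare the fastest totals rather than single runs, since JIT warm-up and GC noise can easily swamp a difference of a few hundred milliseconds per million calls.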
Also, some other options to possibly test:

- SqlString.GetUnicodeBytes method (returns byte[])
- SqlChars.Value property (returns char[])

Context
StackExchange Database Administrators Q#238518, answer score: 8
Revisions (0)
No revisions yet.